Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Harmanci AO, Sharma G, Mathews DH. PARTS: probabilistic alignment for RNA joinT secondary structure prediction. Nucleic Acids Res 2008;36:2406-17. [PMID: 18304945 PMCID: PMC2367733 DOI: 10.1093/nar/gkn043] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

For:	Harmanci AO, Sharma G, Mathews DH. PARTS: probabilistic alignment for RNA joinT secondary structure prediction. Nucleic Acids Res 2008;36:2406-17. [PMID: 18304945 PMCID: PMC2367733 DOI: 10.1093/nar/gkn043] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Number

Cited by Other Article(s)

Chen CC, Chan YM, Jeong H. REDalign: accurate RNA structural alignment using residual encoder-decoder network. BMC Bioinformatics 2024;25:346. [PMID: 39501155 PMCID: PMC11539752 DOI: 10.1186/s12859-024-05956-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2024] [Accepted: 10/11/2024] [Indexed: 11/08/2024] Open

Abstract

BACKGROUND

RNA secondary structural alignment serves as a foundational procedure in identifying conserved structural motifs among RNA sequences, crucially advancing our understanding of novel RNAs via comparative genomic analysis. While various computational strategies for RNA structural alignment exist, they often come with high computational complexity. Specifically, when addressing a set of RNAs with unknown structures, the task of simultaneously predicting their consensus secondary structure and determining the optimal sequence alignment requires an overwhelming computational effort of O ( L 6 ) for each RNA pair. Such an extremely high computational complexity makes these methods impractical for large-scale analysis despite their accurate alignment capabilities.

RESULTS

In this paper, we introduce REDalign, an innovative approach based on deep learning for RNA secondary structural alignment. By utilizing a residual encoder-decoder network, REDalign can efficiently capture consensus structures and optimize structural alignments. In this learning model, the encoder network leverages a hierarchical pyramid to assimilate high-level structural features. Concurrently, the decoder network, enhanced with residual skip connections, integrates multi-level encoded features to learn detailed feature hierarchies with fewer parameter sets. REDalign significantly reduces computational complexity compared to Sankoff-style algorithms and effectively handles non-nested structures, including pseudoknots, which are challenging for traditional alignment methods. Extensive evaluations demonstrate that REDalign provides superior accuracy and substantial computational efficiency.

CONCLUSION

REDalign presents a significant advancement in RNA secondary structural alignment, balancing high alignment accuracy with lower computational demands. Its ability to handle complex RNA structures, including pseudoknots, makes it an effective tool for large-scale RNA analysis, with potential implications for accelerating discoveries in RNA research and comparative genomics.

Collapse

Mittal A, Ali SE, Mathews DH. Using the RNAstructure Software Package to Predict Conserved RNA Structures. Curr Protoc 2024;4:e70054. [PMID: 39540715 DOI: 10.1002/cpz1.70054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2024]

Abstract

The structures of many non-coding RNAs (ncRNA) are conserved by evolution to a greater extent than their sequences. By predicting the conserved structure of two or more homologous sequences, the accuracy of secondary structure prediction can be improved as compared to structure prediction for a single sequence. Here, we provide protocols for the use of four programs in the RNAstructure suite to predict conserved structures: Multilign, TurboFold, Dynalign, and PARTS. TurboFold iteratively aligns multiple homologous sequences and estimates the pairing probabilities for the conserved structure. Dynalign, PARTS, and Multilign are dynamic programming algorithms that simultaneously align sequences and identify the common secondary structure. Dynalign uses a pair of homologs and finds the lowest free energy common structure. PARTS uses a pair of homologs and estimates pairing probabilities from the base pairing probabilities estimated for each sequence. Multilign uses two or more homologs and finds the lowest free energy common structure using multiple pairwise calculations with Dynalign. It scales linearly with the number of sequences. We outline the strengths of each program. These programs can be run through web servers, on the command line, or with graphical user interfaces. © 2024 Wiley Periodicals LLC. Basic Protocol 1: Predicting a structure conserved in three or more sequences with the RNAstructure web server Basic Protocol 2: Predicting a structure conserved in two sequences with the RNAstructure web server Alternative Protocol 1: Predicting a structure conserved in multiple sequences in the RNAstructure graphical user interface Alternative Protocol 2: Predicting a structure conserved in two sequences with Dynalign in the RNAstructure graphical user interface Alternative Protocol 3: Running TurboFold on the command line.

Collapse

Ali SE, Mittal A, Mathews DH. RNA Secondary Structure Analysis Using RNAstructure. Curr Protoc 2023;3:e846. [PMID: 37487054 PMCID: PMC11267465 DOI: 10.1002/cpz1.846] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/26/2023]

Network-Based Structural Alignment of RNA Sequences Using TOPAS. Methods Mol Biol 2023;2586:147-162. [PMID: 36705903 DOI: 10.1007/978-1-0716-2768-6_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Akiyama M, Sakakibara Y. Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning. NAR Genom Bioinform 2022;4:lqac012. [PMID: 35211670 PMCID: PMC8862729 DOI: 10.1093/nargab/lqac012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2021] [Revised: 01/08/2022] [Accepted: 02/05/2022] [Indexed: 01/17/2023] Open

Zambrano RAI, Hernandez-Perez C, Takahashi MK. RNA Structure Prediction, Analysis, and Design: An Introduction to Web-Based Tools. Methods Mol Biol 2022;2518:253-269. [PMID: 35666450 DOI: 10.1007/978-1-0716-2421-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]

Barcytė D, Eikrem W, Engesmo A, Seoane S, Wohlmann J, Horák A, Yurchenko T, Eliáš M. Olisthodiscus represents a new class of Ochrophyta. JOURNAL OF PHYCOLOGY 2021;57:1094-1118. [PMID: 33655496 DOI: 10.1111/jpy.13155] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/09/2020] [Revised: 12/08/2020] [Accepted: 01/04/2021] [Indexed: 06/12/2023]

Chen CC, Jeong H, Qian X, Yoon BJ. TOPAS: network-based structural alignment of RNA sequences. Bioinformatics 2020;35:2941-2948. [PMID: 30629122 DOI: 10.1093/bioinformatics/btz001] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2017] [Revised: 12/07/2018] [Accepted: 01/04/2019] [Indexed: 11/14/2022] Open

Abstract

MOTIVATION

For many RNA families, the secondary structure is known to be better conserved among the member RNAs compared to the primary sequence. For this reason, it is important to consider the underlying folding structures when aligning RNA sequences, especially for those with relatively low sequence identity. Given a set of RNAs with unknown structures, simultaneous RNA alignment and folding algorithms aim to accurately align the RNAs by jointly predicting their consensus secondary structure and the optimal sequence alignment. Despite the improved accuracy of the resulting alignment, the computational complexity of simultaneous alignment and folding for a pair of RNAs is O(N6), which is too costly to be used for large-scale analysis.

RESULTS

In order to address this shortcoming, in this work, we propose a novel network-based scheme for pairwise structural alignment of RNAs. The proposed algorithm, TOPAS, builds on the concept of topological networks that provide structural maps of the RNAs to be aligned. For each RNA sequence, TOPAS first constructs a topological network based on the predicted folding structure, which consists of sequential edges and structural edges weighted by the base-pairing probabilities. The obtained networks can then be efficiently aligned by using probabilistic network alignment techniques, thereby yielding the structural alignment of the RNAs. The computational complexity of our proposed method is significantly lower than that of the Sankoff-style dynamic programming approach, while yielding favorable alignment results. Furthermore, another important advantage of the proposed algorithm is its capability of handling RNAs with pseudoknots while predicting the RNA structural alignment. We demonstrate that TOPAS generally outperforms previous RNA structural alignment methods on RNA benchmarks in terms of both speed and accuracy.

AVAILABILITY AND IMPLEMENTATION

Source code of TOPAS and the benchmark data used in this paper are available at https://github.com/bjyoontamu/TOPAS.

Collapse

Tan Z, Fu Y, Sharma G, Mathews DH. TurboFold II: RNA structural alignment and secondary structure prediction informed by multiple homologs. Nucleic Acids Res 2017;45:11570-11581. [PMID: 29036420 PMCID: PMC5714223 DOI: 10.1093/nar/gkx815] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2017] [Accepted: 09/12/2017] [Indexed: 12/26/2022] Open

Tahi F, Du T Tran V, Boucheham A. In Silico Prediction of RNA Secondary Structure. Methods Mol Biol 2017;1543:145-168. [PMID: 28349425 DOI: 10.1007/978-1-4939-6716-2_7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Bridging the gap between in vitro and in vivo RNA folding. Q Rev Biophys 2016;49:e10. [PMID: 27658939 DOI: 10.1017/s003358351600007x] [Citation(s) in RCA: 98] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

Abstract

Deciphering the folding pathways and predicting the structures of complex three-dimensional biomolecules is central to elucidating biological function. RNA is single-stranded, which gives it the freedom to fold into complex secondary and tertiary structures. These structures endow RNA with the ability to perform complex chemistries and functions ranging from enzymatic activity to gene regulation. Given that RNA is involved in many essential cellular processes, it is critical to understand how it folds and functions in vivo. Within the last few years, methods have been developed to probe RNA structures in vivo and genome-wide. These studies reveal that RNA often adopts very different structures in vivo and in vitro, and provide profound insights into RNA biology. Nonetheless, both in vitro and in vivo approaches have limitations: studies in the complex and uncontrolled cellular environment make it difficult to obtain insight into RNA folding pathways and thermodynamics, and studies in vitro often lack direct cellular relevance, leaving a gap in our knowledge of RNA folding in vivo. This gap is being bridged by biophysical and mechanistic studies of RNA structure and function under conditions that mimic the cellular environment. To date, most artificial cytoplasms have used various polymers as molecular crowding agents and a series of small molecules as cosolutes. Studies under such in vivo-like conditions are yielding fresh insights, such as cooperative folding of functional RNAs and increased activity of ribozymes. These observations are accounted for in part by molecular crowding effects and interactions with other molecules. In this review, we report milestones in RNA folding in vitro and in vivo and discuss ongoing experimental and computational efforts to bridge the gap between these two conditions in order to understand how RNA folds in the cell.

Collapse

Prediction of Secondary Structures Conserved in Multiple RNA Sequences. Methods Mol Biol 2016;1490:35-50. [PMID: 27665591 DOI: 10.1007/978-1-4939-6433-8_3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]

DiChiacchio L, Sloma MF, Mathews DH. AccessFold: predicting RNA-RNA interactions with consideration for competing self-structure. Bioinformatics 2015;32:1033-9. [PMID: 26589271 DOI: 10.1093/bioinformatics/btv682] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2014] [Accepted: 11/15/2015] [Indexed: 01/09/2023] Open

Abstract

MOTIVATION

There are numerous examples of RNA-RNA complexes, including microRNA-mRNA and small RNA-mRNA duplexes for regulation of translation, guide RNA interactions with target RNA for post-transcriptional modification and small nuclear RNA duplexes for splicing. Predicting the base pairs formed between two interacting sequences remains difficult, at least in part because of the competition between unimolecular and bimolecular structure.

RESULTS

Two algorithms were developed for improved prediction of bimolecular RNA structure that consider the competition between self-structure and bimolecular structure. These algorithms utilize two novel approaches to evaluate accessibility: free energy density minimization and pseudo-energy minimization. Free energy density minimization minimizes the folding free energy change per nucleotide involved in an intermolecular secondary structure. Pseudo-energy minimization (called AccessFold) minimizes the sum of free energy change and a pseudo-free energy penalty for bimolecular pairing of nucleotides that are unlikely to be accessible for bimolecular structure. The pseudo-free energy, derived from unimolecular pairing probabilities, is applied per nucleotide in bimolecular pairs, and this approach is able to predict binding sites that are split by unimolecular structures. A benchmark set of 17 bimolecular RNA structures was assembled to assess structure prediction. Pseudo-energy minimization provides a statistically significant improvement in sensitivity over the method that was found in a benchmark to be the most accurate previously available method, with an improvement from 36.8% to 57.8% in mean sensitivity for base pair prediction.

AVAILABILITY AND IMPLEMENTATION

Pseudo-energy minimization is available for download as AccessFold, under an open-source license and as part of the RNAstructure package, at: http://rna.urmc.rochester.edu/RNAstructure.html

CONTACT

david_mathews@urmc.rochester.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Fu Y, Sharma G, Mathews DH. Dynalign II: common secondary structure prediction for RNA homologs with domain insertions. Nucleic Acids Res 2015;42:13939-48. [PMID: 25416799 PMCID: PMC4267632 DOI: 10.1093/nar/gku1172] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023] Open

Mathews DH. RNA Secondary Structure Analysis Using RNAstructure. ACTA ACUST UNITED AC 2014;46:12.6.1-12.6.25. [DOI: 10.1002/0471250953.bi1206s46] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Mathews DH. Using the RNAstructure Software Package to Predict Conserved RNA Structures. ACTA ACUST UNITED AC 2014;46:12.4.1-12.4.22. [PMID: 24939126 DOI: 10.1002/0471250953.bi1204s46] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Stern HA, Mathews DH. Accelerating calculations of RNA secondary structure partition functions using GPUs. Algorithms Mol Biol 2013;8:29. [PMID: 24180434 PMCID: PMC4175106 DOI: 10.1186/1748-7188-8-29] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2013] [Accepted: 10/14/2013] [Indexed: 01/06/2023] Open

Abstract

Background

RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probababilities using parallel graphics processing units (GPUs).

Results

Calculation of the probabilities of base pairs in RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision, and for some hardware, restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code.

Conclusions

Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision. The source code is integrated into the RNAstructure software package and available for download at http://rna.urmc.rochester.edu.

Collapse

Bellaousov S, Reuter JS, Seetin MG, Mathews DH. RNAstructure: Web servers for RNA secondary structure prediction and analysis. Nucleic Acids Res 2013;41:W471-4. [PMID: 23620284 PMCID: PMC3692136 DOI: 10.1093/nar/gkt290] [Citation(s) in RCA: 298] [Impact Index Per Article: 24.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open

Will S, Joshi T, Hofacker IL, Stadler PF, Backofen R. LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. RNA (NEW YORK, N.Y.) 2012;18:900-14. [PMID: 22450757 PMCID: PMC3334699 DOI: 10.1261/rna.029041.111] [Citation(s) in RCA: 269] [Impact Index Per Article: 20.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/01/2011] [Accepted: 01/18/2012] [Indexed: 05/18/2023]

Hamada M, Asai K. A classification of bioinformatics algorithms from the viewpoint of maximizing expected accuracy (MEA). J Comput Biol 2012;19:532-49. [PMID: 22313125 DOI: 10.1089/cmb.2011.0197] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Seetin MG, Mathews DH. TurboKnot: rapid prediction of conserved RNA secondary structures including pseudoknots. ACTA ACUST UNITED AC 2012;28:792-8. [PMID: 22285566 DOI: 10.1093/bioinformatics/bts044] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]

Seetin MG, Mathews DH. RNA structure prediction: an overview of methods. Methods Mol Biol 2012;905:99-122. [PMID: 22736001 DOI: 10.1007/978-1-61779-949-5_8] [Citation(s) in RCA: 89] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]

Wang X, Cai Y, Sun Y, Knight R, Mai V. Secondary structure information does not improve OTU assignment for partial 16s rRNA sequences. ISME JOURNAL 2011;6:1277-80. [PMID: 22170430 DOI: 10.1038/ismej.2011.187] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Harmanci AO, Sharma G, Mathews DH. TurboFold: iterative probabilistic estimation of secondary structures for multiple RNA sequences. BMC Bioinformatics 2011;12:108. [PMID: 21507242 PMCID: PMC3120699 DOI: 10.1186/1471-2105-12-108] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2010] [Accepted: 04/20/2011] [Indexed: 01/07/2023] Open

Abstract

Background

The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented.

Results

TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold.

TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms.

Conclusions

TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.

Collapse

Sahraeian SME, Yoon BJ. PicXAA-R: efficient structural alignment of multiple RNA sequences using a greedy approach. BMC Bioinformatics 2011;12 Suppl 1:S38. [PMID: 21342569 PMCID: PMC3044294 DOI: 10.1186/1471-2105-12-s1-s38] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023] Open

Xu Z, Mathews DH. Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences. ACTA ACUST UNITED AC 2010;27:626-32. [PMID: 21193521 DOI: 10.1093/bioinformatics/btq726] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Andronescu M, Condon A, Hoos HH, Mathews DH, Murphy KP. Computational approaches for RNA energy parameter estimation. RNA (NEW YORK, N.Y.) 2010;16:2304-18. [PMID: 20940338 PMCID: PMC2995392 DOI: 10.1261/rna.1950510] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/07/2023]

Mathews DH, Moss WN, Turner DH. Folding and finding RNA secondary structure. Cold Spring Harb Perspect Biol 2010;2:a003665. [PMID: 20685845 DOI: 10.1101/cshperspect.a003665] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Taneda A. Multi-objective pairwise RNA sequence alignment. Bioinformatics 2010;26:2383-90. [DOI: 10.1093/bioinformatics/btq439] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Engelen S, Tahi F. Tfold: efficient in silico prediction of non-coding RNA secondary structures. Nucleic Acids Res 2010;38:2453-66. [PMID: 20047957 PMCID: PMC2853104 DOI: 10.1093/nar/gkp1067] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2009] [Revised: 10/30/2009] [Accepted: 11/02/2009] [Indexed: 11/12/2022] Open

Reuter JS, Mathews DH. RNAstructure: software for RNA secondary structure prediction and analysis. BMC Bioinformatics 2010;11:129. [PMID: 20230624 PMCID: PMC2984261 DOI: 10.1186/1471-2105-11-129] [Citation(s) in RCA: 1399] [Impact Index Per Article: 93.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2009] [Accepted: 03/15/2010] [Indexed: 11/16/2022] Open

Hamada M, Sato K, Kiryu H, Mituyama T, Asai K. CentroidAlign: fast and accurate aligner for structured RNAs by maximizing expected sum-of-pairs score. Bioinformatics 2009;25:3236-43. [DOI: 10.1093/bioinformatics/btp580] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open

Harmanci AO, Sharma G, Mathews DH. Stochastic sampling of the RNA structural alignment space. Nucleic Acids Res 2009;37:4063-75. [PMID: 19429694 PMCID: PMC2709569 DOI: 10.1093/nar/gkp276] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Taneda A. An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast. BMC Bioinformatics 2008;9:521. [PMID: 19061486 PMCID: PMC2630964 DOI: 10.1186/1471-2105-9-521] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2008] [Accepted: 12/05/2008] [Indexed: 11/30/2022] Open

Abstract

Background

Aligning RNA sequences with low sequence identity has been a challenging problem since such a computation essentially needs an algorithm with high complexities for taking structural conservation into account. Although many sophisticated algorithms for the purpose have been proposed to date, further improvement in efficiency is necessary to accelerate its large-scale applications including non-coding RNA (ncRNA) discovery.

Results

We developed a new genetic algorithm, Cofolga2, for simultaneously computing pairwise RNA sequence alignment and consensus folding, and benchmarked it using BRAliBase 2.1. The benchmark results showed that our new algorithm is accurate and efficient in both time and memory usage. Then, combining with the originally trained SVM, we applied the new algorithm to novel ncRNA discovery where we compared S. cerevisiae genome with six related genomes in a pairwise manner. By focusing our search to the relatively short regions (50 bp to 2,000 bp) sandwiched by conserved sequences, we successfully predict 714 intergenic and 1,311 sense or antisense ncRNA candidates, which were found in the pairwise alignments with stable consensus secondary structure and low sequence identity (≤ 50%). By comparing with the previous predictions, we found that > 92% of the candidates is novel candidates. The estimated rate of false positives in the predicted candidates is 51%. Twenty-five percent of the intergenic candidates has supports for expression in cell, i.e. their genomic positions overlap those of the experimentally determined transcripts in literature. By manual inspection of the results, moreover, we obtained four multiple alignments with low sequence identity which reveal consensus structures shared by three species/sequences.

Conclusion

The present method gives an efficient tool complementary to sequence-alignment-based ncRNA finders.

Collapse

Seemann SE, Gorodkin J, Backofen R. Unifying evolutionary and thermodynamic information for RNA folding of multiple alignments. Nucleic Acids Res 2008;36:6355-62. [PMID: 18836192 PMCID: PMC2582601 DOI: 10.1093/nar/gkn544] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Mathews DH. RNA secondary structure analysis using RNAstructure. ACTA ACUST UNITED AC 2008;Chapter 12:Unit 12.6. [PMID: 18428759 DOI: 10.1002/0471250953.bi1206s13] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]