1
Entzian G, Raden M. pourRNA-a time- and memory-efficient approach for the guided exploration of RNA energy landscapes. Bioinformatics 2020; 36:462-469. [PMID: 31350881 DOI: 10.1093/bioinformatics/btz583] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/30/2019] [Revised: 06/25/2019] [Accepted: 07/22/2019] [Indexed: 01/03/2023]
Abstract
MOTIVATION: The folding dynamics of ribonucleic acids (RNAs) are typically studied via coarse-grained models of the underlying energy landscape to cope with the exponential growth of the RNA secondary structure space. Still, studies of exact folding kinetics based on gradient basin abstractions are currently limited to short sequence lengths due to vast memory requirements. In order to compute exact transition rates between gradient basins, state-of-the-art approaches apply global flooding schemes that require memorizing the whole structure space at once. pourRNA tackles this problem via local flooding techniques where memorization is limited to the structure ensembles of individual gradient basins.
RESULTS: Compared to the only available tool for exact gradient basin-based macro-state transition rates (namely barriers), pourRNA computes the same exact transition rates up to 10 times faster and requires two orders of magnitude less memory for sequences that are still computationally accessible for exhaustive enumeration. Parallelized computation as well as additional heuristics further speed up computations while still producing high-quality transition model approximations. The introduced heuristics enable a guided trade-off between model quality and required computational resources. We introduce and evaluate a macroscopic direct path heuristic to efficiently compute refolding energy barrier estimations for the co-transcriptionally trapped RNA sv11 of length 115 nt. Finally, we also show how pourRNA can be used to identify folding funnels and their respective energetically lowest minima.
AVAILABILITY AND IMPLEMENTATION: pourRNA is freely available at https://github.com/ViennaRNA/pourRNA.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
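To make the notion of a gradient basin concrete, the following is a minimal sketch, not pourRNA's implementation. It assumes a hypothetical toy landscape (six labelled states on a chain with made-up energies) instead of RNA secondary structures, and it enumerates every state up front, which is exactly what local flooding avoids. The names `gradient_basins`, `energy`, and `neighbors` are illustrative only.

```python
from collections import defaultdict

def gradient_basins(energy, neighbors):
    """Assign every state to the local minimum reached by steepest descent."""
    def descend(state):
        while True:
            # Pick the lowest-energy neighbor; ties are broken by state label
            # so that the walk is deterministic.
            best = min(neighbors[state], key=lambda s: (energy[s], s), default=state)
            if (energy[best], best) >= (energy[state], state):
                return state  # no better neighbor: this is a local minimum
            state = best

    basins = defaultdict(set)
    for s in energy:
        basins[descend(s)].add(s)
    return dict(basins)

# Hypothetical toy landscape: a chain of six states with two local minima.
energy = {"a": 3.0, "b": 1.0, "c": 2.5, "d": 4.0, "e": 0.5, "f": 2.0}
chain = list(energy)
neighbors = {
    s: [chain[j] for j in (i - 1, i + 1) if 0 <= j < len(chain)]
    for i, s in enumerate(chain)
}
basins = gradient_basins(energy, neighbors)
```

Here the states `a`, `b`, `c` drain into the minimum `b`, and `d`, `e`, `f` drain into `e`. Transition rates between macro-states would then be accumulated over neighboring state pairs that lie in different basins.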
Affiliation(s)
- Gregor Entzian
  - Department of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Martin Raden
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg 79110, Germany
2
Villa F, Panel N, Chen X, Simonson T. Adaptive landscape flattening in amino acid sequence space for the computational design of protein:peptide binding. J Chem Phys 2018; 149:072302. [PMID: 30134674 DOI: 10.1063/1.5022249] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 12/25/2022]
Abstract
For the high-throughput design of protein:peptide binding, one must explore a vast space of amino acid sequences in search of low binding free energies. This complex problem is usually addressed with either simple heuristic scoring or expensive sequence enumeration schemes. Far more efficient than enumeration is a recent Monte Carlo approach that adaptively flattens the energy landscape in sequence space of the unbound peptide and provides formally exact binding free energy differences. The method allows the binding free energy to be used directly as the design criterion. We propose several improvements that allow still more efficient sampling and can address larger design problems. They include the use of Replica Exchange Monte Carlo and landscape flattening for both the unbound and bound peptides. We used the method to design peptides that bind to the PDZ domain of the Tiam1 signaling protein and could serve as inhibitors of its activity. Four peptide positions were allowed to mutate freely. Almost 75 000 peptide variants were processed in two simulations of 10^9 steps each that used 1 CPU hour on a desktop machine. 96% of the theoretical sequence space was sampled. The relative binding free energies agreed qualitatively with values from experiment. The sampled sequences agreed qualitatively with an experimental library of Tiam1-binding peptides. The main assumption limiting accuracy is the fixed backbone approximation, which could be alleviated in future work by using increased computational resources and multi-backbone designs.
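The Replica Exchange Monte Carlo ingredient mentioned above relies on a standard Metropolis test for swapping configurations between two replicas. The sketch below shows only that textbook acceptance rule; it is not the authors' code, it omits the adaptive flattening bias entirely, and the function name, temperatures, and energies are assumptions chosen for illustration.

```python
import math

def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability of exchanging the states of two replicas.

    beta_i, beta_j: inverse temperatures 1/(kT) of replicas i and j
    energy_i, energy_j: current energies of the states held by i and j
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

# Hypothetical values: kT = 0.6 and 1.2 kcal/mol, energies in kcal/mol.
beta_cold, beta_hot = 1 / 0.6, 1 / 1.2
p_favorable = swap_acceptance(beta_cold, beta_hot, -5.0, -8.0)    # hot replica holds the lower energy
p_unfavorable = swap_acceptance(beta_cold, beta_hot, -8.0, -5.0)  # cold replica already holds it
```

A swap that moves the lower-energy state to the colder replica is always accepted, while the reverse swap is accepted only with probability exp(delta), which keeps each replica sampling its own Boltzmann distribution.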
Affiliation(s)
- Francesco Villa
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Nicolas Panel
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Xingyu Chen
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
3
Ceolin L, Romitti M, Rodrigues Siqueira D, Vaz Ferreira C, Oliboni Scapineli J, Assis-Brazil B, Vieira Maximiano R, Dias Amarante T, de Souza Nunes MC, Weber G, Maia AL. Effect of 3'UTR RET Variants on RET mRNA Secondary Structure and Disease Presentation in Medullary Thyroid Carcinoma. PLoS One 2016; 11:e0147840. [PMID: 26829565 PMCID: PMC4734678 DOI: 10.1371/journal.pone.0147840] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 06/12/2015] [Accepted: 01/08/2016] [Indexed: 12/21/2022]
Abstract
Background: The RET S836S variant has been associated with early onset and increased risk for metastatic disease in medullary thyroid carcinoma (MTC). However, the mechanism by which this variant modulates MTC pathogenesis is still open to discussion. Of interest, strong linkage disequilibrium (LD) between RET S836S and 3'UTR variants has been reported in Hirschsprung's disease patients.
Objective: To evaluate the frequency of the RET 3'UTR variants (rs76759170 and rs3026785) in MTC patients and to determine whether these variants are in LD with the S836S polymorphism.
Methods: Our sample comprised 152 patients with sporadic MTC. The RET S836S and 3'UTR (rs76759170 and rs3026785) variants were genotyped using Custom TaqMan Genotyping Assays. Haplotypes were inferred using the PHASE 2.1 program. RET mRNA structure was assessed with the Vienna Package.
Results: The mean age at MTC diagnosis was 48.5±15.5 years and 57.9% were women. The minor allele frequencies of RET polymorphisms were as follows: S836S, 5.6%; rs76759170, 5.6%; rs3026785, 6.2%. We observed a strong LD among S836S and 3'UTR variants (|D'| = -1, r² = 1 and |D'| = -1, r² = 0.967). Patients harboring the S836S/3'UTR variants presented a higher percentage of lymph node and distant metastasis (P = 0.013 and P<0.001, respectively). Accordingly, RNA folding analyses demonstrated different RNA secondary structure predictions for WT(TCCGT), S836S(TTCGT) or 3'UTR(GTCAC) haplotypes. The S836S/3'UTR haplotype presented a greater number of double-helix sections and lower minimum free energy when compared to the wild-type haplotype, suggesting that these variants provide the most thermodynamically stable mRNA structure, which may have functional consequences on the rate of mRNA degradation.
Conclusion: The RET S836S polymorphism is in LD with 3'UTR variants. In silico analyses indicate that the 3'UTR variants may affect the secondary structure of RET mRNA, suggesting that these variants might play a role in posttranscriptional control of the RET transcripts.
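For readers unfamiliar with the LD statistics quoted in the Results, the sketch below shows the standard definitions of D, D' and r² for two biallelic loci. It is an illustration only: the function name is an assumption, the haplotype frequency in the example is hypothetical (it is not taken from the study), and both allele frequencies are assumed to lie strictly between 0 and 1.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    # D' normalizes D by the largest magnitude it could reach given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical case: two minor alleles at 5.6% that always occur on the same haplotype.
d, d_prime, r2 = linkage_disequilibrium(p_ab=0.056, p_a=0.056, p_b=0.056)
```

In this fully coupled case both D' and r² come out at 1 (up to floating-point rounding). When the two allele frequencies differ, r² drops below 1 even if D' stays at 1.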
Affiliation(s)
- Lucieli Ceolin
  - Thyroid Section, Endocrine Division, Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Mirian Romitti
  - Thyroid Section, Endocrine Division, Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Débora Rodrigues Siqueira
  - Thyroid Section, Endocrine Division, Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Carla Vaz Ferreira
  - Thyroid Section, Endocrine Division, Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Jessica Oliboni Scapineli
  - Thyroid Section, Endocrine Division, Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Beatriz Assis-Brazil
  - Pathology Department, Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Rodolfo Vieira Maximiano
  - Department of Physics, Computational Biophysics Group, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Tauanne Dias Amarante
  - Department of Physics, Computational Biophysics Group, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Miriam Celi de Souza Nunes
  - Department of Physics, Computational Biophysics Group, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Gerald Weber
  - Department of Physics, Computational Biophysics Group, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Ana Luiza Maia
  - Thyroid Section, Endocrine Division, Hospital de Clínicas de Porto Alegre, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
4
Chitsaz H, Forouzmand E, Haffari G. An efficient algorithm for upper bound on the partition function of nucleic acids. J Comput Biol 2013; 20:486-94. [PMID: 23829650 DOI: 10.1089/cmb.2013.0003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2023]
Abstract
It has been shown that the minimum free-energy structure for RNAs and RNA-RNA interactions is often incorrect due to inaccuracies in the energy parameters and inherent limitations of the energy model. In contrast, ensemble-based quantities such as melting temperature and equilibrium concentrations can be more reliably predicted. Even structure prediction by sampling from the ensemble and clustering those structures by Sfold has proven to be more reliable than minimum free-energy structure prediction. The main obstacle for ensemble-based approaches is the computational complexity of the partition function and base-pairing probabilities. For instance, the space complexity of the partition function for RNA-RNA interaction is O(n^4) and the time complexity is O(n^6), which is prohibitively large. Our goal in this article is to present a fast algorithm, based on sparse folding, to calculate an upper bound on the partition function. Our work is based on the recent algorithm of Hazan and Jaakkola (2012). The space complexity of our algorithm is the same as that of sparse folding algorithms, and the time complexity of our algorithm is O(MFE(n)ℓ) for single RNA and O(MFE(m, n)ℓ) for RNA-RNA interaction in practice, in which MFE is the running time of sparse folding and ℓ≤n (ℓ≤n+m) is a sequence-dependent parameter.
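As a reminder of what the partition function measures, the sketch below computes it by brute force over a small, explicitly listed set of structures and derives Boltzmann probabilities from it. This is not the sparse upper-bound algorithm of the article: the energies are hypothetical, the function name is illustrative, and real structure ensembles are far too large to enumerate this way.

```python
import math

# Thermal energy RT in kcal/mol at 37 °C, using R = 0.0019872 kcal/(mol·K).
RT = 0.0019872 * 310.15

def boltzmann_probabilities(energies, rt=RT):
    """Return the partition function Z and each structure's equilibrium probability."""
    weights = [math.exp(-e / rt) for e in energies]
    z = sum(weights)
    return z, [w / z for w in weights]

# Hypothetical free energies (kcal/mol) of four structures; the first is the MFE structure.
energies = [-3.0, -2.8, -2.7, 0.0]
z, probs = boltzmann_probabilities(energies)
```

With these made-up values the minimum free-energy structure is the single most probable one, yet it holds less than half of the total probability mass, which illustrates why ensemble-based quantities can be more informative than the MFE structure alone.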
Affiliation(s)
- Hamidreza Chitsaz
  - Department of Computer Science, Wayne State University, Detroit, Michigan 48202, USA
5
Sahoo S, Albrecht AA. Approximating the set of local minima in partial RNA folding landscapes. Bioinformatics 2011; 28:523-30. [PMID: 22210870 DOI: 10.1093/bioinformatics/btr715] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]
Abstract
MOTIVATION: We study a stochastic method for approximating the set of local minima in partial RNA folding landscapes associated with a bounded-distance neighbourhood of folding conformations. The conformations are limited to RNA secondary structures without pseudoknots. The method aims at exploring partial energy landscapes pL induced by folding simulations and their underlying neighbourhood relations. It combines an approximation of the number of local optima devised by Garnier and Kallel (2002) with a run-time estimation for identifying sets of local optima established by Reeves and Eremeev (2004).
RESULTS: The method is tested on nine sequences of length between 50 nt and 400 nt, which allows us to compare the results with data generated by RNAsubopt and subsequent barrier tree calculations. On the nine sequences, the method captures on average 92% of local minima with settings designed for a target of 95%. The run-time of the heuristic can be estimated by O(n^2 D ν ln ν), where n is the sequence length, ν is the number of local minima in the partial landscape pL under consideration and D is the maximum number of steepest descent steps in attraction basins associated with pL.
Affiliation(s)
- S Sahoo
  - Centre for Cancer Research and Cell Biology, Queen's University Belfast, Belfast BT9 7BL, UK