1
Qiu X. Robust RNA secondary structure prediction with a mixture of deep learning and physics-based experts. Biol Methods Protoc 2025; 10:bpae097. [PMID: 39811444 PMCID: PMC11729747 DOI: 10.1093/biomethods/bpae097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Revised: 12/01/2024] [Accepted: 12/25/2024] [Indexed: 01/16/2025] Open
Abstract
A mixture-of-experts (MoE) approach has been developed to mitigate the poor out-of-distribution (OOD) generalization of deep learning (DL) models for single-sequence-based prediction of RNA secondary structure. The main idea behind this approach is to use DL models for in-distribution (ID) test sequences to leverage their superior ID performances, while relying on physics-based models for OOD sequences to ensure robust predictions. One key ingredient of the pipeline, named MoEFold2D, is automated ID/OOD detection via consensus analysis of an ensemble of DL model predictions without requiring access to training data during inference. Specifically, motivated by the clustered distribution of known RNA structures, a collection of distinct DL models is trained by iteratively leaving one cluster out. Each DL model hence serves as an expert on all but one cluster in the training data. Consequently, for an ID sequence, all but one DL model makes accurate predictions consistent with one another, while an OOD sequence yields highly inconsistent predictions among all DL models. Through consensus analysis of DL predictions, test sequences are categorized as ID or OOD. ID sequences are subsequently predicted by averaging the DL models in consensus, and OOD sequences are predicted using physics-based models. Instead of remediating generalization gaps with alternative approaches such as transfer learning and sequence alignment, MoEFold2D circumvents unpredictable ID-OOD gaps and combines the strengths of DL and physics-based models to achieve accurate ID and robust OOD predictions.
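The consensus step described in this abstract can be illustrated with a minimal sketch. It assumes each DL model's prediction is a set of base pairs, scores agreement between two models with an F1 overlap, and uses a hypothetical threshold of 0.8. The function names and the decision rule are illustrative assumptions, not the MoEFold2D implementation.

```python
def pair_f1(a, b):
    """F1 overlap between two predictions, each a set of (i, j) base pairs."""
    if not a and not b:
        return 1.0  # two empty predictions agree trivially
    shared = len(a & b)
    if shared == 0:
        return 0.0
    precision = shared / len(a)
    recall = shared / len(b)
    return 2 * precision * recall / (precision + recall)


def classify(predictions, threshold=0.8):
    """Label a sequence "ID" or "OOD" from an ensemble of predictions.

    Assumed rule (hypothetical): a model is "in consensus" if it agrees
    with all but at most one of the other models; the sequence is ID if
    all but at most one model is in consensus.
    """
    n = len(predictions)
    agree = [
        sum(
            pair_f1(predictions[i], predictions[j]) >= threshold
            for j in range(n)
            if j != i
        )
        for i in range(n)
    ]
    consensus = [i for i in range(n) if agree[i] >= n - 2]
    if len(consensus) >= n - 1:
        return "ID", consensus
    return "OOD", []
```

In this sketch, an ID result returns the indices of the models in consensus, whose predictions would then be averaged; an OOD result would instead be routed to a physics-based predictor.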
Affiliation(s)
- Xiangyun Qiu
  - Department of Physics, George Washington University, Washington, DC 20052, United States
2
Zhu M, Zuber J, Tan Z, Sharma G, Mathews DH. DecoyFinder: Identification of Contaminants in Sets of Homologous RNA Sequences. bioRxiv 2024:2024.10.12.618037. [PMID: 39464058 PMCID: PMC11507696 DOI: 10.1101/2024.10.12.618037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
Motivation: RNA structure is essential for the function of many non-coding RNAs. Using multiple homologous sequences, which share structure and function, secondary structure can be predicted with much higher accuracy than with a single sequence. It can be difficult, however, to establish a set of homologous sequences when their structure is not yet known. We developed a method to identify sequences in a set of putative homologs that are in fact non-homologs.
Results: Previously, we developed TurboFold to estimate conserved structure using multiple, unaligned RNA homologs. Here, we report that the positive predictive value of TurboFold is significantly reduced by the presence of contamination by non-homologous sequences, although the reduction is less than 1%. We developed a method called DecoyFinder, which applies machine learning trained with features determined by TurboFold, to detect sequences that are not homologous with the other sequences in the set. This method can identify approximately 45% of non-homologous sequences, at a rate of 5% misidentification of true homologous sequences.
Availability: DecoyFinder and TurboFold are incorporated in RNAstructure, which is provided for free and open source under the GPL V2 license. It can be downloaded at http://rna.urmc.rochester.edu/RNAstructure.html.
Affiliation(s)
- Mingyi Zhu
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
- Jeffrey Zuber
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
- Zhen Tan
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
- Gaurav Sharma
  - University of Rochester, Department of Electrical and Computer Engineering, Rochester, NY, United States
  - University of Rochester, Department of Computer Science, Rochester, NY, United States
- David H Mathews
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, United States
  - Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, United States
3
von Löhneysen S, Mörl M, Stadler PF. Limits of experimental evidence in RNA secondary structure prediction. Frontiers in Bioinformatics 2024; 4:1346779. [PMID: 38456157 PMCID: PMC10918467 DOI: 10.3389/fbinf.2024.1346779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 01/09/2024] [Indexed: 03/09/2024] Open
Affiliation(s)
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Mario Mörl
  - Institute for Biochemistry, Leipzig University, Leipzig, Germany
- Peter F. Stadler
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
  - Competence Center for Scalable Data Analytics and Artificial Intelligence, School of Embedded and Compositive Artificial Intelligence (SECAI), Leipzig University, Leipzig, Germany
  - Department of Theoretical Chemistry, University of Vienna, Wien, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Center for Non-Coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  - Santa Fe Institute, Santa Fe, NM, United States
4
Zuber J, Mathews DH. Estimating RNA Secondary Structure Folding Free Energy Changes with efn2. Methods Mol Biol 2024; 2726:1-13. [PMID: 38780725 DOI: 10.1007/978-1-0716-3519-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
A number of analyses require estimates of the folding free energy changes of specific RNA secondary structures. These predictions are often based on a set of nearest neighbor parameters that models the folding stability of an RNA secondary structure as the sum of folding stabilities of the structural elements that comprise the secondary structure. In the software suite RNAstructure, the free energy change calculation is implemented in the program efn2. The efn2 program estimates the folding free energy change and the experimental uncertainty in the folding free energy change. It can be run through the graphical user interface for RNAstructure, from the command line, or a web server. This chapter provides detailed protocols for using efn2.
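The additive nearest neighbor model described in this abstract can be sketched in a few lines. The element names and the free energy and uncertainty values below are hypothetical placeholders, not actual nearest neighbor parameters, and combining uncertainties in quadrature assumes uncorrelated errors, which is a simplification that may differ from how efn2 propagates uncertainty.

```python
import math

# Hypothetical (name, free energy, uncertainty) entries in kcal/mol,
# for illustration only; these are not real nearest neighbor parameters.
ELEMENTS = [
    ("helix stack 1", -2.1, 0.1),
    ("helix stack 2", -3.3, 0.2),
    ("hairpin loop", 4.5, 0.3),
]


def total_free_energy(elements):
    """Sum element free energies; combine uncertainties in quadrature.

    The quadrature rule assumes independent errors (an assumption of
    this sketch, not a statement about efn2).
    """
    total = sum(dg for _, dg, _ in elements)
    sigma = math.sqrt(sum(err ** 2 for _, _, err in elements))
    return total, sigma


total, sigma = total_free_energy(ELEMENTS)
print(f"Estimated folding free energy: {total:.2f} +/- {sigma:.2f} kcal/mol")
```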
Affiliation(s)
- Jeffrey Zuber
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, USA
- David H Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, USA
  - Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
5
Metkar M, Pepin CS, Moore MJ. Tailor made: the art of therapeutic mRNA design. Nat Rev Drug Discov 2024; 23:67-83. [PMID: 38030688 DOI: 10.1038/s41573-023-00827-x] [Citation(s) in RCA: 38] [Impact Index Per Article: 38.0] [Accepted: 10/09/2023] [Indexed: 12/01/2023]
Abstract
mRNA medicine is a new and rapidly developing field in which the delivery of genetic information in the form of mRNA is used to direct therapeutic protein production in humans. This approach, which allows for the quick and efficient identification and optimization of drug candidates for both large populations and individual patients, has the potential to revolutionize the way we prevent and treat disease. A key feature of mRNA medicines is their high degree of designability, although the design choices involved are complex. Maximizing the production of therapeutic proteins from mRNA medicines requires a thorough understanding of how nucleotide sequence, nucleotide modification and RNA structure interplay to affect translational efficiency and mRNA stability. In this Review, we describe the principles that underlie the physical stability and biological activity of mRNA and emphasize their relevance to the myriad considerations that factor into therapeutic mRNA design.
6
Pham TM, Miffin T, Sun H, Sharp KK, Wang X, Zhu M, Hoshika S, Peterson RJ, Benner SA, Kahn JD, Mathews DH. DNA Structure Design Is Improved Using an Artificially Expanded Alphabet of Base Pairs Including Loop and Mismatch Thermodynamic Parameters. ACS Synth Biol 2023; 12:2750-2763. [PMID: 37671922 PMCID: PMC10510751 DOI: 10.1021/acssynbio.3c00358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Indexed: 09/07/2023]
Abstract
We show that in silico design of DNA secondary structures is improved by extending the base pairing alphabet beyond A-T and G-C to include the pair between 2-amino-8-(1'-β-d-2'-deoxyribofuranosyl)-imidazo-[1,2-a]-1,3,5-triazin-(8H)-4-one and 6-amino-3-(1'-β-d-2'-deoxyribofuranosyl)-5-nitro-(1H)-pyridin-2-one, abbreviated as P and Z. To obtain the thermodynamic parameters needed to include P-Z pairs in the designs, we performed 47 optical melting experiments and combined the results with previous work to fit free energy and enthalpy nearest neighbor folding parameters for P-Z pairs and G-Z wobble pairs. We find G-Z pairs have stability comparable to that of A-T pairs and should therefore be included as base pairs in structure prediction and design algorithms. Additionally, we extrapolated the set of loop, terminal mismatch, and dangling end parameters to include the P and Z nucleotides. These parameters were incorporated into the RNAstructure software package for secondary structure prediction and analysis. Using the RNAstructure Design program, we solved 99 of the 100 design problems posed by Eterna using the ACGT alphabet or supplementing it with P-Z pairs. Extending the alphabet reduced the propensity of sequences to fold into off-target structures, as evaluated by the normalized ensemble defect (NED). The NED values were improved relative to those from the Eterna example solutions in 91 of 99 cases in which Eterna-player solutions were provided. P-Z-containing designs had average NED values of 0.040, significantly below the 0.074 of standard-DNA-only designs, and inclusion of the P-Z pairs decreased the time needed to converge on a design. This work provides a sample pipeline for inclusion of any expanded alphabet nucleotides into prediction and design workflows.
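The normalized ensemble defect used in this abstract to score designs can be sketched as follows. The sketch takes NED to be the average, over nucleotides, of the probability that a nucleotide is not in its target state (paired to its target partner, or unpaired). The input format, a dictionary of base pair probabilities, is an assumption of this example and is not the RNAstructure interface.

```python
def normalized_ensemble_defect(pair_probs, target, n):
    """Average probability that a nucleotide is NOT in its target state.

    pair_probs: dict mapping (i, j) with i < j (0-based) to the
        equilibrium probability of that base pair (assumed input format).
    target: list of length n; target[i] is the partner index of
        nucleotide i in the target structure, or None if unpaired.
    """
    # Total pairing probability of each nucleotide, so that
    # 1 - paired_total[i] is the probability that i is unpaired.
    paired_total = [0.0] * n
    for (i, j), p in pair_probs.items():
        paired_total[i] += p
        paired_total[j] += p

    correct = 0.0
    for i in range(n):
        partner = target[i]
        if partner is None:
            correct += 1.0 - paired_total[i]
        else:
            key = (min(i, partner), max(i, partner))
            correct += pair_probs.get(key, 0.0)
    return 1.0 - correct / n
```

A value of 0 means every nucleotide adopts its target state with probability 1; larger values indicate a greater propensity to fold into off-target structures.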
Affiliation(s)
- Tuan M. Pham
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, United States
- Terrel Miffin
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
- Hongying Sun
  - Department of Surgery, University of Rochester Medical Center, Rochester, New York 14642, United States
- Kenneth K. Sharp
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
- Xiaoyu Wang
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
- Mingyi Zhu
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, United States
- Shuichi Hoshika
  - Foundation for Applied Molecular Evolution, Alachua, Florida 32615, United States
- Steven A. Benner
  - Foundation for Applied Molecular Evolution, Alachua, Florida 32615, United States
- Jason D. Kahn
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
- David H. Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, United States
7
Pham TM, Miffin T, Sun H, Sharp KK, Wang X, Zhu M, Hoshika S, Peterson RJ, Benner SA, Kahn JD, Mathews DH. DNA Structure Design Is Improved Using an Artificially Expanded Alphabet of Base Pairs Including Loop and Mismatch Thermodynamic Parameters. bioRxiv 2023:2023.06.06.543917. [PMID: 37333404 PMCID: PMC10274641 DOI: 10.1101/2023.06.06.543917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
We show that in silico design of DNA secondary structures is improved by extending the base pairing alphabet beyond A-T and G-C to include the pair between 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo-[1,2-a]-1,3,5-triazin-(8H)-4-one and 6-amino-3-(1'-β-D-2'-deoxyribofuranosyl)-5-nitro-(1H)-pyridin-2-one, simply P and Z. To obtain the thermodynamic parameters needed to include P-Z pairs in the designs, we performed 47 optical melting experiments and combined the results with previous work to fit a new set of free energy and enthalpy nearest neighbor folding parameters for P-Z pairs and G-Z wobble pairs. We find that G-Z pairs have stability comparable to A-T pairs and therefore should be considered quantitatively by structure prediction and design algorithms. Additionally, we extrapolated the set of loop, terminal mismatch, and dangling end parameters to include P and Z nucleotides. These parameters were incorporated into the RNAstructure software package for secondary structure prediction and analysis. Using the RNAstructure Design program, we solved 99 of the 100 design problems posed by Eterna using the ACGT alphabet or supplementing with P-Z pairs. Extending the alphabet reduced the propensity of sequences to fold into off-target structures, as evaluated by the normalized ensemble defect (NED). The NED values were improved relative to those from the Eterna example solutions in 91 of 99 cases where Eterna-player solutions were provided. P-Z-containing designs had average NED values of 0.040, significantly below the 0.074 of standard-DNA-only designs, and inclusion of the P-Z pairs decreased the time needed to converge on a design. This work provides a sample pipeline for inclusion of any expanded alphabet nucleotides into prediction and design workflows.
Affiliation(s)
- Tuan M. Pham
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY
- Terrel Miffin
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, MD
- Hongying Sun
  - Department of Surgery, University of Rochester Medical Center, Rochester, NY
- Kenneth K. Sharp
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, MD
- Xiaoyu Wang
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, MD
- Mingyi Zhu
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY
- Jason D. Kahn
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, MD
- David H. Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY
8
Poppleton E, Urbanek N, Chakraborty T, Griffo A, Monari L, Göpfrich K. RNA origami: design, simulation and application. RNA Biol 2023; 20:510-524. [PMID: 37498217 PMCID: PMC10376919 DOI: 10.1080/15476286.2023.2237719] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Revised: 06/20/2023] [Accepted: 07/12/2023] [Indexed: 07/28/2023] Open
Abstract
Design strategies for DNA and RNA nanostructures have developed along parallel lines for the past 30 years, from small structural motifs derived from biology to large 'origami' structures with thousands to tens of thousands of bases. With the recent publication of numerous RNA origami structures and improved design methods, which even permit co-transcriptional folding of kilobase-sized structures, the RNA nanotechnology field is at an inflection point. Here, we review the key achievements which inspired and enabled RNA origami design and draw comparisons with the development and applications of DNA origami structures. We further present the available computational tools for design and simulation, which will be key to the growth of the RNA origami community. Finally, we portray the transition from RNA origami structure to function. Several functional RNA origami structures exist already; their expression in cells has been demonstrated, and first applications in cell biology have already been realized. Overall, we foresee that the fast-paced RNA origami field will provide new molecular hardware for biophysics, synthetic biology and biomedicine, complementing the DNA origami toolbox.
Affiliation(s)
- Erik Poppleton
  - Biophysical Engineering Group, Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg University, Heidelberg, Germany
  - Biophysical Engineering Group, Max Planck Institute for Medical Research, Heidelberg, Germany
  - Molecular Biomechanics, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Niklas Urbanek
  - Biophysical Engineering Group, Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg University, Heidelberg, Germany
  - Biophysical Engineering Group, Max Planck Institute for Medical Research, Heidelberg, Germany
- Taniya Chakraborty
  - Biophysical Engineering Group, Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg University, Heidelberg, Germany
  - Biophysical Engineering Group, Max Planck Institute for Medical Research, Heidelberg, Germany
- Alessandra Griffo
  - Biophysical Engineering Group, Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg University, Heidelberg, Germany
  - Biophysical Engineering Group, Max Planck Institute for Medical Research, Heidelberg, Germany
- Luca Monari
  - Biophysical Engineering Group, Max Planck Institute for Medical Research, Heidelberg, Germany
  - Institut de Science et d’Ingénierie Supramoléculaires (ISIS), Université de Strasbourg, Strasbourg, France
- Kerstin Göpfrich
  - Biophysical Engineering Group, Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg University, Heidelberg, Germany
  - Biophysical Engineering Group, Max Planck Institute for Medical Research, Heidelberg, Germany
9
Rolband L, Beasock D, Wang Y, Shu YG, Dinman JD, Schlick T, Zhou Y, Kieft JS, Chen SJ, Bussi G, Oukhaled A, Gao X, Šulc P, Binzel D, Bhullar AS, Liang C, Guo P, Afonin KA. Biomotors, viral assembly, and RNA nanobiotechnology: Current achievements and future directions. Comput Struct Biotechnol J 2022; 20:6120-6137. [PMID: 36420155 PMCID: PMC9672130 DOI: 10.1016/j.csbj.2022.11.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/01/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 11/13/2022] Open
Abstract
The International Society of RNA Nanotechnology and Nanomedicine (ISRNN) serves to further the development of a wide variety of functional nucleic acids and other related nanotechnology platforms. To aid in the dissemination of the most recent advancements, a biennial discussion focused on biomotors, viral assembly, and RNA nanobiotechnology has been established where international experts in interdisciplinary fields such as structural biology, biophysical chemistry, nanotechnology, cell and cancer biology, and pharmacology share their latest accomplishments and future perspectives. The results summarized here highlight advancements in our understanding of viral biology and the structure-function relationship of frame-shifting elements in genomic viral RNA, improvements in the predictions of SHAPE analysis of 3D RNA structures, and the understanding of dynamic RNA structures through a variety of experimental and computational means. Additionally, recent advances in drug delivery, vaccine design, nanopore technologies, biomotor and biomachine development, DNA packaging, and RNA nanotechnology are included in this critical review. We emphasize some of the novel accomplishments and major discussion topics, and present current challenges and perspectives of these emerging fields.
Affiliation(s)
- Lewis Rolband
  - University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Damian Beasock
  - University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Yang Wang
  - Wenzhou Institute, University of Chinese Academy of Sciences, 1st, Jinlian Road, Longwan District, Wenzhou, Zhejiang 325001, China
- Yao-Gen Shu
  - Wenzhou Institute, University of Chinese Academy of Sciences, 1st, Jinlian Road, Longwan District, Wenzhou, Zhejiang 325001, China
- Tamar Schlick
  - New York University, Department of Chemistry and Courant Institute of Mathematical Sciences, Simons Center for Computational Physical Chemistry, New York, NY 10012, USA
- Yaoqi Zhou
  - Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong 518107, China
- Jeffrey S. Kieft
  - University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Shi-Jie Chen
  - University of Missouri at Columbia, Columbia, MO 65211, USA
- Giovanni Bussi
  - Scuola Internazionale Superiore di Studi Avanzati, via Bonomea 265, 34136 Trieste, Italy
- Xingfa Gao
  - National Center for Nanoscience and Technology of China, Beijing 100190, China
- Petr Šulc
  - Arizona State University, Tempe, AZ, USA
- Chenxi Liang
  - The Ohio State University, Columbus, OH 43210, USA
- Peixuan Guo
  - The Ohio State University, Columbus, OH 43210, USA
- Kirill A. Afonin
  - University of North Carolina at Charlotte, Charlotte, NC 28223, USA
10
Szabat M, Prochota M, Kierzek R, Kierzek E, Mathews DH. A Test and Refinement of Folding Free Energy Nearest Neighbor Parameters for RNA Including N6-Methyladenosine. J Mol Biol 2022; 434:167632. [PMID: 35588868 PMCID: PMC11235186 DOI: 10.1016/j.jmb.2022.167632] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/01/2022] [Revised: 04/29/2022] [Accepted: 05/07/2022] [Indexed: 12/26/2022]
Abstract
RNA folding free energy change parameters are widely used to predict RNA secondary structure and to design RNA sequences. These parameters include terms for the folding free energies of helices and loops. Although the full set of parameters has only been traditionally available for the four common bases and backbone, it is well known that covalent modifications of nucleotides are widespread in natural RNAs. Covalent modifications are also widely used in engineered sequences. We recently derived a full set of nearest neighbor terms for RNA that includes N6-methyladenosine (m6A). In this work, we test the model using 98 optical melting experiments, matching duplexes with or without N6-methylation of A. Most experiments place RRACH, the consensus site of N6-methylation, in a variety of contexts, including helices, bulge loops, internal loops, dangling ends, and terminal mismatches. For matched sets of experiments that include either A or m6A in the same context, we find that the parameters for m6A are as accurate as those for A. Across all experiments, the root mean squared deviation between estimated and experimental free energy changes is 0.67 kcal/mol. We used the new experimental data to refine the set of nearest neighbor parameter terms for m6A. These parameters enable prediction of RNA secondary structures including m6A, which can be used to model how N6-methylation of A affects RNA structure.
Affiliation(s)
- Marta Szabat
  - Institute of Bioorganic Chemistry Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Martina Prochota
  - Institute of Bioorganic Chemistry Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Ryszard Kierzek
  - Institute of Bioorganic Chemistry Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Elzbieta Kierzek
  - Institute of Bioorganic Chemistry Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- David H Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, 601 Elmwood Avenue, Box 712, School of Medicine and Dentistry, University of Rochester, Rochester, NY 14642, United States
11
Zuber J, Schroeder SJ, Sun H, Turner DH, Mathews DH. Nearest neighbor rules for RNA helix folding thermodynamics: improved end effects. Nucleic Acids Res 2022; 50:5251-5262. [PMID: 35524574 PMCID: PMC9122537 DOI: 10.1093/nar/gkac261] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 12/23/2021] [Revised: 03/29/2022] [Accepted: 04/08/2022] [Indexed: 12/26/2022] Open
Abstract
Nearest neighbor parameters for estimating the folding stability of RNA secondary structures are in widespread use. For helices, current parameters penalize terminal AU base pairs relative to terminal GC base pairs. We curated an expanded database of helix stabilities determined by optical melting experiments. Analysis of the updated database shows that terminal penalties depend on the sequence identity of the adjacent penultimate base pair. New nearest neighbor parameters that include this additional sequence dependence accurately predict the measured values of 271 helices in an updated database with a correlation coefficient of 0.982. This refined understanding of helix ends facilitates fitting terms for base pair stacks with GU pairs. Prior parameter sets treated 5′GGUC3′ paired to 3′CUGG5′ separately from other 5′GU3′/3′UG5′ stacks. The improved understanding of helix end stability, however, makes the separate treatment unnecessary. Introduction of the additional terms was tested with three optical melting experiments. The average absolute difference between measured and predicted free energy changes at 37°C for these three duplexes containing terminal adjacent AU and GU pairs improved from 1.38 to 0.27 kcal/mol. This confirms the need for the additional sequence dependence in the model.
Affiliation(s)
- Jeffrey Zuber
  - Alnylam Pharmaceuticals, Inc., Cambridge, MA 02142, USA
- Susan J Schroeder
  - Department of Chemistry and Biochemistry, and Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK 73019, USA
- Hongying Sun
  - Department of Biochemistry & Biophysics, University of Rochester, Rochester, NY 14642, USA
  - Center for RNA Biology, University of Rochester, Rochester, NY 14642, USA
- Douglas H Turner
  - Center for RNA Biology, University of Rochester, Rochester, NY 14642, USA
  - Department of Chemistry, University of Rochester, Rochester, NY 14627, USA
- David H Mathews
  - Department of Biochemistry & Biophysics, University of Rochester, Rochester, NY 14642, USA
  - Center for RNA Biology, University of Rochester, Rochester, NY 14642, USA
  - Department of Biostatistics & Computational Biology, University of Rochester, Rochester, NY 14642, USA
12
Cheng Y, Zhang S, Xu X, Chen SJ. Vfold2D-MC: A Physics-Based Hybrid Model for Predicting RNA Secondary Structure Folding. J Phys Chem B 2021; 125:10108-10118. [PMID: 34473508 DOI: 10.1021/acs.jpcb.1c04731] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
Accurate prediction of RNA structure and folding stability has a far-reaching impact on our understanding of RNA functions. Here we develop Vfold2D-MC, a new physics-based model, to predict RNA structure and folding thermodynamics from the sequence. The model employs virtual bond-based coarse-graining of RNA backbone conformation and generates RNA conformations through Monte Carlo sampling of the bond angles and torsional angles of the virtual bonds. Using a coarse-grained statistical potential derived from the known structures, we assign each conformation a statistical weight. The weighted average over the conformational ensemble gives the entropy and free energy parameters for hairpin, bulge, and internal loops, and for multiway junctions. From the thermodynamic parameters, we predict RNA structures, melting curves, and structural changes from the sequence. Theory-experiment comparisons indicate that Vfold2D-MC not only gives improved structure predictions but also enables the interpretation of thermodynamic results for different RNA structures, including multibranched junctions. This new model sets a promising framework to treat more complicated RNA structures, such as pseudoknotted and intramolecular kissing loops, for which experimental thermodynamic parameters are often unavailable.
Affiliation(s)
- Yi Cheng
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
- Sicheng Zhang
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
13
Zhao Q, Zhao Z, Fan X, Yuan Z, Mao Q, Yao Y. Review of machine learning methods for RNA secondary structure prediction. PLoS Comput Biol 2021; 17:e1009291. [PMID: 34437528 PMCID: PMC8389396 DOI: 10.1371/journal.pcbi.1009291] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Indexed: 12/05/2022] Open
Abstract
Secondary structure plays an important role in determining the function of noncoding RNAs. Hence, identifying RNA secondary structures is of great value to research. Computational prediction is a mainstream approach for predicting RNA secondary structure. Unfortunately, even though new methods have been proposed over the past 40 years, the performance of computational prediction methods has stagnated in the last decade. Recently, with the increasing availability of RNA structure data, new methods based on machine learning (ML) technologies, especially deep learning, have alleviated the issue. In this review, we provide a comprehensive overview of RNA secondary structure prediction methods based on ML technologies and a tabularized summary of the most important methods in this field. The current pending challenges in the field of RNA secondary structure prediction and future trends are also discussed.
Affiliation(s)
- Qi Zhao
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning, China
| | - Zheng Zhao
- School of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning, China
| | - Xiaoya Fan
- School of Software, Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian University of Technology, Dalian, Liaoning, China
| | - Zhengwei Yuan
- Key Laboratory of Health Ministry for Congenital Malformation, Shengjing Hospital of China Medical University, Shenyang, Liaoning, China
| | - Qian Mao
- College of Light Industry, Liaoning University, Shenyang, Liaoning, China
- Key Laboratory of Agroproducts Processing Technology, Changchun University, Changchun, Jilin, China
| | - Yudong Yao
- Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, New Jersey, United States of America
| |
|
14
|
Sugimoto N, Endoh T, Takahashi S, Tateishi-Karimata H. Chemical Biology of Double Helical and Non-Double Helical Nucleic Acids: “To B or Not To B, That Is the Question”. Bulletin of the Chemical Society of Japan 2021. [DOI: 10.1246/bcsj.20210131] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/25/2022]
Affiliation(s)
- Naoki Sugimoto
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 17-1-20 Minatojima-minamimachi, Kobe, Hyogo 650-0047, Japan
- Graduate School of Frontiers of Innovative Research in Science and Technology (FIRST), Konan University, 17-1-20 Minatojima-minamimachi, Kobe, Hyogo 650-0047, Japan
| | - Tamaki Endoh
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 17-1-20 Minatojima-minamimachi, Kobe, Hyogo 650-0047, Japan
| | - Shuntaro Takahashi
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 17-1-20 Minatojima-minamimachi, Kobe, Hyogo 650-0047, Japan
| | - Hisae Tateishi-Karimata
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 17-1-20 Minatojima-minamimachi, Kobe, Hyogo 650-0047, Japan
| |
|
15
|
Inverse RNA Folding Workflow to Design and Test Ribozymes that Include Pseudoknots. Methods Mol Biol 2021; 2167:113-143. [PMID: 32712918 DOI: 10.1007/978-1-0716-0716-9_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
Ribozymes are RNAs that catalyze reactions. They occur in nature, and can also be evolved in vitro to catalyze novel reactions. This chapter provides detailed protocols for using inverse folding software to design a ribozyme sequence that will fold to a known ribozyme secondary structure and for testing the catalytic activity of the sequence experimentally. This protocol is able to design sequences that include pseudoknots, which is important as all naturally occurring full-length ribozymes have pseudoknots. The starting point is the known pseudoknot-containing secondary structure of the ribozyme and knowledge of any nucleotides whose identity is required for function. The output of the protocol is a set of sequences that have been tested for function. Using this protocol, we were previously successful at designing highly active double-pseudoknotted HDV ribozymes.
|
16
|
Hurst T, Chen SJ. Deciphering nucleotide modification-induced structure and stability changes. RNA Biol 2021; 18:1920-1930. [PMID: 33586616 DOI: 10.1080/15476286.2021.1882179] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/14/2022] Open
Abstract
Nucleotide modification in RNA controls a bevy of biological processes, including RNA degradation, gene expression, and gene editing. In turn, misregulation of modified nucleotides is associated with a host of chronic diseases and disorders. However, the molecular mechanisms driving these processes remain poorly understood. To partially address this knowledge gap, we used alchemical and temperature replica exchange molecular dynamics (TREMD) simulations on an RNA duplex and an analogous hairpin to probe the structural effects of modified and/or mutant nucleotides. The simulations successfully predict the modification/mutation-induced relative free energy change for complementary duplex formation, and structural analyses highlight mechanisms driving stability changes. Furthermore, TREMD simulations for a hairpin-forming RNA with and without modification provide reliable estimations of the energy landscape. Illuminating the impact of methylated and/or mutated nucleotides on the structure-function relationship and the folding energy landscape, the simulations provide insights into modification-induced alterations to the folding mechanics of the hairpin. The results here may be biologically significant as hairpins are widespread structure motifs that play critical roles in gene expression and regulation. Specifically, the tetraloop of the probed hairpin is phylogenetically abundant, and the stem mirrors a miRNA seed region whose modification has been implicated in epilepsy pathogenesis.
Affiliation(s)
- Travis Hurst
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Shi-Jie Chen
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| |
|
17
|
Ward M, Sun H, Datta A, Wise M, Mathews DH. Determining parameters for non-linear models of multi-loop free energy change. Bioinformatics 2020; 35:4298-4306. [PMID: 30923811 DOI: 10.1093/bioinformatics/btz222] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/24/2018] [Revised: 02/10/2019] [Accepted: 03/27/2019] [Indexed: 12/12/2022] Open
Abstract
MOTIVATION: Predicting the secondary structure of RNA is a fundamental task in bioinformatics. Algorithms that predict secondary structure given only the primary sequence, and a model to evaluate the quality of a structure, are an integral part of this. These algorithms have been updated as our model of RNA thermodynamics changed and expanded. An exception to this has been the treatment of multi-loops. Although more advanced models of multi-loop free energy change have been suggested, a simple, linear model has been used since the 1980s. However, recently, new dynamic programming algorithms for secondary structure prediction that could incorporate these models were presented. Unfortunately, these models appear to have lower accuracy for secondary structure prediction. RESULTS: We apply linear regression and a new parameter optimization algorithm to find better parameters for the existing linear model and advanced non-linear multi-loop models. These include the Jacobson-Stockmayer and Aalberts & Nandagopal models. We find that the current linear model parameters may be near optimal for the linear model, and that no advanced model performs better than the existing linear model parameters even after parameter optimization. AVAILABILITY AND IMPLEMENTATION: Source code and data are available at https://github.com/maxhwardg/advanced_multiloops. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
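The contrast between a linear multi-loop model and a non-linear alternative can be sketched as follows. The linear form (an offset plus per-branch and per-unpaired-nucleotide terms) is the standard one; the logarithmic variant is only an illustration in the spirit of Jacobson-Stockmayer scaling, with `ln(unpaired + 1)` chosen here as an assumption to keep the value defined for zero unpaired nucleotides. The function names and all parameter values are hypothetical and are not the parameters fitted in the paper.

```python
import math

def multiloop_linear(branches, unpaired, a, b, c):
    """Linear multi-loop free energy: a + b*branches + c*unpaired (kcal/mol)."""
    return a + b * branches + c * unpaired

def multiloop_log(branches, unpaired, a, b, c):
    """Illustrative non-linear variant: the unpaired term grows as ln(unpaired + 1)."""
    return a + b * branches + c * math.log(unpaired + 1)

# Hypothetical parameters, for illustration only.
a, b, c = 3.0, 0.5, 0.2
for unpaired in (0, 4, 16, 64):
    lin = multiloop_linear(3, unpaired, a, b, c)
    log = multiloop_log(3, unpaired, a, b, c)
    print(f"unpaired={unpaired:3d}  linear={lin:6.2f}  log={log:6.2f}")
```

The two forms agree for a loop with no unpaired nucleotides and diverge as the loop grows, which is where the choice of model affects predicted stabilities.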
Affiliation(s)
- Max Ward
- Computer Science & Software Engineering, The University of Western Australia, Crawley, WA, Australia
| | - Hongying Sun
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, NY, USA
- Center for RNA Biology, University of Rochester, Rochester, NY, USA
| | - Amitava Datta
- Computer Science & Software Engineering, The University of Western Australia, Crawley, WA, Australia
| | - Michael Wise
- Computer Science & Software Engineering, The University of Western Australia, Crawley, WA, Australia
- The Marshall Centre for Infectious Diseases Research and Training, The University of Western Australia, Crawley, WA, Australia
| | - David H Mathews
- Department of Biostatistics & Computational Biology, University of Rochester, Rochester, NY, USA
| |
|
18
|
Abstract
Several problems in RNA structure prediction are NP-hard, and predicting the folded structure of an RNA from its nucleotide sequence remains an unsolved challenge. We investigate an algorithm for RNA folding structure prediction, including pseudoknots, that is based on extended structures and the basin hopping graph. This study presents a prediction algorithm based on extended structures and proposes an improved algorithm based on the barrier tree and the basin hopping graph, both of which are attractive approaches to RNA folding structure prediction. Experiments on the Rfam 14.1 and PseudoBase databases show that the two algorithms are more efficient and accurate than other existing algorithms.
Affiliation(s)
- Zhendong Liu
- School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, P. R. China
- Department of Biostatistics, University of California, Los Angeles, Los Angeles 90095, USA
- Department of Statistics, Harvard University, Cambridge, MA 02138, USA
| | - Gang Li
- Department of Biostatistics, University of California, Los Angeles, Los Angeles 90095, USA
| | - Jun S. Liu
- Department of Statistics, Harvard University, Cambridge, MA 02138, USA
| |
|
19
|
Takahashi S, Sugimoto N. Stability prediction of canonical and non-canonical structures of nucleic acids in various molecular environments and cells. Chem Soc Rev 2020; 49:8439-8468. [DOI: 10.1039/d0cs00594k] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/13/2022]
Abstract
This review provides the biophysicochemical background and recent advances in stability prediction of canonical and non-canonical structures of nucleic acids in various molecular environments and cells.
Affiliation(s)
- Shuntaro Takahashi
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, Kobe, Japan
| | - Naoki Sugimoto
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, Kobe, Japan
- Graduate School of Frontiers of Innovative Research in Science and Technology (FIRST)
| |
|
20
|
Nishida S, Sakuraba S, Asai K, Hamada M. Estimating Energy Parameters for RNA Secondary Structure Predictions Using Both Experimental and Computational Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1645-1655. [PMID: 29994069 DOI: 10.1109/tcbb.2018.2813388] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
Computational RNA secondary structure prediction depends on a large number of nearest-neighbor free-energy parameters, including 10 parameters for Watson-Crick stacked base pairs that were estimated from experimental measurements of the free energies of 90 RNA duplexes. These experimental data are provided by time-consuming and cost-intensive experiments. Meanwhile, various modified nucleotides have been found in RNAs that affect not only their structures but also their functions, and rapid determination of energy parameters for such modified nucleotides is needed. To reduce the high cost of determining energy parameters, we propose a novel method to estimate energy parameters from both experimental and computational data, where the computational data are provided by a recently developed molecular dynamics simulation protocol. We evaluate our method for Watson-Crick stacked base pairs, and show that parameters estimated from 10 experimental data items and 10 computational data items can predict RNA secondary structures with accuracy comparable to that using conventional parameters. The results indicate that the combination of experimental free-energy measurements and molecular dynamics simulations is capable of estimating the thermodynamic properties of RNA secondary structures at lower cost.
|
21
|
Spasic A, Berger KD, Chen JL, Seetin MG, Turner DH, Mathews DH. Improving RNA nearest neighbor parameters for helices by going beyond the two-state model. Nucleic Acids Res 2019; 46:4883-4892. [PMID: 29718397 PMCID: PMC6007268 DOI: 10.1093/nar/gky270] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/13/2018] [Accepted: 04/22/2018] [Indexed: 12/31/2022] Open
Abstract
RNA folding free energy change nearest neighbor parameters are widely used to predict folding stabilities of secondary structures. They were determined by linear regression to datasets of optical melting experiments on small model systems. Traditionally, the optical melting experiments are analyzed assuming a two-state model, i.e. a structure is either complete or denatured. Experimental evidence, however, shows that structures exist in an ensemble of conformations. Partition functions calculated with existing nearest neighbor parameters predict that secondary structures can be partially denatured, which also directly conflicts with the two-state model. Here, a new approach for determining RNA nearest neighbor parameters is presented. Available optical melting data for 34 Watson–Crick helices were fit directly to a partition function model that allows an ensemble of conformations. Fitting parameters were the enthalpy and entropy changes for helix initiation, terminal AU pairs, stacks of Watson–Crick pairs and disordered internal loops. The resulting set of nearest neighbor parameters shows a 38.5% improvement in the sum of residuals in fitting the experimental melting curves compared to the current literature set.
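The two-state assumption mentioned in this abstract can be made concrete with a minimal sketch. For simplicity it assumes a unimolecular transition (for example a hairpin, rather than the bimolecular helices fitted in the paper) with temperature-independent enthalpy and entropy changes. The function names and the numeric values are illustrative placeholders, not parameters from the paper.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fraction_folded(temp_k, dh, ds):
    """Two-state fraction folded at temp_k (K).

    dh: folding enthalpy change in kcal/mol
    ds: folding entropy change in kcal/(mol*K)
    """
    dg = dh - temp_k * ds                  # folding free energy change
    k = math.exp(-dg / (R_KCAL * temp_k))  # equilibrium constant [folded]/[unfolded]
    return k / (1.0 + k)

def melting_temperature(dh, ds):
    """Temperature at which the folding free energy change is zero (half folded)."""
    return dh / ds

# Hypothetical thermodynamic values, for illustration only.
dh, ds = -40.0, -0.120
tm = melting_temperature(dh, ds)
for t in (300.0, tm, 370.0):
    print(f"T={t:7.2f} K  fraction folded={fraction_folded(t, dh, ds):.4f}")
```

In this picture every molecule is either fully folded or fully unfolded. A partition function model replaces the single equilibrium constant with a sum over all partially folded states, which is the extension the abstract describes.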
Affiliation(s)
- Aleksandar Spasic
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Kyle D Berger
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Jonathan L Chen
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Department of Chemistry, University of Rochester, Rochester, NY 14627, USA
| | - Matthew G Seetin
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Douglas H Turner
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Department of Chemistry, University of Rochester, Rochester, NY 14627, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| |
|
22
|
Spasic A, Assmann SM, Bevilacqua PC, Mathews DH. Modeling RNA secondary structure folding ensembles using SHAPE mapping data. Nucleic Acids Res 2019; 46:314-323. [PMID: 29177466 PMCID: PMC5758915 DOI: 10.1093/nar/gkx1057] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 05/10/2017] [Accepted: 10/30/2017] [Indexed: 12/22/2022] Open
Abstract
RNA secondary structure prediction is widely used for developing hypotheses about the structures of RNA sequences, and structure can provide insight about RNA function. The accuracy of structure prediction is known to be improved using experimental mapping data that provide information about the pairing status of single nucleotides, and these data can now be acquired for whole transcriptomes using high-throughput sequencing. Prior methods for using these experimental data focused on predicting structures for sequences assuming that they populate a single structure. Most RNAs populate multiple structures, however, where the ensemble of strands populates structures with different sets of canonical base pairs. The focus on modeling single structures has been a bottleneck for accurately modeling RNA structure. In this work, we introduce Rsample, an algorithm for using experimental data to predict more than one RNA structure for sequences that populate multiple structures at equilibrium. We demonstrate, using SHAPE mapping data, that we can accurately model RNA sequences that populate multiple structures, including the relative probabilities of those structures. This program is freely available as part of the RNAstructure software package.
Affiliation(s)
- Aleksandar Spasic
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Sarah M Assmann
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - Philip C Bevilacqua
- Department of Chemistry, Department of Biochemistry & Molecular Biology, Center for RNA Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| |
|
23
|
Steger G, Riesner D. Viroid research and its significance for RNA technology and basic biochemistry. Nucleic Acids Res 2019; 46:10563-10576. [PMID: 30304486 PMCID: PMC6237808 DOI: 10.1093/nar/gky903] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 07/26/2018] [Accepted: 09/24/2018] [Indexed: 12/27/2022] Open
Abstract
Viroids were described 47 years ago as the smallest RNA molecules capable of infecting plants and autonomously self-replicating without an encoded protein. Work on viroids initiated the development of a number of innovative methods. Novel chromatographic and gel electrophoretic methods were developed for the purification and characterization of viroids; these methods were later used in molecular biology, gene technology and in prion research. Theoretical and experimental studies of RNA folding demonstrated the general biological importance of metastable structures, and nuclear magnetic resonance spectroscopy of viroid RNA showed the partially covalent nature of hydrogen bonds in biological macromolecules. RNA biochemistry and molecular biology profited from viroid research, such as in the detection of RNA as template of DNA-dependent polymerases and in mechanisms of gene silencing. Viroids, the first circular RNA detected in nature, are important for studies on the much wider spectrum of circular RNAs and other non-coding RNAs.
Affiliation(s)
- Gerhard Steger
- Department of Biology, Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany
| | - Detlev Riesner
- Department of Biology, Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany
| |
|
24
|
Zuber J, Mathews DH. Estimating uncertainty in predicted folding free energy changes of RNA secondary structures. RNA 2019; 25:747-754. [PMID: 30952689 PMCID: PMC6521603 DOI: 10.1261/rna.069203.118] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/08/2018] [Accepted: 04/02/2019] [Indexed: 06/09/2023]
Abstract
Nearest neighbor parameters for estimating the folding stability of RNA are commonly used in secondary structure prediction, for generating folding ensembles of structures, and for analyzing RNA function. Previously, we demonstrated that we could quantify the uncertainties in each nearest neighbor parameter by perturbing the underlying optical melting data within experimental error and rederiving the parameters, which accounts for the substantial correlations that exist between the parameters. In this contribution, we describe a method to estimate uncertainty in the estimated folding stabilities of RNA structures, accounting for correlations in the nearest neighbor parameters. This method is incorporated in the RNAstructure software package.
Affiliation(s)
- Jeffrey Zuber
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York, 14642, USA
| | - David H Mathews
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York, 14642, USA
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York, 14642, USA
| |
|
25
|
Ponce-Salvatierra A, Astha, Merdas K, Nithin C, Ghosh P, Mukherjee S, Bujnicki JM. Computational modeling of RNA 3D structure based on experimental data. Biosci Rep 2019; 39:BSR20180430. [PMID: 30670629 PMCID: PMC6367127 DOI: 10.1042/bsr20180430] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 09/22/2018] [Revised: 01/19/2019] [Accepted: 01/21/2019] [Indexed: 01/02/2023] Open
Abstract
RNA molecules are master regulators of cells. They are involved in a variety of molecular processes: they transmit genetic information, sense cellular signals and communicate responses, and even catalyze chemical reactions. As in the case of proteins, RNA function is dictated by its structure and by its ability to adopt different conformations, which in turn is encoded in the sequence. Experimental determination of high-resolution RNA structures is both laborious and difficult, and therefore the majority of known RNAs remain structurally uncharacterized. To address this problem, predictive computational methods were developed based on the accumulated knowledge of RNA structures determined so far, the physical basis of RNA folding, and evolutionary considerations, such as conservation of functionally important motifs. However, all theoretical methods suffer from various limitations, and they are generally unable to accurately predict structures for RNA sequences longer than 100 nucleotides unless aided by additional experimental data. In this article, we review experimental methods that can generate data usable by computational methods, as well as computational approaches for RNA structure prediction that can utilize data from experimental analyses. We outline methods and data types that can be potentially useful for RNA 3D structure modeling but are not commonly used by the existing software, suggesting directions for future development.
Affiliation(s)
- Almudena Ponce-Salvatierra
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, Warsaw PL-02-109, Poland
| | - Astha
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, Warsaw PL-02-109, Poland
| | - Katarzyna Merdas
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, Warsaw PL-02-109, Poland
| | - Chandran Nithin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, Warsaw PL-02-109, Poland
| | - Pritha Ghosh
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, Warsaw PL-02-109, Poland
| | - Sunandan Mukherjee
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, Warsaw PL-02-109, Poland
| | - Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, Warsaw PL-02-109, Poland
- Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, Poznan PL-61-614, Poland
| |
|
26
|
Smith LG, Tan Z, Spasic A, Dutta D, Salas-Estrada LA, Grossfield A, Mathews DH. Chemically Accurate Relative Folding Stability of RNA Hairpins from Molecular Simulations. J Chem Theory Comput 2018; 14:6598-6612. [PMID: 30375860 DOI: 10.1021/acs.jctc.8b00633] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 01/21/2023]
Abstract
To benchmark RNA force fields, we compared the folding stabilities of three 12-nucleotide hairpin stem loops estimated by simulation to stabilities determined by experiment. We used umbrella sampling and a reaction coordinate of end-to-end (5' to 3' hydroxyl oxygen) distance to estimate the free energy change of the transition from the native conformation to a fully extended conformation with no hydrogen bonds between non-neighboring bases. Each simulation was performed four times using the AMBER FF99+bsc0+χOL3 force field, and each window, spaced at 1 Å intervals, was sampled for 1 μs, for a total of 552 μs of simulation. We compared differences in the simulated free energy changes to analogous differences in free energies from optical melting experiments using thermodynamic cycles where the free energy change between stretched and random coil sequences is assumed to be sequence-independent. The differences between experimental and simulated ΔΔG° are, on average, 0.98 ± 0.66 kcal/mol, which is chemically accurate and suggests that analogous simulations could be used predictively. We also report a novel method to identify where replica free energies diverge along a reaction coordinate, thus indicating where additional sampling would most improve convergence. We conclude by discussing methods to more economically perform these simulations.
Affiliation(s)
- Louis G Smith
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, New York 14642, United States
- Center for RNA Biology, University of Rochester, Rochester, New York 14642, United States
| | - Zhen Tan
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, New York 14642, United States
- Center for RNA Biology, University of Rochester, Rochester, New York 14642, United States
| | - Aleksandar Spasic
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, New York 14642, United States
- Center for RNA Biology, University of Rochester, Rochester, New York 14642, United States
| | - Debapratim Dutta
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, New York 14642, United States
- Center for RNA Biology, University of Rochester, Rochester, New York 14642, United States
| | - Leslie A Salas-Estrada
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, New York 14642, United States
| | - Alan Grossfield
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, New York 14642, United States
| | - David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester, Rochester, New York 14642, United States
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York 14642, United States
- Center for RNA Biology, University of Rochester, Rochester, New York 14642, United States
| |
|
27
|
Zuber J, Cabral BJ, McFadyen I, Mauger DM, Mathews DH. Analysis of RNA nearest neighbor parameters reveals interdependencies and quantifies the uncertainty in RNA secondary structure prediction. RNA 2018; 24:1568-1582. [PMID: 30104207 PMCID: PMC6191722 DOI: 10.1261/rna.065102.117] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 11/28/2017] [Accepted: 08/07/2018] [Indexed: 05/08/2023]
Abstract
RNA secondary structure prediction is often used to develop hypotheses about structure-function relationships for newly discovered RNA sequences, to identify unknown functional RNAs, and to design sequences. Secondary structure prediction methods typically use a thermodynamic model that estimates the free energy change of possible structures based on a set of nearest neighbor parameters. These parameters were derived from optical melting experiments of small model oligonucleotides. This work aims to better understand the precision of structure prediction. Here, the experimental errors in optical melting experiments were propagated to errors in the derived nearest neighbor parameter values and then to errors in RNA secondary structure prediction. To perform this analysis, the optical melting experimental values were systematically perturbed within the estimates of experimental error and alternative sets of nearest neighbor parameters were then derived from these error-bounded values. Secondary structure predictions using either the perturbed or reference parameter sets were then compared. This work demonstrated that the precision of RNA secondary structure prediction is more robust than suggested by previous work based on perturbation of the nearest neighbor parameters. This robustness is due to correlations between parameters. Additionally, this work identified weaknesses in the parameter derivation that make accurate assessment of parameter uncertainty difficult. Considerations for experimental design to mitigate these weaknesses are provided.
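The perturb-and-rederive procedure described in this abstract can be sketched with a toy model. This is only an illustration under stated assumptions: a two-parameter linear model stands in for the full nearest neighbor parameter set, the measurement errors are taken as independent Gaussians with a single `sigma`, and the data, function names, and values are hypothetical rather than taken from the paper.

```python
import random
import statistics

def fit_two_params(counts, values):
    """Least-squares fit of values ~ p0*n0 + p1*n1 using 2x2 normal equations."""
    s00 = sum(n0 * n0 for n0, _ in counts)
    s01 = sum(n0 * n1 for n0, n1 in counts)
    s11 = sum(n1 * n1 for _, n1 in counts)
    t0 = sum(n0 * v for (n0, _), v in zip(counts, values))
    t1 = sum(n1 * v for (_, n1), v in zip(counts, values))
    det = s00 * s11 - s01 * s01
    return ((s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det)

def perturbed_parameter_sets(counts, values, sigma, n_trials, seed=0):
    """Perturb the measured values within error and rederive the parameters each time."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_trials):
        noisy = [v + rng.gauss(0.0, sigma) for v in values]
        sets.append(fit_two_params(counts, noisy))
    return sets

def predicted_spread(param_sets, target_counts):
    """Standard deviation of the predicted free energy across rederived parameter sets."""
    preds = [p0 * target_counts[0] + p1 * target_counts[1] for p0, p1 in param_sets]
    return statistics.stdev(preds)

# Hypothetical "experiments": motif counts per duplex and measured free energies.
counts = [(3, 1), (2, 2), (1, 3), (4, 2), (2, 4)]
values = [-7.0, -6.0, -5.0, -10.0, -8.0]
param_sets = perturbed_parameter_sets(counts, values, sigma=0.2, n_trials=200)
spread = predicted_spread(param_sets, target_counts=(5, 5))
```

Because each rederived set keeps the correlations between its parameters, the spread of the prediction is computed from whole parameter sets rather than from independently perturbed parameters, which is the distinction the abstract draws.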
Affiliation(s)
- Jeffrey Zuber
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
| | - B Joseph Cabral
- Computational Sciences, Moderna Therapeutics, Cambridge, Massachusetts 02141, USA
| | - Iain McFadyen
- Computational Sciences, Moderna Therapeutics, Cambridge, Massachusetts 02141, USA
| | - David M Mauger
- Computational Sciences, Moderna Therapeutics, Cambridge, Massachusetts 02141, USA
| | - David H Mathews
- Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York 14642, USA
| |
|
28
|
Liu Z, Zhu D, Dai Q. Predicting Model and Algorithm in RNA Folding Structure Including Pseudoknots. Int J Pattern Recogn 2018. [DOI: 10.1142/s0218001418510059] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
The prediction of RNA structure with pseudoknots is a nondeterministic polynomial-time hard (NP-hard) problem. Using minimum free energy models and computational methods, we investigate RNA pseudoknotted structure. Our paper presents an efficient algorithm for predicting RNA structure with pseudoknots that takes O([Formula: see text]) time and O([Formula: see text]) space. Experimental tests on Rfam 10.1 and PseudoBase indicate that the algorithm is more effective and precise: its prediction accuracy, time complexity and space complexity outperform those of existing algorithms such as the Maximum Weight Matching (MWM), PKNOTS and Iterated Loop Matching (ILM) algorithms, and it can predict arbitrary pseudoknots. There also exists a [Formula: see text] ([Formula: see text]) polynomial time approximation scheme for finding the maximum number of stackings, and we give a proof of this approximation scheme for RNA pseudoknotted structure. We have improved several types of pseudoknots considered in RNA folding structure, and we analyze the possible transitions between types of pseudoknots.
Affiliation(s)
- Zhendong Liu
- School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, P. R. China
| | - Daming Zhu
- School of Computer Science and Technology, Shandong University, Jinan 250101, P. R. China
| | - Qionghai Dai
- Department of Automation, Tsinghua University, Beijing 100084, P. R. China
| |
|
29
|
Ward M, Datta A, Wise M, Mathews DH. Advanced multi-loop algorithms for RNA secondary structure prediction reveal that the simplest model is best. Nucleic Acids Res 2017; 45:8541-8550. [PMID: 28586479 PMCID: PMC5737859 DOI: 10.1093/nar/gkx512] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 01/16/2017] [Accepted: 05/31/2017] [Indexed: 01/08/2023] Open
Abstract
Algorithmic prediction of RNA secondary structure has been an area of active inquiry since the 1970s. Despite many innovations since then, our best techniques are not yet perfect. The workhorses of the RNA secondary structure prediction engine are recursions first described by Zuker and Stiegler in 1981. These have well understood caveats; a notable flaw is the ad-hoc treatment of multi-loops, also called helical-junctions, that persists today. While several advanced models for multi-loops have been proposed, it seems to have been assumed that incorporating them into the recursions would lead to intractability, and so no algorithms for these models exist. Some of these models include the classical model based on Jacobson–Stockmayer polymer theory, and another by Aalberts and Nandagopal that incorporates two-length-scale polymer physics. We have realized practical, tractable algorithms for each of these models. However, after implementing these algorithms, we found that no advanced model was better than the original, ad-hoc model used for multi-loops. While this is unexpected, it supports the praxis of the current model.
Affiliation(s)
- Max Ward
- Computer Science & Software Engineering, The University of Western Australia, Australia
| | - Amitava Datta
- Computer Science & Software Engineering, The University of Western Australia, Australia
| | - Michael Wise
- Computer Science & Software Engineering, The University of Western Australia, Australia; The Marshall Centre for Infectious Diseases Research and Training, The University of Western Australia, Australia
| | - David H Mathews
- Department of Biochemistry & Biophysics, Department of Biostatistics & Computational Biology, and Center for RNA Biology, University of Rochester, NY, USA
| |
|
30
|
Zuber J, Sun H, Zhang X, McFadyen I, Mathews DH. A sensitivity analysis of RNA folding nearest neighbor parameters identifies a subset of free energy parameters with the greatest impact on RNA secondary structure prediction. Nucleic Acids Res 2017; 45:6168-6176. [PMID: 28334976 PMCID: PMC5449625 DOI: 10.1093/nar/gkx170] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 10/16/2016] [Accepted: 03/10/2017] [Indexed: 01/02/2023] Open
Abstract
Nearest neighbor parameters for estimating the folding energy changes of RNA secondary structures are used in structure prediction and analysis. Despite their widespread application, a comprehensive analysis of the impact of each parameter on the precision of calculations had not been conducted. To identify the parameters with greatest impact, a sensitivity analysis was performed on the 291 parameters that compose the 2004 version of the free energy nearest neighbor rules. Perturbed parameter sets were generated by perturbing each parameter independently. Then the effect of each individual parameter change on predicted base-pair probabilities and secondary structures as compared to the standard parameter set was observed for a set of sequences including structured ncRNA, mRNA and randomized sequences. The results identify for the first time the parameters with the greatest impact on secondary structure prediction, and the subset which should be prioritized for further study in order to improve the precision of structure prediction. Bulge loop initiation, multibranch loop initiation, AU/GU internal loop closure and AU/GU helix end parameters were particularly important. An analysis of parameter usage during folding free energy calculations of stochastic samples of secondary structures revealed a correlation between parameter usage and impact on structure prediction precision.
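The one-parameter-at-a-time perturbation described in this abstract can be sketched on a toy ensemble. Everything below is an illustrative assumption rather than the published procedure: the three parameters, their values, the usage counts, and the perturbation size are made up, and a handful of hand-written candidate structures stands in for a real folding ensemble. Each parameter is shifted independently, Boltzmann probabilities are recomputed, and the change from the reference distribution is recorded.

```python
import numpy as np

RT = 0.616  # kcal/mol at 37 °C (gas constant times 310.15 K)

def structure_probabilities(counts, params):
    """Boltzmann probabilities of candidate structures under a linear energy model."""
    energies = counts @ params
    weights = np.exp(-(energies - energies.min()) / RT)  # shift for numerical stability
    return weights / weights.sum()

def parameter_sensitivity(counts, params, delta=0.5):
    """Perturb each parameter independently and measure the shift in probabilities."""
    reference = structure_probabilities(counts, params)
    impact = np.zeros(len(params))
    for k in range(len(params)):
        perturbed = params.copy()
        perturbed[k] += delta
        shifted = structure_probabilities(counts, perturbed)
        # Total variation distance between perturbed and reference distributions.
        impact[k] = 0.5 * np.abs(shifted - reference).sum()
    return impact

# Hypothetical usage counts: rows are candidate structures, columns are parameters.
# The third parameter is used once by every structure.
counts = np.array([[3, 1, 1], [4, 2, 1], [2, 0, 1], [5, 2, 1]], dtype=float)
params = np.array([-2.0, 3.4, 0.5])  # made-up values, kcal/mol

impact = parameter_sensitivity(counts, params)
ranking = np.argsort(impact)[::-1]   # parameter indices, most influential first
print(impact, ranking)
```

In this toy case the third parameter has no impact because every candidate structure uses it equally, so perturbing it shifts all energies by the same amount. How often and how unevenly a parameter is used across competing structures is what drives its impact here.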
Affiliation(s)
- Jeffrey Zuber
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Hongying Sun
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Xiaoju Zhang
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Iain McFadyen
- Computational Sciences, Moderna Therapeutics, Cambridge, MA 02141, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA; Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| |
|
31
|
Findeiß S, Etzel M, Will S, Mörl M, Stadler PF. Design of Artificial Riboswitches as Biosensors. Sensors 2017; 17:s17091990. [PMID: 28867802 PMCID: PMC5621056 DOI: 10.3390/s17091990] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 08/08/2017] [Revised: 08/23/2017] [Accepted: 08/25/2017] [Indexed: 12/11/2022]
Abstract
RNA aptamers readily recognize small organic molecules, polypeptides, as well as other nucleic acids in a highly specific manner. Many such aptamers have evolved as parts of regulatory systems in nature. Experimental selection techniques such as SELEX have been very successful in finding artificial aptamers for a wide variety of natural and synthetic ligands. Changes in structure and/or stability of aptamers upon ligand binding can propagate through larger RNA constructs and cause specific structural changes at distal positions. In turn, these may affect transcription, translation, splicing, or binding events. The RNA secondary structure model realistically describes both thermodynamic and kinetic aspects of RNA structure formation and refolding at a single, consistent level of modelling. Thus, this framework allows studying the function of natural riboswitches in silico. Moreover, it enables rationally designing artificial switches, combining essentially arbitrary sensors with a broad choice of read-out systems. Eventually, this approach sets the stage for constructing versatile biosensors.
Affiliation(s)
- Sven Findeiß
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany.
- Faculty of Computer Science, Research Group Bioinformatics and Computational Biology, University of Vienna, Währingerstraße 29, A-1090 Vienna, Austria.
- Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria.
| | - Maja Etzel
- Institute for Biochemistry, Leipzig University, Brüderstraße 34, 04103 Leipzig, Germany.
| | - Sebastian Will
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany.
- Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria.
- Institute for Biochemistry, Leipzig University, Brüderstraße 34, 04103 Leipzig, Germany.
| | - Mario Mörl
- Institute for Biochemistry, Leipzig University, Brüderstraße 34, 04103 Leipzig, Germany.
| | - Peter F Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany.
- Faculty of Chemistry, Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria.
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, 04103 Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany.
- Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, 04103 Leipzig, Germany.
- Center for RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg, Denmark.
- Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA.
| |
|
32
|
Choudhary K, Deng F, Aviran S. Comparative and integrative analysis of RNA structural profiling data: current practices and emerging questions. Quantitative Biology 2017; 5:3-24. [PMID: 28717530 PMCID: PMC5510538 DOI: 10.1007/s40484-017-0093-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/02/2016] [Revised: 12/08/2016] [Accepted: 12/15/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND Structure profiling experiments provide single-nucleotide information on RNA structure. Recent advances in chemistry combined with application of high-throughput sequencing have enabled structure profiling at transcriptome scale and in living cells, creating unprecedented opportunities for RNA biology. Propelled by these experimental advances, massive data with ever-increasing diversity and complexity have been generated, which give rise to new challenges in interpreting and analyzing these data. RESULTS We review current practices in analysis of structure profiling data with emphasis on comparative and integrative analysis as well as highlight emerging questions. Comparative analysis has revealed structural patterns across transcriptomes and has become an integral component of recent profiling studies. Additionally, profiling data can be integrated into traditional structure prediction algorithms to improve prediction accuracy. CONCLUSIONS To keep pace with experimental developments, methods to facilitate, enhance and refine such analyses are needed. Parallel advances in analysis methodology will complement profiling technologies and help them reach their full potential.
Affiliation(s)
- Sharon Aviran
- Department of Biomedical Engineering and Genome Center, University of California at Davis, Davis, CA 95616, USA
| |
|
33
|
Raposo AN, Gomes AJP. Computational 3D Assembling Methods for DNA: A Survey. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2016; 13:1068-1085. [PMID: 26701896 DOI: 10.1109/tcbb.2015.2510008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/05/2023]
Abstract
DNA encodes the genetic information of most living beings, except viruses that use RNA. Unlike other types of molecules, DNA is not usually described by its atomic structure but by its base-pair sequence, i.e., the textual sequence of its subsidiary molecules known as nucleotides (adenine (A), cytosine (C), guanine (G), and thymine (T)). The three-dimensional assembling of DNA molecules based on their base-pair sequences has been, for decades, a topic of interest for many research groups all over the world. In this paper, we survey the major methods found in the literature to assemble and visualize DNA molecules from their base-pair sequences. We divided these methods into three categories: predictive methods, adaptive methods, and thermodynamic methods. Predictive methods aim to predict a conformation of the DNA from its base-pair sequence, while the goal of adaptive methods is to assemble DNA base-pair sequences along previously known conformations, as needed in scenarios such as DNA Monte Carlo simulations. Unlike these two geometric methods, thermodynamic methods are energy-based and aim to predict secondary structural motifs of DNA in cases where hydrogen bonds between base pairs might be broken because of temperature changes. We also present the major software tools that implement predictive, adaptive, and thermodynamic methods.
|
34
|
Nainar S, Feng C, Spitale RC. Chemical Tools for Dissecting the Role of lncRNAs in Epigenetic Regulation. ACS Chem Biol 2016; 11:2091-100. [PMID: 27267401 PMCID: PMC5068361 DOI: 10.1021/acschembio.6b00366] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/11/2022]
Abstract
Proper control and maintenance of gene expression is critical for cellular identity and maintenance. Transcription of RNA from the genome is intimately controlled by post-translational chemical modification of histone tails and DNA. Recent studies have demonstrated that chromatin-remodeling complexes seek out their target genomic loci through the help of noncoding RNA molecules. Within this Review, we will outline how the use of biochemical techniques has shed light on the mechanisms employed by RNA to guide these complexes and therefore control gene expression.
Affiliation(s)
- Sarah Nainar
- Department of Pharmaceutical Sciences, University of California, Irvine. Irvine, California 92697, United States
| | - Chao Feng
- Department of Pharmaceutical Sciences, University of California, Irvine. Irvine, California 92697, United States
| | - Robert C. Spitale
- Department of Pharmaceutical Sciences, University of California, Irvine. Irvine, California 92697, United States
| |
|
35
|
Abstract
Deciphering the folding pathways and predicting the structures of complex three-dimensional biomolecules is central to elucidating biological function. RNA is single-stranded, which gives it the freedom to fold into complex secondary and tertiary structures. These structures endow RNA with the ability to perform complex chemistries and functions ranging from enzymatic activity to gene regulation. Given that RNA is involved in many essential cellular processes, it is critical to understand how it folds and functions in vivo. Within the last few years, methods have been developed to probe RNA structures in vivo and genome-wide. These studies reveal that RNA often adopts very different structures in vivo and in vitro, and provide profound insights into RNA biology. Nonetheless, both in vitro and in vivo approaches have limitations: studies in the complex and uncontrolled cellular environment make it difficult to obtain insight into RNA folding pathways and thermodynamics, and studies in vitro often lack direct cellular relevance, leaving a gap in our knowledge of RNA folding in vivo. This gap is being bridged by biophysical and mechanistic studies of RNA structure and function under conditions that mimic the cellular environment. To date, most artificial cytoplasms have used various polymers as molecular crowding agents and a series of small molecules as cosolutes. Studies under such in vivo-like conditions are yielding fresh insights, such as cooperative folding of functional RNAs and increased activity of ribozymes. These observations are accounted for in part by molecular crowding effects and interactions with other molecules. In this review, we report milestones in RNA folding in vitro and in vivo and discuss ongoing experimental and computational efforts to bridge the gap between these two conditions in order to understand how RNA folds in the cell.
|
36
|
Fang R, Moss WN, Rutenberg-Schoenberg M, Simon MD. Probing Xist RNA Structure in Cells Using Targeted Structure-Seq. PLoS Genet 2015; 11:e1005668. [PMID: 26646615 PMCID: PMC4672913 DOI: 10.1371/journal.pgen.1005668] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Received: 05/26/2015] [Accepted: 10/24/2015] [Indexed: 11/19/2022] Open
Abstract
The long non-coding RNA (lncRNA) Xist is a master regulator of X-chromosome inactivation in mammalian cells. Models for how Xist and other lncRNAs function depend on thermodynamically stable secondary and higher-order structures that RNAs can form in the context of a cell. Probing accessible RNA bases can provide data to build models of RNA conformation that provide insight into RNA function, molecular evolution, and modularity. To study the structure of Xist in cells, we built upon recent advances in RNA secondary structure mapping and modeling to develop Targeted Structure-Seq, which combines chemical probing of RNA structure in cells with target-specific massively parallel sequencing. By enriching for signals from the RNA of interest, Targeted Structure-Seq achieves high coverage of the target RNA with relatively few sequencing reads, thus providing a targeted and scalable approach to analyze RNA conformation in cells. We use this approach to probe the full-length Xist lncRNA to develop new models for functional elements within Xist, including the repeat A element in the 5’-end of Xist. This analysis also identified new structural elements in Xist that are evolutionarily conserved, including a new element proximal to the C repeats that is important for Xist function.
To do their jobs, many RNAs need to fold into structures (through base-pairing). We were interested in the conformation of a specific mammalian RNA, Xist, when it is inside a cell. Xist is a very large long non-coding RNA (lncRNA) that is >17,000 nt long. Xist is particularly important because it is one of the first lncRNAs to be discovered, and turns genes off across an entire chromosome. To figure out how Xist RNA is folded in mouse cells, we developed a new approach, Targeted Structure-Seq, to examine the conformation of large RNAs like Xist. Using computer modeling, we identified parts of Xist that are base paired into RNA duplexes. We also determined which parts of the Xist RNA are likely to be structured. This work provides a new tool for studying the secondary structure of any large RNA, and helps us understand what the important pieces of Xist look like while Xist does its work in the cell.
Affiliation(s)
- Rui Fang
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Chemical Biology Institute, Yale University, West Haven, Connecticut, United States of America
| | - Walter N. Moss
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Michael Rutenberg-Schoenberg
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Chemical Biology Institute, Yale University, West Haven, Connecticut, United States of America
| | - Matthew D. Simon
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Chemical Biology Institute, Yale University, West Haven, Connecticut, United States of America
| |
|
37
|
Wagner D, Rinnenthal J, Narberhaus F, Schwalbe H. Mechanistic insights into temperature-dependent regulation of the simple cyanobacterial hsp17 RNA thermometer at base-pair resolution. Nucleic Acids Res 2015; 43:5572-85. [PMID: 25940621 PMCID: PMC4477652 DOI: 10.1093/nar/gkv414] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 01/06/2015] [Accepted: 04/08/2015] [Indexed: 12/16/2022] Open
Abstract
The cyanobacterial hsp17 ribonucleic acid thermometer (RNAT) is one of the smallest naturally occurring RNATs. It forms a single hairpin with an internal 1×3 bulge separating the start codon in stem I from the ribosome binding site (RBS) in stem II. We investigated the temperature-dependent regulation of hsp17 by mapping individual base-pair stabilities from solvent exchange nuclear magnetic resonance (NMR) spectroscopy. The wild-type RNAT was found to be stabilized by two critical CG base pairs (C14-G27 and C13-G28). Replacing the internal 1×3 bulge by a stable CG base pair in hsp17rep significantly increased the global stability and unfolding cooperativity as evidenced by circular dichroism spectroscopy. From the NMR analysis, remote stabilization and non-nearest neighbour effects exist at the base-pair level, in particular for nucleotide G28 (five nucleotides apart from the site of mutation). Individual base-pair stabilities are coupled to the stability of the entire thermometer within both the natural and the stabilized RNATs by enthalpy–entropy compensation presumably mediated by the hydration shell. At the melting point the Gibbs energies of the individual nucleobases are equalized, suggesting a consecutive zipper-type unfolding mechanism of the RBS leading to a dimmer-like function of hsp17 and switch-like regulation behaviour of hsp17rep. The data show how minor changes in the nucleotide sequence not only offset the melting temperature but also alter the mode of temperature sensing. The cyanobacterial thermosensor demonstrates the remarkable adjustment of natural RNATs to execute precise temperature control.
Affiliation(s)
- Dominic Wagner
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance, Johann Wolfgang Goethe-University, Max-von-Laue-Strasse 7, D-60438 Frankfurt/Main, Germany
| | - Jörg Rinnenthal
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance, Johann Wolfgang Goethe-University, Max-von-Laue-Strasse 7, D-60438 Frankfurt/Main, Germany
| | - Franz Narberhaus
- Microbial Biology, Ruhr University, Universitätsstr. 150, D-44780 Bochum, Germany
| | - Harald Schwalbe
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance, Johann Wolfgang Goethe-University, Max-von-Laue-Strasse 7, D-60438 Frankfurt/Main, Germany
| |
|
38
|
Sloma MF, Mathews DH. Improving RNA secondary structure prediction with structure mapping data. Methods Enzymol 2015; 553:91-114. [PMID: 25726462 DOI: 10.1016/bs.mie.2014.10.053] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 12/14/2022]
Abstract
Methods to probe RNA secondary structure, such as small molecule modifying agents, secondary structure-specific nucleases, inline probing, and SHAPE chemistry, are widely used to study the structure of functional RNA. Computational secondary structure prediction programs can incorporate probing data to predict structure with high accuracy. In this chapter, an overview of current methods for probing RNA secondary structure is provided, including modern high-throughput methods. Methods for guiding secondary structure prediction algorithms using these data are explained, and best practices for using these data are provided. This chapter concludes by listing a number of open questions about how to best use probing data, and what these data can provide.
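One widely used way to let probing data guide a prediction algorithm is to convert each nucleotide's SHAPE reactivity into a pseudo-free energy term of the form m·ln(reactivity + 1) + b, added for nucleotides in stacked base pairs. The abstract does not say which scheme the chapter recommends, so the sketch below is only an illustration. The slope and intercept defaults (2.6 and -0.8 kcal/mol) are commonly cited values treated here as assumptions, and the convention that a negative or missing reactivity means "no data" is likewise an assumption.

```python
import math

def shape_pseudo_energy(reactivity, slope=2.6, intercept=-0.8):
    """Pseudo-free energy (kcal/mol) for pairing one nucleotide, given its reactivity.

    slope and intercept are assumed defaults, not values taken from the abstract.
    A reactivity of None or below zero is treated as missing data (assumption).
    """
    if reactivity is None or reactivity < 0:
        return 0.0  # no data: neither bonus nor penalty
    return slope * math.log(reactivity + 1.0) + intercept

# Hypothetical normalized reactivities for five nucleotides.
reactivities = [0.0, 0.1, 0.85, 2.0, None]
terms = [shape_pseudo_energy(r) for r in reactivities]

for r, t in zip(reactivities, terms):
    print(f"reactivity={r}: pseudo-energy={t:+.2f} kcal/mol")
```

Under these assumed values, unreactive nucleotides receive a small bonus for pairing and highly reactive nucleotides receive a penalty, which biases the folding algorithm toward structures consistent with the probing data.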
Affiliation(s)
- Michael F Sloma
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Box 712, Rochester, New York, USA; Center for RNA Biology, University of Rochester Medical Center, Box 712, Rochester, New York, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics, University of Rochester Medical Center, Box 712, Rochester, New York, USA; Center for RNA Biology, University of Rochester Medical Center, Box 712, Rochester, New York, USA.
| |
|