1. Bergonzo C, Grishaev A. Critical Assessment of RNA and DNA Structure Predictions via Artificial Intelligence: The Imitation Game. J Chem Inf Model 2025; 65:3544-3554. [PMID: 40159092 PMCID: PMC12004532 DOI: 10.1021/acs.jcim.5c00245] [Received: 02/05/2025] [Revised: 03/13/2025] [Accepted: 03/17/2025] [Indexed: 04/02/2025]
Abstract
Computational predictions of biomolecular structure via artificial intelligence (AI) based approaches, as exemplified by AlphaFold software, have the potential to model all of life's biomolecules. We performed oligonucleotide structure prediction and gauged the accuracy of the AI-generated models via their agreement with experimental solution-state observables. We find parts of these models in good agreement with experimental data, and others falling short of the ground truth. The latter include internal or capping loops, noncanonical base pairings, and regions involving conformational flexibility, all essential for RNA folding, interactions, and function. We estimate root-mean-square (r.m.s.) errors in predicted nucleotide bond vector orientations ranging between 7° and 30°, with higher accuracies for simpler architectures of individual canonically paired helical stems. These mixed results highlight the necessity of experimental validation of AI-based oligonucleotide model predictions and their current tendency to mimic the training data set rather than reproduce the underlying reality.
Affiliation(s)
- Christina Bergonzo
  - Biomolecular Measurement Division, Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
  - Institute for Bioscience and Biotechnology Research, Rockville, Maryland 20850, United States
- Alexander Grishaev
  - Biomolecular Measurement Division, Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
  - Institute for Bioscience and Biotechnology Research, Rockville, Maryland 20850, United States

2. Love O, Galindo-Murillo R, Roe DR, Dans PD, Cheatham III TE, Bergonzo C. modXNA: A Modular Approach to Parametrization of Modified Nucleic Acids for Use with Amber Force Fields. J Chem Theory Comput 2024; 20:9354-9363. [PMID: 39468889 PMCID: PMC11562377 DOI: 10.1021/acs.jctc.4c01164] [Received: 09/05/2024] [Revised: 10/21/2024] [Accepted: 10/22/2024] [Indexed: 10/30/2024]
Abstract
Modified nucleic acids have surged as a popular therapeutic route, emphasizing the importance of nucleic acid research in drug discovery and development. Beyond well-known RNA vaccines, antisense oligonucleotides and aptamers can incorporate various modified nucleic acids to target specific biomolecules for various therapeutic activities. Molecular dynamics simulations can accelerate the design and development of these systems with noncanonical nucleic acids by observing intricate dynamic properties and relative stability on the all-atom level. However, modeling these modified systems is challenging due to the time and resources required to parametrize components outside default force field parameters. Here, we present modXNA, a tool to derive and build modified nucleotides for use with Amber force fields. Several nucleic acid systems varying in size and number of modification sites were used to evaluate the accuracy of modXNA parameters, and results indicate the dynamics and structure are preserved throughout the simulations. We detail the protocol for quantum mechanics charge derivation and describe a workflow for implementing modXNA in Amber molecular dynamics simulations, which includes updates and added features to CPPTRAJ.
Affiliation(s)
- Olivia Love
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, 2000 East 30 South Skaggs 306, Salt Lake City, Utah 84112, United States
- Rodrigo Galindo-Murillo
  - Department of Medicinal Chemistry, Ionis Pharmaceuticals, 2855 Gazelle Court, Carlsbad, California 92010, United States
- Daniel R. Roe
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Pablo D. Dans
  - Computational Biophysics Group, Department of Biological Sciences, CENUR Litoral Norte, Universidad de la República, Salto 50000, Uruguay
  - Bioinformatics Unit, Institute Pasteur of Montevideo, Montevideo 11400, Uruguay
- Thomas E. Cheatham III
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, 2000 East 30 South Skaggs 306, Salt Lake City, Utah 84112, United States
- Christina Bergonzo
  - Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology and the University of Maryland, 9600 Gudelsky Way, Rockville, Maryland 20850, United States

3. Sannapureddi RKR, Mohanty MK, Salmon L, Sathyamoorthy B. Conformational Plasticity of Parallel G-Quadruplex: Implications on Duplex-Quadruplex Motifs. J Am Chem Soc 2023. [PMID: 37428641 DOI: 10.1021/jacs.3c03218] [Indexed: 07/12/2023]
Abstract
DNA G-quadruplexes are essential motifs in molecular biology performing a wide range of functions enabled by their unique and diverse structures. In this study, we focus on the conformational plasticity of the most abundant and biologically relevant parallel G-quadruplex topology. A multipronged approach of structure survey, solution-state NMR spectroscopy, and molecular dynamics simulations unravels subtle yet essential features of the parallel G-quadruplex topology. Stark differences in flexibility are observed for the nucleotides depending upon their positioning in the tetrad planes that are intricately correlated with the conformational sampling of the propeller loop. Importantly, the terminal nucleotides at the 5'-end versus the 3'-end of the parallel quadruplex display differential dynamics that manifest their ability to accommodate a duplex on either end of the G-quadruplex. The conformational plasticity characterized in this study provides essential cues toward biomolecular processes such as small-molecule binding and intermolecular quadruplex stacking, and has implications for how a duplex influences the structure of a neighboring quadruplex.
Affiliation(s)
- Manish Kumar Mohanty
  - Department of Chemistry, Indian Institute of Science Education and Research, Bhopal 462066, India
- Loïc Salmon
  - Centre de RMN à Très Hauts Champs, UMR 5082 (CNRS, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1), University of Lyon, Villeurbanne 69100, France
- Bharathwaj Sathyamoorthy
  - Department of Chemistry, Indian Institute of Science Education and Research, Bhopal 462066, India

4. Hussain A, Paukovich N, Henen MA, Vögeli B. Advances in the exact nuclear Overhauser effect 2018-2022. Methods 2022; 206:87-98. [PMID: 35985641 PMCID: PMC9596134 DOI: 10.1016/j.ymeth.2022.08.006] [Received: 05/21/2022] [Revised: 08/05/2022] [Accepted: 08/12/2022] [Indexed: 11/26/2022]
Abstract
The introduction of the exact nuclear Overhauser enhancement (eNOE) methodology to solution-state nuclear magnetic resonance (NMR) spectroscopy results in tighter distance restraints from NOEs than in conventional analysis. These improved restraints allow for higher resolution in structure calculation and even the disentanglement of different conformations of macromolecules. While initial work primarily focused on technical development of the eNOE, structural studies aimed at the elucidation of spatial sampling in proteins and nucleic acids were published in parallel prior to 2018. The period of 2018-2022 saw a continued series of technical innovations, but also major applications addressing biological questions. Here, we review both aspects, covering topics from the implementation of non-uniform sampling of NOESY buildups, novel pulse sequences, adaptation of the eNOE to solid-state NMR, advances in eNOE data analysis, and innovations in structural ensemble calculation, to applications to protein, RNA, and DNA structure elucidation.
Affiliation(s)
- Alya Hussain
  - Department of Biochemistry & Molecular Genetics, School of Medicine, University of Colorado, 12801 E. 17th Avenue, Aurora, CO 80045, USA
- Natasia Paukovich
  - Department of Biochemistry & Molecular Genetics, School of Medicine, University of Colorado, 12801 E. 17th Avenue, Aurora, CO 80045, USA
- Morkos A Henen
  - Department of Biochemistry & Molecular Genetics, School of Medicine, University of Colorado, 12801 E. 17th Avenue, Aurora, CO 80045, USA
  - Department of Pharmaceutical Organic Chemistry, Faculty of Pharmacy, Mansoura University, Mansoura 35516, Egypt
- Beat Vögeli
  - Department of Biochemistry & Molecular Genetics, School of Medicine, University of Colorado, 12801 E. 17th Avenue, Aurora, CO 80045, USA

5. Bergonzo C, Grishaev A, Bottaro S. Conformational heterogeneity of UCAAUC RNA oligonucleotide from molecular dynamics simulations, SAXS, and NMR experiments. RNA (New York, N.Y.) 2022; 28:937-946. [PMID: 35483823 PMCID: PMC9202585 DOI: 10.1261/rna.078888.121] [Received: 07/02/2021] [Accepted: 03/17/2022] [Indexed: 06/14/2023]
Abstract
We describe the conformational ensemble of the single-stranded r(UCAAUC) oligonucleotide obtained using extensive molecular dynamics (MD) simulations and Rosetta's FARFAR2 algorithm. The conformations observed in MD consist of A-form-like structures and variations thereof. These structures are not present in the pool generated using FARFAR2. By comparing with available nuclear magnetic resonance (NMR) measurements, we show that the presence of both A-form-like and other extended conformations is necessary to quantitatively explain experimental data. To further validate our results, we measure solution X-ray scattering (SAXS) data on the RNA hexamer and find that simulations result in more compact structures than observed from these experiments. The integration of simulations with NMR via a maximum entropy approach shows that small modifications to the MD ensemble lead to an improved description of the conformational ensemble. Nevertheless, we identify persisting discrepancies in matching experimental SAXS data.
Affiliation(s)
- Christina Bergonzo
  - National Institute of Standards and Technology and Institute for Bioscience and Biotechnology Research, Rockville, Maryland 20850, USA
- Alexander Grishaev
  - National Institute of Standards and Technology and Institute for Bioscience and Biotechnology Research, Rockville, Maryland 20850, USA
- Sandro Bottaro
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
  - Department of Biomedical Sciences, Humanitas University, 20090 Pieve Emanuele, Italy

6. Case DA. Using quantum chemistry to estimate chemical shifts in biomolecules. Biophys Chem 2020; 267:106476. [PMID: 33035752 PMCID: PMC7686263 DOI: 10.1016/j.bpc.2020.106476] [Received: 08/05/2020] [Revised: 09/08/2020] [Accepted: 09/08/2020] [Indexed: 01/17/2023]
Abstract
An automated fragmentation quantum mechanics/molecular mechanics approach (AFNMR) has shown promising results in chemical shift calculations for biomolecules. Sample results for ubiquitin, and an RNA hairpin and helix are presented, and used to illustrate recent directions in quantum calculations. Trends in chemical shift are stable with regard to changes in density functional or basis set, and the use of the small "pcSseg-0" basis, which was optimized for chemical shift prediction [1], opens the way to more extensive conformational averaging, which can often be necessary, even for fairly well-defined structures.
Affiliation(s)
- David A Case
  - Dept. of Chemistry & Chemical Biology, Piscataway, NJ 08854, United States

7. Using All-Atom Potentials to Refine RNA Structure Predictions of SARS-CoV-2 Stem Loops. Int J Mol Sci 2020; 21:6188. [PMID: 32867123 PMCID: PMC7504604 DOI: 10.3390/ijms21176188] [Received: 07/31/2020] [Revised: 08/19/2020] [Accepted: 08/25/2020] [Indexed: 12/30/2022]
Abstract
A considerable amount of rapid-paced research is underway to combat the SARS-CoV-2 pandemic. In this work, we assess the 3D structure of the 5′ untranslated region of its RNA, in the hope that stable secondary structures can be targeted, interrupted, or otherwise measured. To this end, we have combined molecular dynamics simulations with previous nuclear magnetic resonance measurements for stem loop 2 of SARS-CoV-1 to refine 3D structure predictions of that stem loop. We find that relatively short sampling times allow for loop rearrangement from predicted structures determined in the absence of water or ions to structures better aligned with experimental data. We then use molecular dynamics to predict the refined structure of the transcription regulatory leader sequence (TRS-L) region, which includes stem loop 3, and show that arrangement of the loop around exchangeable monovalent potassium can explain the conformational equilibrium determined by in-cell dimethyl sulfate (DMS) data.

8. Accuracy of MD solvent models in RNA structure refinement assessed via liquid-crystal NMR and spin relaxation data. J Struct Biol 2019; 207:250-259. [PMID: 31279068 DOI: 10.1016/j.jsb.2019.07.001] [Received: 11/30/2018] [Revised: 06/24/2019] [Accepted: 07/01/2019] [Indexed: 11/20/2022]
Abstract
Molecular dynamics (MD) simulations play an important role in characterizing ribonucleic acid (RNA) structure, augmenting information from experimental techniques such as nuclear magnetic resonance (NMR). In this work, we examine the accuracy of structural representation resulting from application of a number of explicit and implicit solvent models and refinement protocols against experimental data, ranging from a high density of residual dipolar coupling (RDC) restraints to completely unrestrained simulations. For a prototype A-form RNA helix, our results indicate that the AMBER RNA force field with either implicit or explicit solvent can produce a realistic dynamic representation of RNA helical structure, accurately cross-validating with respect to a diverse array of NMR observables. In refinement against NMR distance restraints, modern MD force fields are found to be equally adequate, with high-fidelity cross-validation to the RDCs and residual chemical shift anisotropies (RCSAs), while slightly overestimating structural order as monitored via NMR relaxation data. With restraints trimmed to encode only base pairing information, cross-validation quality significantly deteriorates, now exhibiting a pronounced dependence on the choice of the solvent model. This deterioration is found to be partially reversible by increasing planarity restraints on the nucleobase geometry. For completely unrestrained MD simulations, the choice of water model becomes very important, with the best-performing TIP4P-Ew accurately reproducing both the RDC and RCSA data, while closely matching the NMR-derived order parameters. The information provided here will serve as a foundation for MD-based refinement of solution-state NMR structures of RNA.