1. Zuo Y, Chen H, Yang L, Chen R, Zhang X, Deng Z. Research progress on prediction of RNA-protein binding sites in the past five years. Anal Biochem 2024; 691:115535. [PMID: 38643894 DOI: 10.1016/j.ab.2024.115535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 04/23/2024]
Abstract
Accurately predicting RNA-protein binding sites is essential for a deeper understanding of protein-RNA interactions and their regulatory mechanisms, which are fundamental to gene expression and regulation. However, conventional biological approaches to detecting these sites are often costly and time-consuming. In contrast, computational methods for predicting RNA-protein binding sites are both cost-effective and fast. This review synthesizes existing computational methods and summarizes the databases commonly used for predicting RNA-protein binding sites. In addition, it presents applications and innovations of computational methods that used traditional machine learning and deep learning for RNA-protein binding site prediction during 2018-2023. These methods cover a wide range of aspects, such as effective database utilization, feature selection and encoding, innovative classification algorithms, and evaluation strategies. After exploring the limitations of existing computational methods, this paper discusses potential directions for future development. DeepRKE, RDense, and DeepDW all employ convolutional neural networks and long short-term memory networks to construct prediction models, yet their algorithm design and feature encoding differ, resulting in different prediction performance.
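The feature-encoding step mentioned in this abstract can be illustrated with a minimal sketch. One-hot encoding is used here only as a common, generic choice; the abstract does not state which encodings DeepRKE, RDense, or DeepDW use, and the function name `one_hot_encode` is hypothetical.

```python
def one_hot_encode(seq, alphabet="ACGU"):
    """Encode an RNA sequence as a list of one-hot rows (one row per base).

    Assumption for illustration: DNA-style 'T' is treated as 'U', and any
    symbol outside the alphabet (for example 'N') becomes an all-zero row.
    """
    index = {base: i for i, base in enumerate(alphabet)}
    encoded = []
    for base in seq.upper().replace("T", "U"):
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded
```

A matrix of this shape (sequence length by four) is the kind of input a convolutional layer would typically scan for local sequence motifs.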
Affiliation(s)
- Yun Zuo
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Huixian Chen
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Lele Yang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Ruoyan Chen
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Xiaoyao Zhang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Zhaohong Deng
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China.

2. Zhou Y, Chen SJ. Advances in machine-learning approaches to RNA-targeted drug design. Artificial Intelligence Chemistry 2024; 2:100053. [PMID: 38434217 PMCID: PMC10904028 DOI: 10.1016/j.aichem.2024.100053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2024]
Abstract
RNA molecules play multifaceted functional and regulatory roles within cells and have garnered significant attention in recent years as promising therapeutic targets. With remarkable successes achieved by artificial intelligence (AI) in different fields such as computer vision and natural language processing, there is a growing imperative to harness AI's potential in computer-aided drug design (CADD) to discover novel drug compounds that target RNA. Although machine-learning (ML) approaches have been widely adopted in the discovery of small molecules targeting proteins, the application of ML approaches to model interactions between RNA and small molecule is still in its infancy. Compared to protein-targeted drug discovery, the major challenges in ML-based RNA-targeted drug discovery stem from the scarcity of available data resources. With the growing interest and the development of curated databases focusing on interactions between RNA and small molecule, the field anticipates a rapid growth and the opening of a new avenue for disease treatment. In this review, we aim to provide an overview of recent advancements in computationally modeling RNA-small molecule interactions within the context of RNA-targeted drug discovery, with a particular emphasis on methodologies employing ML techniques.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA

3. Sabei A, Hognon C, Martin J, Frezza E. Dynamics of Protein-RNA Interfaces Using All-Atom Molecular Dynamics Simulations. J Phys Chem B 2024; 128:4865-4886. [PMID: 38740056 DOI: 10.1021/acs.jpcb.3c07698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Facing the current challenges posed by human diseases requires understanding cell machinery at the molecular level. The interplay between proteins and RNA is key to any physiological phenomenon, as are protein-RNA interactions. To understand these interactions, many experimental techniques have been developed, spanning a very wide range of spatial and temporal resolutions. In particular, knowledge of the three-dimensional structures of protein-RNA complexes provides structural, mechanical, and dynamical information essential to understanding their functions. To gain insight into the dynamics of protein-RNA complexes, we carried out all-atom molecular dynamics simulations in explicit solvent on nine protein-RNA complexes with different functions and interface sizes, taking into account the bound and unbound forms. First, we characterized structural changes upon binding and, for the RNA part, the change in puckering. Second, we extensively analyzed the interfaces, their dynamics and structural properties, and the structural waters involved in binding, as well as the contacts mediated by them. Based on our analysis, the interfaces rearranged during the simulation time, showing alternative and stable residue-residue contacts with respect to the experimental structure.
Affiliation(s)
- Afra Sabei
  - Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
- Cécilia Hognon
  - Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
- Juliette Martin
  - Univ Lyon, Université Claude Bernard Lyon 1, CNRS, UMR 5086 MMSB, Lyon 69367, France
  - Laboratory of Biology and Modeling of the Cell, Université de Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, Inserm U1293, Lyon 69367, France
- Elisa Frezza
  - Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France

4. Heng Tan L, Keong Kwoh C, Mu Y. RmsdXNA: RMSD prediction of nucleic acid-ligand docking poses using machine-learning method. Brief Bioinform 2024; 25:bbae166. [PMID: 38695120 PMCID: PMC11063749 DOI: 10.1093/bib/bbae166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/15/2024] [Accepted: 03/19/2024] [Indexed: 05/04/2024] [Open Access]
Abstract
Small-molecule drugs can be used to target nucleic acids (NA) to regulate biological processes. Computational modeling methods, such as molecular docking or scoring functions, are commonly employed to facilitate drug design. However, the accuracy of the scoring function in predicting the closest-to-native docking pose is often suboptimal. To overcome this problem, a machine learning model, RmsdXNA, was developed to predict the root-mean-square deviation (RMSD) of ligand docking poses in NA complexes. The versatility of RmsdXNA has been demonstrated by its successful application to various complexes involving different types of NA receptors and ligands, including metal complexes and short peptides. The RMSD predicted by RmsdXNA was strongly correlated with the actual RMSD of the docked poses. RmsdXNA also outperformed the rDock scoring function in ranking and identifying closest-to-native docking poses across different structural groups and on the testing dataset. Using experimentally validated results for the polyadenylated nuclear element for nuclear expression triplex, RmsdXNA demonstrated better screening power for the RNA-small molecule complex compared to rDock. Molecular dynamics simulations were subsequently employed to validate the binding of top-scoring ligand candidates selected by RmsdXNA and rDock on MALAT1. The results showed that RmsdXNA has a higher success rate in identifying promising ligands that can bind well to the receptor. The development of an accurate docking score for an NA-ligand complex can aid drug discovery and development. The code to use RmsdXNA is available at the GitHub repository https://github.com/laiheng001/RmsdXNA.
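For readers unfamiliar with the quantity that RmsdXNA predicts, the following is a minimal sketch of how a pose RMSD is computed from matched atom coordinates. It is not taken from the RmsdXNA repository. It assumes that both poses list the same atoms in the same order and that they already share the receptor's coordinate frame, so no superposition is applied; the function name `pose_rmsd` is hypothetical.

```python
import math


def pose_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two poses of the same ligand.

    Each argument is a sequence of (x, y, z) tuples in angstroms.
    Assumption: atoms are matched by index and no fitting is performed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("poses must contain the same non-zero number of atoms")
    squared = 0.0
    for atom_a, atom_b in zip(coords_a, coords_b):
        # Sum the squared displacement of this atom over x, y and z.
        squared += sum((a - b) ** 2 for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))
```

A symmetry-aware implementation would additionally consider equivalent atom orderings, which this sketch deliberately leaves out.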
Affiliation(s)
- Lai Heng Tan
  - Interdisciplinary Graduate School, Nanyang Technological University, 61 Nanyang Drive, 637335 Singapore, Singapore
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798 Singapore, Singapore
- Yuguang Mu
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, 637551 Singapore, Singapore

5. Chen L, Yu Z, Wu Z, Zhou M, Wang Y, Yu X, Li W, Liu G, Tang Y. AptaDB: a comprehensive database integrating aptamer-target interactions. RNA (New York, N.Y.) 2024; 30:189-199. [PMID: 38164624 PMCID: PMC10870366 DOI: 10.1261/rna.079784.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 12/12/2023] [Indexed: 01/03/2024]
Abstract
Aptamers have emerged as a research hotspot owing to their excellent performance and application potential in pharmacology, medicine, and analytical chemistry. Despite numerous aptamer investigations, the lack of comprehensive data integration has hindered the development of computational methods for aptamers and the reuse of aptamers. A public access database named AptaDB, derived from experimentally validated data manually collected from the literature, was therefore developed. It integrates comprehensive aptamer-related data in six key components: (i) experimentally validated aptamer-target interaction information, (ii) aptamer property information, (iii) aptamer structure information, (iv) target information, (v) experimental activity information, and (vi) algorithmically calculated similar aptamers. AptaDB currently contains 1350 experimentally validated aptamer-target interactions, 1230 binding affinity constants, 1293 aptamer sequences, and more. It contains twice as many entries as other available aptamer databases. The collection and integration of the above information categories is unique among available aptamer databases, and AptaDB provides a user-friendly interface. AptaDB will also be continuously updated as aptamer research evolves. We expect that AptaDB will become a powerful resource for rational aptamer design and a valuable tool for aptamer screening in the future. For access to AptaDB, please visit http://lmmd.ecust.edu.cn/aptadb/.
Affiliation(s)
- Long Chen
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Zhuohang Yu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Zengrui Wu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Moran Zhou
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yimeng Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Xinxin Yu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Weihua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Guixia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China

6. Harini K, Sekijima M, Gromiha MM. PRA-Pred: Structure-based prediction of protein-RNA binding affinity. Int J Biol Macromol 2024; 259:129490. [PMID: 38224813 DOI: 10.1016/j.ijbiomac.2024.129490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 01/10/2024] [Accepted: 01/12/2024] [Indexed: 01/17/2024]
Abstract
Understanding the crucial factors that affect the binding affinity of protein-RNA complexes is vital for comprehending their recognition mechanisms. This study involved compiling experimentally measured binding affinity (ΔG) values of 217 protein-RNA complexes and extracting numerous structure-based features, considering RNA, protein, and interactions between protein and RNA. Our findings indicate the significance of RNA base-step parameters, interaction energies, number of atomic contacts in the complex, hydrogen bonds, and contact potentials in understanding the binding affinity. Further, we observed that these factors are influenced by the type of RNA strand and the function of the protein in a protein-RNA complex. Multiple regression equations were developed for different classes of complexes to predict the binding affinity between the protein and RNA. We evaluated the models using the jack-knife test and achieved an overall correlation of 0.77 between the experimental and predicted binding affinities, with a mean absolute error of 1.02 kcal/mol. Furthermore, we introduced a web server, PRA-Pred, intended for the prediction of protein-RNA binding affinity, and it is freely accessible through https://web.iitm.ac.in/bioinfo2/prapred/. We propose that our approach could function as a potential resource for investigating protein-RNA recognition and developing therapeutic strategies.
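The evaluation scheme described in this abstract (jack-knife test, correlation, mean absolute error) can be sketched as follows. This is an illustration under simplifying assumptions, not the PRA-Pred procedure: it fits a regression on a single feature rather than the multiple regression equations of the study, and all function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_predictions(xs, ys):
    """Predict each sample from a model trained on all the other samples."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return predictions


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_absolute_error(a, b):
    """Average absolute difference, in the units of the inputs (e.g. kcal/mol)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

Because every prediction comes from a model that never saw the held-out complex, the resulting correlation and error are less optimistic than values computed on the training data.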
Affiliation(s)
- K Harini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- M Sekijima
  - Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama, 226-8501, Japan; Department of Computer Science, National University of Singapore, Singapore.

7. Rigden DJ, Fernández XM. The 2024 Nucleic Acids Research database issue and the online molecular biology database collection. Nucleic Acids Res 2024; 52:D1-D9. [PMID: 38035367 PMCID: PMC10767945 DOI: 10.1093/nar/gkad1173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 11/23/2023] [Indexed: 12/02/2023] [Open Access]
Abstract
The 2024 Nucleic Acids Research database issue contains 180 papers from across biology and neighbouring disciplines. There are 90 papers reporting on new databases and 83 updates from resources previously published in the Issue. Updates from databases most recently published elsewhere account for a further seven. Nucleic acid databases include the new NAKB for structural information and updates from Genbank, ENA, GEO, Tarbase and JASPAR. The Issue's Breakthrough Article concerns NMPFamsDB for novel prokaryotic protein families and the AlphaFold Protein Structure Database has an important update. Metabolism is covered by updates from Reactome, Wikipathways and Metabolights. Microbes are covered by RefSeq, UNITE, SPIRE and P10K; viruses by ViralZone and PhageScope. Medically-oriented databases include the familiar COSMIC, Drugbank and TTD. Genomics-related resources include Ensembl, UCSC Genome Browser and Monarch. New arrivals cover plant imaging (OPIA and PlantPAD) and crop plants (SoyMD, TCOD and CropGS-Hub). The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). Over the last year the NAR online Molecular Biology Database Collection has been updated, reviewing 1060 entries, adding 97 new resources and eliminating 388 discontinued URLs bringing the current total to 1959 databases. It is available at http://www.oxfordjournals.org/nar/database/c/.
Affiliation(s)
- Daniel J Rigden
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK

8. Lawson CL, Berman H, Chen L, Vallat B, Zirbel C. The Nucleic Acid Knowledgebase: a new portal for 3D structural information about nucleic acids. Nucleic Acids Res 2024; 52:D245-D254. [PMID: 37953312 PMCID: PMC10767938 DOI: 10.1093/nar/gkad957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 10/02/2023] [Accepted: 10/16/2023] [Indexed: 11/14/2023] [Open Access]
Abstract
The Nucleic Acid Knowledgebase (nakb.org) is a new data resource, updated weekly, for experimentally determined 3D structures containing DNA and/or RNA nucleic acid polymers and their biological assemblies. NAKB indexes nucleic acid-containing structures derived from all major structure determination methods (X-ray, NMR and EM), including all held by the Protein Data Bank (PDB). As the planned successor to the Nucleic Acid Database (NDB), NAKB's design preserves all functionality of the NDB and provides novel nucleic acid-centric content, including structural and functional annotations, as well as annotations from and links to external resources. A variety of custom interactive tools have been developed to enable rapid exploration and drill-down of NAKB's content.
Affiliation(s)
- Catherine L Lawson
  - Institute for Quantitative Biomedicine, Rutgers, State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Helen M Berman
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Li Chen
  - Institute for Quantitative Biomedicine, Rutgers, State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Brinda Vallat
  - Institute for Quantitative Biomedicine, Rutgers, State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Craig L Zirbel
  - Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA

9. Sarrazin-Gendron R, Waldispühl J, Reinharz V. Classification and Identification of Non-canonical Base Pairs and Structural Motifs. Methods Mol Biol 2024; 2726:143-168. [PMID: 38780731 DOI: 10.1007/978-1-0716-3519-3_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The 3D structures of many ribonucleic acid (RNA) loops are characterized by highly organized networks of non-canonical interactions. Multiple computational methods have been developed to annotate structures with those interactions or automatically identify recurrent interaction networks. By contrast, the reverse problem, which aims to retrieve the geometry of a loop from its sequence or ensemble of interactions, remains much less explored. In this chapter, we will describe how to retrieve and build families of conserved structural motifs using their underlying network of non-canonical interactions. Then, we will show how to assign sequence alignments to those families and use the software BayesPairing to build statistical models of structural motifs with their associated sequence alignments. From this model, we will apply BayesPairing to identify regions in new sequences where those loop geometries can occur.
Affiliation(s)
- Vladimir Reinharz
  - Department of Computer Science, Université du Québec à Montréal, Montreal, QC, Canada.

10. Haack DB, Rudolfs B, Zhang C, Lyumkis D, Toor N. Structural basis of branching during RNA splicing. Nat Struct Mol Biol 2024; 31:179-189. [PMID: 38057551 PMCID: PMC10968580 DOI: 10.1038/s41594-023-01150-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 10/04/2023] [Indexed: 12/08/2023]
Abstract
Branching is a critical step in RNA splicing that is essential for 5' splice site selection. Recent spliceosome structures have led to competing models for the recognition of the invariant adenosine at the branch point. However, there are no structures of any splicing complex with the adenosine nucleophile docked in the active site and positioned to attack the 5' splice site. Thus we lack a mechanistic understanding of adenosine selection and splice site recognition during RNA splicing. Here we present a cryo-electron microscopy structure of a group II intron that reveals that active site dynamics are coupled to the formation of a base triple within the branch-site helix that positions the 2'-OH of the adenosine for nucleophilic attack on the 5' scissile phosphate. This structure, complemented with biochemistry and comparative analyses to splicing complexes, supports a base triple model of adenosine recognition for branching within group II introns and the evolutionarily related spliceosome.
Affiliation(s)
- Daniel B Haack
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA.
- Boris Rudolfs
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA
- Cheng Zhang
  - Salk Institute, La Jolla, CA, USA
  - Amgen, Thousand Oaks, CA, USA
- Navtej Toor
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA.

11. Haymaker A, Bardin AA, Gonen T, Martynowycz MW, Nannenga BL. Structure determination of a DNA crystal by MicroED. Structure 2023; 31:1499-1503.e2. [PMID: 37541248 PMCID: PMC10805983 DOI: 10.1016/j.str.2023.07.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/28/2023] [Revised: 06/21/2023] [Accepted: 07/11/2023] [Indexed: 08/06/2023]
Abstract
Microcrystal electron diffraction (MicroED) is a powerful tool for determining high-resolution structures of microcrystals from a diverse array of biomolecular, chemical, and material samples. In this study, we apply MicroED to DNA crystals, which have not been previously analyzed using this technique. We utilized the d(CGCGCG)2 DNA duplex as a model sample and employed cryo-FIB milling to create thin lamellae for diffraction data collection. The MicroED data collection and subsequent processing resulted in a 1.10 Å resolution structure of the d(CGCGCG)2 DNA, demonstrating the successful application of cryo-FIB milling and MicroED to the investigation of nucleic acid crystals.
Affiliation(s)
- Alison Haymaker
  - Biodesign Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, 727 East Tyler Street, Tempe, AZ 85287, USA; School for Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, USA
- Andrey A Bardin
  - Biodesign Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, 727 East Tyler Street, Tempe, AZ 85287, USA; School for Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, USA
- Tamir Gonen
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA; Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Michael W Martynowycz
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA.
- Brent L Nannenga
  - Biodesign Center for Applied Structural Discovery, Biodesign Institute, Arizona State University, 727 East Tyler Street, Tempe, AZ 85287, USA; School for Engineering of Matter, Transport and Energy, Arizona State University, Tempe, AZ, USA.

12. Malhotra S, Mulvaney T, Cragnolini T, Sidhu H, Joseph A, Beton J, Topf M. RIBFIND2: Identifying rigid bodies in protein and nucleic acid structures. Nucleic Acids Res 2023; 51:9567-9575. [PMID: 37670532 PMCID: PMC10570027 DOI: 10.1093/nar/gkad721] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/06/2022] [Revised: 08/10/2023] [Accepted: 08/21/2023] [Indexed: 09/07/2023] [Open Access]
Abstract
Molecular structures are often fitted into cryo-EM maps by flexible fitting. When this requires large conformational changes, identifying rigid bodies can help optimize the model-map fit. Tools for identifying rigid bodies in protein structures exist; however, an equivalent for nucleic acid structures is lacking. With the increase in cryo-EM maps containing RNA and progress in RNA structure prediction, there is a need for such tools. We previously developed RIBFIND, a program for clustering protein secondary structures into rigid bodies. In RIBFIND2, this approach is extended to nucleic acid structures. RIBFIND2 can identify biologically relevant rigid bodies in important groups of complex RNA structures, capturing a wide range of dynamics, including large rigid-body movements. The usefulness of RIBFIND2-assigned rigid bodies in cryo-EM model refinement was demonstrated on three examples, with two conformations each: Group II Intron complexed IEP, Internal Ribosome Entry Site and the Processome, using cryo-EM maps at 2.7-5 Å resolution. A hierarchical refinement approach, performed on progressively smaller sets of RIBFIND2 rigid bodies, was clearly shown to have an advantage over classical all-atom refinement. RIBFIND2 is available via a web server with structure visualization and as a standalone tool.
Affiliation(s)
- Sony Malhotra
  - Science and Technology Facilities Council, Scientific Computing, Research Complex at Harwell, Didcot OX11 0FA, UK
- Thomas Mulvaney
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Centre for Structural Systems Biology, Hamburg D-22607, Germany
  - Universitätsklinikum Hamburg Eppendorf (UKE), Hamburg 20246, Germany
- Tristan Cragnolini
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK
- Haneesh Sidhu
  - Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK
- Agnel P Joseph
  - Science and Technology Facilities Council, Scientific Computing, Research Complex at Harwell, Didcot OX11 0FA, UK
- Joseph G Beton
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Centre for Structural Systems Biology, Hamburg D-22607, Germany
- Maya Topf
  - Leibniz Institute of Virology, Hamburg 20251, Germany
  - Centre for Structural Systems Biology, Hamburg D-22607, Germany
  - Universitätsklinikum Hamburg Eppendorf (UKE), Hamburg 20246, Germany

13. Tang M, Hwang K, Kang SH. StemP: A Fast and Deterministic Stem-Graph Approach for RNA Secondary Structure Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3278-3291. [PMID: 37028040 DOI: 10.1109/tcbb.2023.3253049] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/19/2023]
Abstract
We propose a new deterministic methodology to predict the secondary structure of RNA sequences. What information about stems is important for structure prediction, and is it enough? The proposed simple deterministic algorithm uses minimum stem length, Stem-Loop score, and co-existence of stems to give good structure predictions for short RNA and tRNA sequences. The main idea is to consider all possible stems with a certain stem-loop energy and strength to predict RNA secondary structure. We use graph notation, where stems are represented as vertices and co-existence between stems as edges. This full Stem-graph presents all possible folding structures, and we pick the sub-graph(s) which give the best matching energy for structure prediction. The Stem-Loop score adds structure information and speeds up the computation. The proposed method can predict secondary structure even with pseudoknots. One of the strengths of this approach is the simplicity and flexibility of the algorithm, and it gives a deterministic answer. Numerical experiments are done on various sequences from the Protein Data Bank and the Gutell Lab using a laptop, and results take only a few seconds.
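The stem-graph idea in this abstract (stems as vertices, co-existence as edges, then selection of a sub-graph) can be sketched as below. This is not the StemP implementation: the pairing rules, the default thresholds, and especially the selection criterion (total number of base pairs as a stand-in for the paper's matching energy and Stem-Loop score) are assumptions made for illustration, and all names are hypothetical.

```python
from itertools import combinations

# Watson-Crick pairs plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_stem_len=3, min_loop_len=3):
    """Return maximal stems as (i, j, length): i+k pairs with j-k for k < length."""
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely continue a stem starting further out.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while at least min_loop_len bases stay unpaired.
            while (i + length < j - length - min_loop_len
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_stem_len:
                stems.append((i, j, length))
    return stems


def positions(stem):
    """All nucleotide indices used by a stem."""
    i, j, length = stem
    return set(range(i, i + length)) | set(range(j - length + 1, j + 1))


def build_stem_graph(stems):
    """Edges join stems that can co-exist, i.e. that share no nucleotide.

    Crossing stems are allowed, so pseudoknots are not excluded.
    """
    return {frozenset((s, t)) for s, t in combinations(stems, 2)
            if not positions(s) & positions(t)}


def best_compatible_set(stems, edges):
    """Exhaustively pick the clique with the most base pairs (toy criterion)."""
    best = []

    def extend(chosen, candidates):
        nonlocal best
        if sum(s[2] for s in chosen) > sum(s[2] for s in best):
            best = list(chosen)
        for k, s in enumerate(candidates):
            rest = [t for t in candidates[k + 1:] if frozenset((s, t)) in edges]
            extend(chosen + [s], rest)

    extend([], stems)
    return best
```

The exhaustive search is practical only for short sequences with few candidate stems; raising `min_stem_len` shrinks the graph considerably.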
14. Jiang D, Zhao H, Du H, Deng Y, Wu Z, Wang J, Zeng Y, Zhang H, Wang X, Wu J, Hsieh CY, Hou T. How Good Are Current Docking Programs at Nucleic Acid-Ligand Docking? A Comprehensive Evaluation. J Chem Theory Comput 2023; 19:5633-5647. [PMID: 37480347 DOI: 10.1021/acs.jctc.3c00507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2023]
Abstract
Nucleic acid (NA)-ligand interactions are of paramount importance in a variety of biological processes, including cellular reproduction and protein biosynthesis, and therefore, NAs have been broadly recognized as potential drug targets. Understanding NA-ligand interactions at the atomic scale is essential for investigating the molecular mechanism and further assisting in NA-targeted drug discovery. Molecular docking is one of the predominant computational approaches for predicting the interactions between NAs and small molecules. Despite the availability of versatile docking programs, their performance profiles for NA-ligand complexes have not been thoroughly characterized. In this study, we first compiled the largest structure-based NA-ligand binding data set to date, containing 800 noncovalent NA-ligand complexes with clearly identified ligands. Based on this extensive data set, eight frequently used docking programs, including six protein-ligand docking programs (LeDock, Surflex-Dock, UCSF Dock6, AutoDock, AutoDock Vina, and PLANTS) and two specific NA-ligand docking programs (rDock and RLDOCK), were systematically evaluated in terms of binding pose and binding affinity predictions. The results demonstrated that some protein-ligand docking programs, specifically PLANTS and LeDock, produced more promising or comparable results compared with the specialized NA-ligand docking programs. Among the programs evaluated, PLANTS, rDock, and LeDock showed the highest performance in binding pose prediction, and their top-1 and best root-mean-square deviation (rmsd) success rates were as follows: PLANTS (35.93 and 76.05%), rDock (27.25 and 72.16%), and LeDock (27.40 and 64.37%). Compared with the moderate level of binding pose prediction, few programs were successful in binding affinity prediction, and the best correlation (Rp = -0.461) was observed with PLANTS. 
Finally, further comparison with the latest NA-ligand docking program (NLDock) on four well-established data sets revealed that PLANTS and LeDock outperformed NLDock in terms of binding pose prediction on all data sets, demonstrating their significant potential for NA-ligand docking. To the best of our knowledge, this study is the most comprehensive evaluation of popular molecular docking programs for NA-ligand systems.
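The top-1 and best-pose success rates quoted in this abstract can be computed as in the short Python sketch below. The 2.0 Å cutoff is the conventional choice in docking benchmarks but is an assumption here, since the abstract does not state the threshold. The function name and the input layout (one ranked RMSD list per complex) are hypothetical.

```python
# Illustrative sketch; the cutoff value and data layout are assumptions.
def success_rates(rmsd_by_complex, cutoff=2.0):
    """Return (top-1 rate, best-pose rate) over all complexes.

    rmsd_by_complex maps a complex ID to the RMSDs (in angstroms) of its
    docked poses, ordered by the docking program's own ranking.
    """
    total = len(rmsd_by_complex)
    # Top-1: the highest-ranked pose is within the cutoff.
    top1 = sum(1 for poses in rmsd_by_complex.values() if poses[0] <= cutoff)
    # Best: any generated pose is within the cutoff.
    best = sum(1 for poses in rmsd_by_complex.values() if min(poses) <= cutoff)
    return top1 / total, best / total


# Made-up RMSD values for four hypothetical complexes.
example = {
    "complex_a": [1.5, 3.0],
    "complex_b": [4.0, 1.0],
    "complex_c": [5.0, 6.0],
    "complex_d": [0.5, 0.7],
}
print(success_rates(example))  # prints (0.5, 0.75)
```

The gap between the two rates indicates how often a program generates a near-native pose but fails to rank it first, which is the scoring problem the abstract points to.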
Affiliation(s)
- Dejun Jiang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
- Huifeng Zhao
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
- Hongyan Du
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yafeng Deng
  - Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
- Zhenxing Wu
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jike Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yundian Zeng
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Haotian Zhang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Xiaorui Wang
  - China State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Macau 999078, China
- Jian Wu
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310006, Zhejiang, China
- Chang-Yu Hsieh
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
|
15
|
Mohanty M, Mohanty PS. Molecular docking in organic, inorganic, and hybrid systems: a tutorial review. Monatshefte für Chemie 2023; 154:1-25. [PMID: 37361694 PMCID: PMC10243279 DOI: 10.1007/s00706-023-03076-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/27/2022] [Accepted: 05/08/2023] [Indexed: 06/28/2023]
Abstract
Molecular docking simulation is a very popular and well-established computational approach and has been extensively used to understand molecular interactions between a natural organic molecule (ideally taken as a receptor) such as an enzyme, protein, DNA, or RNA and a natural or synthetic organic/inorganic molecule (considered as a ligand). However, the application of docking to synthetic organic, inorganic, or hybrid systems as receptors is very limited, despite their huge popularity in different experimental systems. In this context, molecular docking can be an efficient computational tool for understanding the role of intermolecular interactions in hybrid systems, which can help in designing materials on the mesoscale for different applications. The current review focuses on the implementation of the docking method in organic, inorganic, and hybrid systems along with examples from different case studies. We describe different resources, including databases and tools required in the docking study and applications. The concept of docking techniques, types of docking models, and the role of different intermolecular interactions involved in the docking process to understand the binding mechanisms are explained. Finally, the challenges and limitations of docking are also discussed in this review.
Affiliation(s)
- Madhuchhanda Mohanty
  - School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
- Priti S. Mohanty
  - School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
  - School of Chemical Technology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
|
16
|
Harini K, Kihara D, Michael Gromiha M. PDA-Pred: Predicting the binding affinity of protein-DNA complexes using machine learning techniques and structural features. Methods 2023; 213:10-17. [PMID: 36924867 PMCID: PMC10563387 DOI: 10.1016/j.ymeth.2023.03.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/10/2022] [Revised: 02/17/2023] [Accepted: 03/11/2023] [Indexed: 03/17/2023] Open
Abstract
Protein-DNA interactions play an important role in various biological processes such as gene expression, replication, and transcription. Understanding the important features that dictate the binding affinity of protein-DNA complexes and predicting their affinities is important for elucidating their recognition mechanisms. In this work, we have collected the experimental binding free energy (ΔG) for a set of 391 Protein-DNA complexes and derived several structure-based features such as interaction energy, contact potentials, volume and surface area of binding site residues, base step parameters of the DNA and contacts between different types of atoms. Our analysis on relationship between binding affinity and structural features revealed that the important factors mainly depend on the number of DNA strands as well as functional and structural classes of proteins. Specifically, binding site properties such as number of atom contacts between the DNA and protein, volume of protein binding sites and interaction-based features such as interaction energies and contact potentials are important to understand the binding affinity. Further, we developed multiple regression equations for predicting the binding affinity of protein-DNA complexes belonging to different structural and functional classes. Our method showed an average correlation and mean absolute error of 0.78 and 0.98 kcal/mol, respectively, between the experimental and predicted binding affinities on a jack-knife test. We have developed a webserver, PDA-PreD (Protein-DNA Binding affinity predictor), for predicting the affinity of protein-DNA complexes and it is freely available at https://web.iitm.ac.in/bioinfo2/pdapred/.
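The jack-knife (leave-one-out) evaluation described in this abstract can be sketched in plain Python. This is a simplified illustration, not the PDA-Pred model: it fits a one-feature least-squares line instead of the multiple regression equations used by the authors, and the function names and data are hypothetical. Each complex is predicted by a model fitted on all the others, and the predictions are then compared with the experimental values through the Pearson correlation and the mean absolute error.

```python
# Illustrative sketch; one-feature regression and made-up data are assumptions.
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_predictions(xs, ys):
    """Predict each point from a model trained on all the other points."""
    predictions = []
    for k in range(len(xs)):
        slope, intercept = fit_line(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])
        predictions.append(slope * xs[k] + intercept)
    return predictions


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_absolute_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical feature values and binding free energies (kcal/mol).
feature = [10.0, 14.0, 9.0, 20.0, 16.0, 12.0]
delta_g = [-8.1, -9.4, -7.6, -11.9, -10.2, -9.0]
predicted = jackknife_predictions(feature, delta_g)
print(pearson(delta_g, predicted), mean_absolute_error(delta_g, predicted))
```

Holding out each complex in turn gives an error estimate that is not inflated by fitting and testing on the same data, which matters for a set of only a few hundred complexes.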
Affiliation(s)
- K Harini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, United States; Department of Computer Science, Purdue University, West Lafayette, IN, United States
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama 226-8501, Japan.
|
17
|
Ortega AD. Real-Time Assessment of Intracellular Metabolites in Single Cells through RNA-Based Sensors. Biomolecules 2023; 13:biom13050765. [PMID: 37238635 DOI: 10.3390/biom13050765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 04/24/2023] [Accepted: 04/26/2023] [Indexed: 05/28/2023] Open
Abstract
Quantification of the concentration of particular cellular metabolites reports on the actual utilization of metabolic pathways in physiological and pathological conditions. Metabolite concentration also constitutes the readout for screening cell factories in metabolic engineering. However, there are no direct approaches that allow for real-time assessment of the levels of intracellular metabolites in single cells. In recent years, the modular architecture of natural bacterial RNA riboswitches has inspired the design of genetically encoded synthetic RNA devices that convert the intracellular concentration of a metabolite into a quantitative fluorescent signal. These so-called RNA-based sensors are composed of a metabolite-binding RNA aptamer as the sensor domain, connected through an actuator segment to a signal-generating reporter domain. However, at present, the variety of available RNA-based sensors for intracellular metabolites is still very limited. Here, we go through natural mechanisms for metabolite sensing and regulation in cells across all kingdoms, focusing on those mediated by riboswitches. We review the design principles underlying currently developed RNA-based sensors and discuss the challenges that hindered the development of novel sensors and recent strategies to address them. We finish by introducing the current and potential applicability of synthetic RNA-based sensors for intracellular metabolites.
Affiliation(s)
- Alvaro Darío Ortega
  - Department of Cell Biology, Faculty of Biological Sciences, Complutense University of Madrid, 28040 Madrid, Spain
|
18
|
Sabei A, Caldas Baia TG, Saffar R, Martin J, Frezza E. Internal Normal Mode Analysis Applied to RNA Flexibility and Conformational Changes. J Chem Inf Model 2023; 63:2554-2572. [PMID: 36972178 DOI: 10.1021/acs.jcim.2c01509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
We investigated the capability of internal normal modes to reproduce RNA flexibility and predict observed RNA conformational changes and, notably, those induced by the formation of RNA-protein and RNA-ligand complexes. Here, we extended our iNMA approach developed for proteins to study RNA molecules using a simplified representation of the RNA structure and its potential energy. Three data sets were also created to investigate different aspects. Despite all the approximations, our study shows that iNMA is a suitable method to take into account RNA flexibility and describe its conformational changes opening the route to its applicability in any integrative approach where these properties are crucial.
|
19
|
Ashok Kumar T. PDBms-An online tool for PDB file splitting and interactive molecular visualization. Interdiscip Sci 2023; 15:146-153. [PMID: 36180812 DOI: 10.1007/s12539-022-00539-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 09/15/2022] [Accepted: 09/20/2022] [Indexed: 11/27/2022]
Abstract
The rapid growth of biological databases has driven the development of many state-of-the-art molecular analysis tools for accurate disease diagnosis and drug discovery. The Protein Data Bank (PDB) is a leading molecular database consisting of three-dimensional (3D) experimental structures of macromolecules and small molecules. The most significant roles of the PDB in bioinformatics include molecular modelling and computer-aided drug design (CADD). PDBms is a web tool for splitting PDB files and interactively visualizing molecules. It parses the coordinate section records in a PDB file and categorizes them into groups of molecules. Moreover, it supports 3D graphic visualization in various models for polymers and for target-ligand interactions of complex structures. The web interface of PDBms is designed using the NGL Viewer/WebGL JavaScript package and programming languages such as PHP, HTML, CSS, JavaScript, AJAX, and jQuery. PDBms is freely accessible at https://www.biogem.org/tool/pdbms/ .
Affiliation(s)
- T Ashok Kumar
  - Department of Plant Biotechnology, Kerala Agricultural University, Vellayani, 695522, India.
|
20
|
Esmaeeli R, Bauzá A, Perez A. Structural predictions of protein-DNA binding: MELD-DNA. Nucleic Acids Res 2023; 51:1625-1636. [PMID: 36727436 PMCID: PMC9976882 DOI: 10.1093/nar/gkad013] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/11/2022] [Revised: 12/27/2022] [Accepted: 01/30/2023] [Indexed: 02/03/2023] Open
Abstract
Structural, regulatory and enzymatic proteins interact with DNA to maintain a healthy and functional genome. Yet, our structural understanding of how proteins interact with DNA is limited. We present MELD-DNA, a novel computational approach to predict the structures of protein-DNA complexes. The method combines molecular dynamics simulations with general knowledge or experimental information through Bayesian inference. The physical model is sensitive to sequence-dependent properties and conformational changes required for binding, while information accelerates sampling of bound conformations. MELD-DNA can: (i) sample multiple binding modes; (ii) identify the preferred binding mode from the ensembles; and (iii) provide qualitative binding preferences between DNA sequences. We first assess performance on a dataset of 15 protein-DNA complexes and compare it with state-of-the-art methodologies. Furthermore, for three selected complexes, we show sequence dependence effects of binding in MELD predictions. We expect that the results presented herein, together with the freely available software, will impact structural biology (by complementing DNA structural databases) and molecular recognition (by bringing new insights into aspects governing protein-DNA interactions).
Affiliation(s)
- Reza Esmaeeli
  - Department of Chemistry, Quantum theory project, University of Florida, Gainesville, FL 32611, USA
- Antonio Bauzá
  - Department of Chemistry, Universitat de les Illes Balears, Palma de Mallorca (Baleares), 07122, Spain
- Alberto Perez
  - Department of Chemistry, Quantum theory project, University of Florida, Gainesville, FL 32611, USA
|
21
|
Zeng C, Jian Y, Vosoughi S, Zeng C, Zhao Y. Evaluating native-like structures of RNA-protein complexes through the deep learning method. Nat Commun 2023; 14:1060. [PMID: 36828844 PMCID: PMC9958188 DOI: 10.1038/s41467-023-36720-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/25/2022] [Accepted: 02/14/2023] [Indexed: 02/26/2023] Open
Abstract
RNA-protein complexes underlie numerous cellular processes, including basic translation and gene regulation. The high-resolution structure determination of the RNA-protein complexes is essential for elucidating their functions. Therefore, computational methods capable of identifying the native-like RNA-protein structures are needed. To address this challenge, we thus develop DRPScore, a deep-learning-based approach for identifying native-like RNA-protein structures. DRPScore is tested on representative sets of RNA-protein complexes with various degrees of binding-induced conformation change ranging from fully rigid docking (bound-bound) to fully flexible docking (unbound-unbound). Out of the top 20 predictions, DRPScore selects native-like structures with a success rate of 91.67% on the testing set of bound RNA-protein complexes and 56.14% on the unbound complexes. DRPScore consistently outperforms existing methods with a roughly 10.53-15.79% improvement, even for the most difficult unbound cases. Furthermore, DRPScore significantly improves the accuracy of the native interface interaction predictions. DRPScore should be broadly useful for modeling and designing RNA-protein complexes.
Affiliation(s)
- Chengwei Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
- Yiren Jian
  - Department of Computer Science, Dartmouth College, Hanover, NH, 03755, USA
- Soroush Vosoughi
  - Department of Computer Science, Dartmouth College, Hanover, NH, 03755, USA
- Chen Zeng
  - Department of Physics, The George Washington University, Washington, DC, 20052, USA
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China.
|
22
|
Sato R, Suzuki K, Yasuda Y, Suenaga A, Fukui K. RNAapt3D: RNA aptamer 3D-structural modeling database. Biophys J 2022; 121:4770-4776. [PMID: 36146935 PMCID: PMC9808543 DOI: 10.1016/j.bpj.2022.09.023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/28/2022] [Revised: 08/17/2022] [Accepted: 09/20/2022] [Indexed: 01/07/2023] Open
Abstract
RNA aptamers are oligonucleotides with high binding affinity and specificity for target molecules and are expected to be a new generation of therapeutic molecules and targeted delivery materials. The tertiary structure of RNA molecules and RNA-protein interaction sites are increasingly important as potential targets for new drugs. The pathological mechanisms of diseases must be understood in detail to guide drug design. In developing RNA aptamers as drugs, information about the interaction mechanisms and structures of RNA aptamer-target protein complexes are useful. We constructed a database, RNA aptamer 3D-structural modeling (RNAapt3D), consisting of RNA aptamer data that are potential drug candidates. The database includes RNA sequences and computationally predicted RNA tertiary structures based on secondary structures and implements methods that can be used to predict unknown structures of RNA aptamer-target molecule complexes. RNAapt3D should enable the design of RNA aptamers for target molecules and improve the efficiency and productivity of candidate drug selection. RNAapt3D can be accessed at https://rnaapt3d.medals.jp.
Affiliation(s)
- Ryuma Sato
  - Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
- Koji Suzuki
  - Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
- Yuichi Yasuda
  - College of Humanities and Science, Department of Biosciences, Nihon University, Tokyo, Japan
- Atsushi Suenaga
  - College of Humanities and Science, Department of Biosciences, Nihon University, Tokyo, Japan
- Kazuhiko Fukui
  - Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan.
|
23
|
Sultanov D, Hochwagen A. Varying strength of selection contributes to the intragenomic diversity of rRNA genes. Nat Commun 2022; 13:7245. [PMID: 36434003 PMCID: PMC9700816 DOI: 10.1038/s41467-022-34989-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/09/2022] [Accepted: 11/14/2022] [Indexed: 11/27/2022] Open
Abstract
Ribosome biogenesis in eukaryotes is supported by hundreds of ribosomal RNA (rRNA) gene copies that are encoded in the ribosomal DNA (rDNA). The multiple copies of rRNA genes are thought to have low sequence diversity within one species. Here, we present species-wide rDNA sequence analysis in Saccharomyces cerevisiae that challenges this view. We show that rDNA copies in this yeast are heterogeneous, both among and within isolates, and that many variants avoided fixation or elimination over evolutionary time. The sequence diversity landscape across the rDNA shows clear functional stratification, suggesting different copy-number thresholds for selection that contribute to rDNA diversity. Notably, nucleotide variants in the most conserved rDNA regions are sufficiently deleterious to exhibit signatures of purifying selection even when present in only a small fraction of rRNA gene copies. Our results portray a complex evolutionary landscape that shapes rDNA sequence diversity within a single species and reveal unexpectedly strong purifying selection of multi-copy genes.
Affiliation(s)
- Daniel Sultanov
  - Department of Biology, New York University, New York, NY 10003 USA
- Andreas Hochwagen
  - Department of Biology, New York University, New York, NY 10003 USA
|
24
|
Bheemireddy S, Sandhya S, Srinivasan N, Sowdhamini R. Computational tools to study RNA-protein complexes. Front Mol Biosci 2022; 9:954926. [PMID: 36275618 PMCID: PMC9585174 DOI: 10.3389/fmolb.2022.954926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Accepted: 09/20/2022] [Indexed: 11/19/2022] Open
Abstract
RNA is a key player in many cellular processes such as signal transduction, replication, transport, cell division, transcription, and translation. These diverse functions are accomplished through interactions of RNA with proteins. However, protein–RNA interactions are still poorly understood in contrast to protein–protein and protein–DNA interactions. This knowledge gap can be attributed to the limited availability of protein-RNA structures along with the experimental difficulties in studying these complexes. Recent progress in computational resources has expanded the number of tools available for studying protein-RNA interactions at various molecular levels. These include tools for predicting interacting residues from primary sequences, modelling protein-RNA complexes, predicting hotspots in these complexes, and understanding the dynamics of their interactions. Each of these tools has its strengths and limitations, which makes it important to select an optimal approach for the question of interest. Here we present a mini review of computational tools to study different aspects of protein-RNA interactions, with a focus on overall application, development of the field, and future perspectives.
Affiliation(s)
- Sneha Bheemireddy
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Sankaran Sandhya
  - Department of Biotechnology, Faculty of Life and Allied Health Sciences, M.S. Ramaiah University of Applied Sciences, Bengaluru, India
  - *Correspondence: Sankaran Sandhya; Ramanathan Sowdhamini
- Ramanathan Sowdhamini
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
  - National Centre for Biological Sciences, TIFR, GKVK Campus, Bangalore, India
  - Institute of Bioinformatics and Applied Biotechnology, Bangalore, India
  - *Correspondence: Sankaran Sandhya; Ramanathan Sowdhamini
|
25
|
Pepe G, Appierdo R, Carrino C, Ballesio F, Helmer-Citterich M, Gherardini PF. Artificial intelligence methods enhance the discovery of RNA interactions. Front Mol Biosci 2022; 9:1000205. [PMID: 36275611 PMCID: PMC9585310 DOI: 10.3389/fmolb.2022.1000205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
Understanding how RNAs interact with proteins, RNAs, or other molecules remains a challenge of main interest in biology, given the importance of these complexes in both normal and pathological cellular processes. Since experimental datasets are starting to be available for hundreds of functional interactions between RNAs and other biomolecules, several machine learning and deep learning algorithms have been proposed for predicting RNA-RNA or RNA-protein interactions. However, most of these approaches were evaluated on a single dataset, making performance comparisons difficult. With this review, we aim to summarize recent computational methods, developed in this broad research area, highlighting feature encoding and machine learning strategies adopted. Given the magnitude of the effect that dataset size and quality have on performance, we explored the characteristics of these datasets. Additionally, we discuss multiple approaches to generate datasets of negative examples for training. Finally, we describe the best-performing methods to predict interactions between proteins and specific classes of RNA molecules, such as circular RNAs (circRNAs) and long non-coding RNAs (lncRNAs), and methods to predict RNA-RNA or RNA-RBP interactions independently of the RNA type.
Affiliation(s)
- G Pepe
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
  - *Correspondence: G Pepe; M Helmer-Citterich
- R Appierdo
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- C Carrino
  - PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- F Ballesio
  - PhD Program in Cellular and Molecular Biology, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- M Helmer-Citterich
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
  - *Correspondence: G Pepe; M Helmer-Citterich
- PF Gherardini
  - Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
|
26
|
Novel Design of RNA Aptamers as Cancer Inhibitors and Diagnosis Targeting the Tyrosine Kinase Domain of the NT-3 Growth Factor Receptor Using a Computational Sequence-Based Approach. Molecules 2022; 27:molecules27144518. [PMID: 35889390 PMCID: PMC9320020 DOI: 10.3390/molecules27144518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 07/08/2022] [Accepted: 07/09/2022] [Indexed: 12/10/2022] Open
Abstract
Aptamers, the nucleic acid analogs of antibodies, bind to their target molecules with remarkable specificity and sensitivity, making them promising diagnostic and therapeutic tools. The systematic evolution of ligands by exponential enrichment (SELEX) is time-consuming and expensive. However, regardless of those issues, it is the most used in vitro method for selecting aptamers. Therefore, recent studies have used computational approaches to reduce the time and cost associated with the synthesis and selection of aptamers. In an effort to present the potential of computational techniques in aptamer selection, a simple sequence-based method was used to design a 69-nucleotide long aptamer (mod_09) with a relatively stable structure (with a minimum free energy of −32.2 kcal/mol) and investigate its binding properties to the tyrosine kinase domain of the NT-3 growth factor receptor, for the first time, by employing computational modeling and docking tools.
|
27
|
Yang R, Liu H, Yang L, Zhou T, Li X, Zhao Y. RPpocket: An RNA–Protein Intuitive Database with RNA Pocket Topology Resources. Int J Mol Sci 2022; 23:ijms23136903. [PMID: 35805909 PMCID: PMC9266927 DOI: 10.3390/ijms23136903] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/24/2022] [Revised: 06/13/2022] [Accepted: 06/20/2022] [Indexed: 02/04/2023] Open
Abstract
RNA–protein complexes regulate a variety of biological functions. Thus, it is essential to explore and visualize RNA–protein structural interaction features, especially pocket interactions. In this work, we develop an easy-to-use bioinformatics resource: RPpocket. This database provides RNA–protein complex interactions based on sequence, secondary structure, and pocket topology analysis. We extracted 793 pockets from 74 non-redundant RNA–protein structures. Then, we calculated the binding- and non-binding pocket topological properties and analyzed the binding mechanism of the RNA–protein complex. The results showed that the binding pockets were more extended than the non-binding pockets. We also found that long-range forces were the main interaction for RNA–protein recognition, while short-range forces strengthened and optimized the binding. RPpocket could facilitate RNA–protein engineering for biological or medical applications.
|
28
|
fingeRNAt—A novel tool for high-throughput analysis of nucleic acid-ligand interactions. PLoS Comput Biol 2022; 18:e1009783. [PMID: 35653385 PMCID: PMC9197077 DOI: 10.1371/journal.pcbi.1009783] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/23/2021] [Revised: 06/14/2022] [Accepted: 05/06/2022] [Indexed: 11/19/2022] Open
Abstract
Computational methods play a pivotal role in drug discovery and are widely applied in virtual screening, structure optimization, and compound activity profiling. Over the last decades, almost all the attention in medicinal chemistry has been directed to protein-ligand binding, and computational tools have been created with this target in mind. With novel discoveries of functional RNAs and their possible applications, RNAs have gained considerable attention as potential drug targets. However, the availability of bioinformatics tools for nucleic acids is limited. Here, we introduce fingeRNAt—a software tool for detecting non-covalent interactions formed in complexes of nucleic acids with ligands. The program detects nine types of interactions: (i) hydrogen and (ii) halogen bonds, (iii) cation-anion, (iv) pi-cation, (v) pi-anion, (vi) pi-stacking, (vii) inorganic ion-mediated, (viii) water-mediated, and (ix) lipophilic interactions. However, the scope of detected interactions can be easily expanded using a simple plugin system. In addition, detected interactions can be visualized using the associated PyMOL plugin, which facilitates the analysis of medium-throughput molecular complexes. Interactions are also encoded and stored as a bioinformatics-friendly Structural Interaction Fingerprint (SIFt)—a binary string where the respective bit in the fingerprint is set to 1 if a particular interaction is present and to 0 otherwise. This output format, in turn, enables high-throughput analysis of interaction data using data analysis techniques. We present applications of fingeRNAt-generated interaction fingerprints for visual and computational analysis of RNA-ligand complexes, including analysis of interactions formed in experimentally determined RNA-small molecule ligand complexes deposited in the Protein Data Bank. We propose interaction fingerprint-based similarity as an alternative measure to RMSD to recapitulate complexes with similar interactions but different folding. 
We present an application of interaction fingerprints for the clustering of molecular complexes. This approach can be used to group ligands that form similar binding networks and thus have similar biological properties. The fingeRNAt software is freely available at https://github.com/n-szulc/fingeRNAt.
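As a rough illustration of how binary interaction fingerprints of the kind described above could be compared, the following minimal sketch computes a Tanimoto (Jaccard) coefficient between two bit strings. The choice of Tanimoto as the similarity measure, the function name `tanimoto`, and the example bit strings are assumptions made for illustration; they are not taken from the fingeRNAt software or its output format.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto (Jaccard) similarity of two equal-length binary fingerprint strings."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    # Positions where each fingerprint has an interaction bit set to 1.
    on_a = {i for i, bit in enumerate(fp_a) if bit == "1"}
    on_b = {i for i, bit in enumerate(fp_b) if bit == "1"}
    union = on_a | on_b
    if not union:
        # Two all-zero fingerprints: treated as identical here (a convention, not a rule).
        return 1.0
    return len(on_a & on_b) / len(union)


# Hypothetical fingerprints for two complexes (one bit per residue/interaction pair).
fp_complex_1 = "110010010"
fp_complex_2 = "110000011"
print(tanimoto(fp_complex_1, fp_complex_2))
```

A pairwise matrix of such scores could then feed a standard clustering routine, which is one plausible way to group complexes with similar interaction patterns regardless of their RMSD.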
|
29
|
Möller L, Guerci L, Isert C, Atz K, Schneider G. Translating from proteins to ribonucleic acids for ligand-binding site detection. Mol Inform 2022; 41:e2200059. [PMID: 35577762 DOI: 10.1002/minf.202200059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Accepted: 05/16/2022] [Indexed: 11/10/2022]
Abstract
Identifying druggable ligand-binding sites on the surface of the macromolecular targets is an important process in structure-based drug discovery. Deep-learning models have been shown to successfully predict ligand-binding sites of proteins. As a step toward predicting binding sites in RNA and RNA-protein complexes, we employ three-dimensional convolutional neural networks. We introduce a dataset splitting approach to minimize structure-related bias in training data, and investigate the influence of protein-based neural network pre-training before fine-tuning on RNA structures. Models that were pre-trained on proteins considerably outperformed the models that were trained exclusively on RNA structures. Overall, 71% of the known RNA binding sites were correctly located within 4 Å of their true centres with a structural overlap of at least 25%.
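The distance part of the success criterion quoted above (a predicted site counts as located if its centre lies within 4 Å of the true centre) can be sketched as follows. This is only an illustrative check under stated assumptions: the function name `is_centre_hit`, the inclusive cutoff, and the coordinates are hypothetical, and the 25% structural-overlap condition is omitted because the abstract does not define how overlap is measured.

```python
import math


def is_centre_hit(predicted_centre, true_centre, cutoff_angstrom=4.0):
    """Return True if the predicted site centre lies within the cutoff of the true centre."""
    # math.dist computes the Euclidean distance between two points (Python 3.8+).
    return math.dist(predicted_centre, true_centre) <= cutoff_angstrom


# Hypothetical coordinates in angstroms.
true_centre = (12.0, 5.5, -3.0)
predicted_centre = (13.0, 7.5, -1.0)  # offset (1, 2, 2), i.e. 3.0 angstroms away
print(is_centre_hit(predicted_centre, true_centre))
```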
|
30
|
Zhou Y, Jiang Y, Chen SJ. RNA-ligand molecular docking: advances and challenges. Wiley Interdiscip Rev Comput Mol Sci 2022; 12:e1571. [PMID: 37293430 PMCID: PMC10250017 DOI: 10.1002/wcms.1571] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/26/2021] [Accepted: 07/20/2021] [Indexed: 12/16/2022]
Abstract
With rapid advances in computer algorithms and hardware, fast and accurate virtual screening has led to a drastic acceleration in selecting potent small molecules as drug candidates. Computational modeling of RNA-small molecule interactions has become an indispensable tool for RNA-targeted drug discovery. The current models for RNA-ligand binding have mainly focused on the docking-and-scoring method. Accurate docking and scoring should tackle four crucial problems: (1) conformational flexibility of ligand, (2) conformational flexibility of RNA, (3) efficient sampling of binding sites and binding poses, and (4) accurate scoring of different binding modes. Moreover, compared with the problem of protein-ligand docking, predicting ligand binding to RNA, a negatively charged polymer, is further complicated by additional effects such as metal ion effects. Thermodynamic models based on physics-based and knowledge-based scoring functions have shown highly encouraging success in predicting ligand binding poses and binding affinities. Recently, kinetic models for ligand binding have further suggested that including dissociation kinetics (residence time) in ligand docking would result in improved performance in estimating in vivo drug efficacy. More recently, the rise of deep-learning approaches has led to new tools for predicting RNA-small molecule binding. In this review, we present an overview of the recently developed computational methods for RNA-ligand docking and their advantages and disadvantages.
Affiliation(s)
- Yuanzhe Zhou
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
| | - Yangwei Jiang
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
| | - Shi-Jie Chen
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
|
31
|
Abstract
Recent events have pushed RNA research into the spotlight. Continued discoveries of RNA with unexpected diverse functions in healthy and diseased cells, such as the role of RNA as both the source and countermeasure to a severe acute respiratory syndrome coronavirus 2 infection, are igniting a new passion for understanding this functionally and structurally versatile molecule. Although RNA structure is key to function, many foundational characteristics of RNA structure are misunderstood, and the default state of RNA is often thought of and depicted as a single floppy strand. The purpose of this perspective is to help adjust mental models, equipping the community to better use the fundamental aspects of RNA structural information in new mechanistic models, enhance experimental design to test these models, and refine data interpretation. We discuss six core observations focused on the inherent nature of RNA structure and how to incorporate these characteristics to better understand RNA structure. We also offer some ideas for future efforts to make validated RNA structural information available and readily used by all researchers.
Affiliation(s)
- Quentin Vicens
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, School of Medicine, Aurora, CO 80045
- RNA BioScience Initiative, University of Colorado Denver School of Medicine, Aurora, CO 80045
| | - Jeffrey S. Kieft
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, School of Medicine, Aurora, CO 80045
- RNA BioScience Initiative, University of Colorado Denver School of Medicine, Aurora, CO 80045
|
32
|
Li P, Liu ZP. PST-PRNA: prediction of RNA-binding sites using protein surface topography and deep learning. Bioinformatics 2022; 38:2162-2168. [PMID: 35150250 DOI: 10.1093/bioinformatics/btac078] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/05/2021] [Revised: 01/20/2022] [Accepted: 02/05/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Protein-RNA interactions play essential roles in many biological processes, including pre-mRNA processing, post-transcriptional gene regulation and RNA degradation. Accurate identification of binding sites on RNA-binding proteins (RBPs) is important for functional annotation and site-directed mutagenesis. Experimental assays of RBPs are precise and convincing but also costly and time consuming. Therefore, flexible and reliable computational methods are required to recognize RNA-binding residues. RESULTS: In this work, we propose PST-PRNA, a novel model for predicting RNA-binding sites (PRNA) based on protein surface topography (PST). Taking full advantage of the 3D structural information of the protein, PST-PRNA creates representative topography images of the entire protein surface by mapping it onto a unit spherical surface. Four kinds of descriptors are encoded to represent residues on the surface. Then, the potential features are integrated and optimized by using deep learning models. We compile a comprehensive non-redundant RBP dataset to train and test PST-PRNA using 10-fold cross-validation. Numerous experiments demonstrate that PST-PRNA successfully learns the latent structural information of the protein surface. On the non-redundant dataset with a sequence identity of 0.3, PST-PRNA achieves an area under the receiver operating characteristic curve (AUC) of 0.860 and a Matthews correlation coefficient of 0.420. Furthermore, we construct a completely independent test dataset for justification and comparison. PST-PRNA achieves an AUC of 0.913 on the independent dataset, which is superior to other state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: The code and data are available at https://www.github.com/zpliulab/PST-PRNA. A web server is freely available at http://www.zpliulab.cn/PSTPRNA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
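For readers unfamiliar with the Matthews correlation coefficient reported above, the sketch below shows how it is computed from a binary confusion matrix using the standard formula. The function name `matthews_corrcoef_from_counts` and the residue counts are hypothetical and are not taken from the PST-PRNA evaluation.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return numerator / denominator


# Hypothetical per-residue counts for an imbalanced binding-site prediction task.
print(matthews_corrcoef_from_counts(tp=40, tn=900, fp=30, fn=30))
```

Because binding residues are usually a small minority, this coefficient is often more informative than plain accuracy, which is presumably why it is reported alongside AUC.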
Affiliation(s)
- Pengpai Li
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
| | - Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
|
33
|
Developing Community Resources for Nucleic Acid Structures. Life (Basel) 2022; 12:life12040540. [PMID: 35455031 PMCID: PMC9031032 DOI: 10.3390/life12040540] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/14/2022] [Revised: 03/28/2022] [Accepted: 03/31/2022] [Indexed: 01/14/2023] Open
Abstract
In this review, we describe the creation of the Nucleic Acid Database (NDB) at Rutgers University and how it became a testbed for the current infrastructure of the RCSB Protein Data Bank. We describe some of the special features of the NDB and how it has been used to enable research. Plans for the next phase as the Nucleic Acid Knowledgebase (NAKB) are summarized.
|
34
|
Molodenskiy DS, Svergun DI, Kikhney AG. Artificial neural networks for solution scattering data analysis. Structure 2022; 30:900-908.e2. [DOI: 10.1016/j.str.2022.03.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2021] [Revised: 01/24/2022] [Accepted: 03/16/2022] [Indexed: 11/27/2022]
|
35
|
Paithankar H, Tarang GS, Parvez F, Marathe A, Joshi M, Chugh J. Inherent conformational plasticity in dsRBDs enables interaction with topologically distinct RNAs. Biophys J 2022; 121:1038-1055. [PMID: 35134335 PMCID: PMC8943759 DOI: 10.1016/j.bpj.2022.02.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/14/2021] [Revised: 12/25/2021] [Accepted: 02/03/2022] [Indexed: 11/02/2022] Open
Abstract
Many double-stranded RNA-binding domains (dsRBDs) interact with topologically distinct dsRNAs in biological pathways pivotal to viral replication, cancer causation, neurodegeneration, and so on. We hypothesized that the adaptability of dsRBDs is essential to target different dsRNA substrates. A model dsRBD and a few dsRNAs, slightly different in shape from each other, were used to test the systematic dependence of dsRBD binding on RNA shape using nuclear magnetic resonance (NMR) spectroscopy and molecular modeling. NMR-based titrations showed a distinct binding pattern for the dsRBD with the topologically distinct dsRNAs. The line broadening upon RNA binding was observed to cluster in residues lying in close proximity, thereby suggesting an RNA-induced conformational exchange in the dsRBD. Further, while the intrinsic microsecond dynamics observed in the apo-dsRBD were quenched upon binding with the dsRNA, microsecond dynamics were induced at residues spatially proximal to the quench sites. This apparent relay of conformational exchange suggests that intrinsic dynamics help the dsRBD adapt to target various dsRNA shapes. The conformational pool visualized in MD simulations for the apo-dsRBD reported here was also observed to sample the conformations seen previously for various dsRBDs in apo and dsRNA-bound structures, further suggesting the conformational adaptability of dsRBDs. These investigations provide a dynamic basis for the substrate promiscuity of dsRBD proteins.
Affiliation(s)
- Harshad Paithankar
- Department of Chemistry, Indian Institute of Science Education and Research (IISER), Pune, Maharashtra, India
| | - Guneet Singh Tarang
- Department of Biology, Indian Institute of Science Education and Research (IISER), Pune, Maharashtra, India
| | - Firdousi Parvez
- Department of Biology, Indian Institute of Science Education and Research (IISER), Pune, Maharashtra, India
| | - Aniket Marathe
- Bioinformatics Center, Savitribai Phule Pune University, Pune, Maharashtra, India
| | - Manali Joshi
- Bioinformatics Center, Savitribai Phule Pune University, Pune, Maharashtra, India
| | - Jeetender Chugh
- Department of Chemistry, Indian Institute of Science Education and Research (IISER), Pune, Maharashtra, India; Department of Biology, Indian Institute of Science Education and Research (IISER), Pune, Maharashtra, India.
|
36
|
Mu K, Zhu Z, Abula A, Peng C, Zhu W, Xu Z. Halogen Bonds Exist between Noncovalent Ligands and Natural Nucleic Acids. J Med Chem 2022; 65:4424-4435. [PMID: 35276046 DOI: 10.1021/acs.jmedchem.1c01854] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/23/2022]
Abstract
Because of their strong electron-rich properties, nucleic acids (NAs) can theoretically serve as halogen bond (XB) acceptors. From a PDB database survey, Kolář found that no XBs are formed between noncovalent ligands and NAs. Through statistical database analysis, quantum-mechanics/molecular-mechanics (QM/MM) optimizations, and energy calculations, we find that XBs formed between natural NAs and noncovalent ligands are primarily underestimated and that NAs can serve as XB acceptors to interact with noncovalent halogen ligands. Finally, through energy calculations, natural bond orbital analysis, and noncovalent interaction analysis, XBs are confirmed in 13 systems, among which two systems (445D and 4Q9Q) have relatively strong XBs. In addition, on the basis of energy scanning of four model systems, we explore the geometric rule for XB formation in NAs. This work will inspire researchers to utilize XBs in rational drug design targeting NAs.
Affiliation(s)
- Kaijie Mu
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China; Nano Science and Technology Institute, University of Science and Technology of China, Suzhou, Jiangsu 215123, China
| | - Zhengdan Zhu
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China; School of Pharmacy, University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, PR China
| | - Amina Abula
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
| | - Cheng Peng
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China; School of Pharmacy, University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, PR China
| | - Weiliang Zhu
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China; School of Pharmacy, University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, PR China
| | - Zhijian Xu
- CAS Key Laboratory of Receptor Research, Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China; School of Pharmacy, University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, PR China
|
37
|
Sarathi P, Padhi S. Insight of the various in silico screening techniques developed for assortment of cocrystal formers and their thermodynamic characterization. Drug Dev Ind Pharm 2022; 47:1523-1534. [PMID: 35164621 DOI: 10.1080/03639045.2022.2042554] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
Abstract
Most widely used drugs have problems with oral bioavailability, owing either to poor aqueous solubility or to poor permeability. Co-crystallization is an efficient and economically feasible approach that offers a great opportunity to improve the physicochemical properties of such therapeutic agents, including solubility, stability, and bioavailability. Selection of the best co-former plays a major role in co-crystallization, and various approaches have been developed for selecting co-formers suitable for a given active pharmaceutical ingredient (API). In recent years, in silico screening has received increasing attention as a computational tool for screening co-formers. Numerous approaches can be used for in silico screening, such as the Autodocking tool, COSMO-RS, and COSMOtherm. Autodocking can effectively screen a number of candidate co-formers in silico to identify one suitable for an API. Prediction of solubility and dissolution is also important for co-crystal development. In this review, we discuss in silico screening of co-formers and thermodynamic approaches to determining the dissolution and solubility of co-crystals, with special reference to drugs belonging to BCS class II.
Affiliation(s)
- Parth Sarathi
- Noida Institute of Engineering and Technology (Pharmacy Institute), Greater Noida, India
| | - Swarupanjali Padhi
- Noida Institute of Engineering and Technology (Pharmacy Institute), Greater Noida, India
|
38
|
Roy P, Bhattacharyya D. Contact networks in RNA: a structural bioinformatics study with a new tool. J Comput Aided Mol Des 2022; 36:131-140. [DOI: 10.1007/s10822-021-00438-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/12/2021] [Accepted: 12/01/2021] [Indexed: 10/19/2022]
|
39
|
Brovarets’ OO, Muradova A, Hovorun DM. Novel horizons of the conformationally-tautomeric transformations of the G·T base pairs: quantum-mechanical investigation. Mol Phys 2022. [DOI: 10.1080/00268976.2022.2026510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Ol’ha O. Brovarets’
- Department of Molecular and Quantum Biophysics, Institute of Molecular Biology and Genetics, National Academy of Sciences of Ukraine, Kyiv, Ukraine
| | - Alona Muradova
- Department of Molecular Biotechnology and Bioinformatics, Institute of High Technologies, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
| | - Dmytro M. Hovorun
- Department of Molecular and Quantum Biophysics, Institute of Molecular Biology and Genetics, National Academy of Sciences of Ukraine, Kyiv, Ukraine
- Department of Molecular Biotechnology and Bioinformatics, Institute of High Technologies, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
|
40
|
Marques-Pereira C, Pires M, Moreira IS. Discovery of Virus-Host interactions using bioinformatic tools. Methods Cell Biol 2022; 169:169-198. [DOI: 10.1016/bs.mcb.2022.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
41
|
杨 爽. Analysis of Residue Interface Preference in Protein-DNA Complexes and Its Application in Recognition of Binding Interface. Biophysics (Nagoya-shi) 2022. [DOI: 10.12677/biphy.2022.104006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] Open
|
42
|
Kozlovskii I, Popov P. Structure-based deep learning for binding site detection in nucleic acid macromolecules. NAR Genom Bioinform 2021; 3:lqab111. [PMID: 34859211 PMCID: PMC8633674 DOI: 10.1093/nargab/lqab111] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2021] [Revised: 10/14/2021] [Accepted: 11/09/2021] [Indexed: 12/30/2022] Open
Abstract
Structure-based drug design (SBDD) targeting nucleic acid macromolecules, particularly RNA, is a research direction that is gaining momentum and has already resulted in several FDA-approved compounds. As with proteins, one of the critical components of SBDD for RNA is the correct identification of binding sites for putative drug candidates. RNAs share a common structural organization that, together with the dynamic nature of these molecules, makes it challenging to recognize binding sites for small molecules. Moreover, there is a need for structure-based approaches, because sequence information alone does not capture the conformational plasticity of nucleic acid macromolecules. Deep learning holds great promise for solving the binding site detection problem, but it requires a large amount of structural data, which is very limited for nucleic acids compared with proteins. In this study, we composed a set of ∼2000 nucleic acid-small molecule structures comprising ∼2500 binding sites, which is ∼40 times larger than the previously used set, and demonstrated the first structure-based deep learning approach, BiteNetN, to detect binding sites in nucleic acid structures. BiteNetN operates on arbitrary nucleic acid complexes, shows state-of-the-art performance, and can be helpful in the analysis of different conformations and mutant variants, as we demonstrated in case studies of HIV-1 TAR RNA and an ATP aptamer.
Affiliation(s)
- Igor Kozlovskii
- iMolecule, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
| | - Petr Popov
- iMolecule, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
|
43
|
Feng Y, Yan Y, He J, Tao H, Wu Q, Huang SY. Docking and scoring for nucleic acid-ligand interactions: Principles and current status. Drug Discov Today 2021; 27:838-847. [PMID: 34718205 DOI: 10.1016/j.drudis.2021.10.013] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/07/2021] [Revised: 09/06/2021] [Accepted: 10/20/2021] [Indexed: 12/24/2022]
Abstract
Nucleic acid (NA)-ligand interactions have crucial roles in many cellular processes and, thus, are increasingly attracting therapeutic interest in drug discovery. Molecular docking is a valuable tool for studying molecular interactions. However, because NAs differ significantly from proteins in both their physical and chemical properties, traditional docking algorithms and scoring functions for protein-ligand interactions might not be applicable to NA-ligand docking. Therefore, various sampling strategies and scoring functions for NA-ligand interactions have been developed. Here, we review the basic principles and current status of docking algorithms and scoring functions for DNA/RNA-ligand interactions. We also discuss challenges and limitations of current docking and scoring approaches.
Affiliation(s)
- Yuyu Feng
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
| | - Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
| | - Jiahua He
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
| | - Huanyu Tao
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
| | - Qilong Wu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China.
|
44
|
Mias-Lucquin D, Chauvot de Beauchene I. Conformational variability in proteins bound to single-stranded DNA: A new benchmark for new docking perspectives. Proteins 2021; 90:625-631. [PMID: 34617336 PMCID: PMC9292434 DOI: 10.1002/prot.26258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2021] [Revised: 09/15/2021] [Accepted: 09/27/2021] [Indexed: 12/19/2022]
Abstract
We explored the Protein Data Bank (PDB) to collect protein-ssDNA structures and create a multi-conformational docking benchmark including both bound and unbound protein structures. Due to ssDNA high flexibility when not bound, no ssDNA unbound structure is included in the benchmark. For the 91 sequence-identity groups identified as bound-unbound structures of the same protein, we studied the conformational changes in the protein induced by the ssDNA binding. Moreover, based on several bound or unbound protein structures in some groups, we also assessed the intrinsic conformational variability in either bound or unbound conditions and compared it to the supposedly binding-induced modifications. To illustrate a use case of this benchmark, we performed docking experiments using ATTRACT docking software. This benchmark is, to our knowledge, the first one made to peruse available structures of ssDNA-protein interactions to such an extent, aiming to improve computational docking tools dedicated to this kind of molecular interactions.
|
45
|
Harini K, Srivastava A, Kulandaisamy A, Gromiha MM. ProNAB: database for binding affinities of protein-nucleic acid complexes and their mutants. Nucleic Acids Res 2021; 50:D1528-D1534. [PMID: 34606614 PMCID: PMC8728258 DOI: 10.1093/nar/gkab848] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/12/2021] [Revised: 09/08/2021] [Accepted: 09/10/2021] [Indexed: 11/16/2022] Open
Abstract
Protein–nucleic acid interactions are involved in various biological processes such as gene expression, replication, transcription, translation and packaging. The binding affinities of protein–DNA and protein–RNA complexes are important for elucidating the mechanism of protein–nucleic acid recognition. Although experimental data on binding affinity are reported abundantly in the literature, no well-curated database is currently available for protein–nucleic acid binding affinity. We have developed a database, ProNAB, which contains more than 20 000 experimental data for the binding affinities of protein–DNA and protein–RNA complexes. Each entry provides comprehensive information on sequence and structural features of a protein, nucleic acid and its complex, experimental conditions, thermodynamic parameters such as dissociation constant (Kd), binding free energy (ΔG) and change in binding free energy upon mutation (ΔΔG), and literature information. ProNAB is cross-linked with GenBank, UniProt, PDB, ProThermDB, PROSITE, DisProt and Pubmed. It provides a user-friendly web interface with options for search, display, sorting, visualization, download and upload the data. ProNAB is freely available at https://web.iitm.ac.in/bioinfo2/pronab/ and it has potential applications such as understanding the factors influencing the affinity, development of prediction tools, binding affinity change upon mutation and design complexes with the desired affinity.
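The thermodynamic quantities listed above are related by the standard expression ΔG = RT ln(Kd), with Kd taken relative to a 1 M standard state, and ΔΔG is the difference between mutant and wild-type ΔG. The sketch below applies that textbook relation; the temperature, the Kd values, the function names, and the sign convention for ΔΔG (mutant minus wild type) are assumptions for illustration and may differ from the conventions used in ProNAB entries.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant, assuming a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_wild_type: float, kd_mutant: float, temperature_k: float = 298.15) -> float:
    """Change in binding free energy upon mutation (mutant minus wild type), in kcal/mol."""
    return binding_free_energy(kd_mutant, temperature_k) - binding_free_energy(kd_wild_type, temperature_k)


# Hypothetical example: a mutation weakens binding from Kd = 1 nM to Kd = 10 nM.
print(binding_free_energy(1e-9))   # negative: favourable binding
print(delta_delta_g(1e-9, 1e-8))   # positive under this convention: binding is weakened
```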
Affiliation(s)
- Kannan Harini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
| | - Ambuj Srivastava
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
| | - Arulsamy Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
| | - M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
|
46
|
Largy E, König A, Ghosh A, Ghosh D, Benabou S, Rosu F, Gabelica V. Mass Spectrometry of Nucleic Acid Noncovalent Complexes. Chem Rev 2021; 122:7720-7839. [PMID: 34587741 DOI: 10.1021/acs.chemrev.1c00386] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 12/12/2022]
Abstract
Nucleic acids have been among the first targets for antitumor drugs and antibiotics. With the unveiling of new biological roles in regulation of gene expression, specific DNA and RNA structures have become very attractive targets, especially when the corresponding proteins are undruggable. Biophysical assays to assess target structure as well as ligand binding stoichiometry, affinity, specificity, and binding modes are part of the drug development process. Mass spectrometry offers unique advantages as a biophysical method owing to its ability to distinguish each stoichiometry present in a mixture. In addition, advanced mass spectrometry approaches (reactive probing, fragmentation techniques, ion mobility spectrometry, ion spectroscopy) provide more detailed information on the complexes. Here, we review the fundamentals of mass spectrometry and all its particularities when studying noncovalent nucleic acid structures, and then review what has been learned thanks to mass spectrometry on nucleic acid structures, self-assemblies (e.g., duplexes or G-quadruplexes), and their complexes with ligands.
Affiliation(s)
- Eric Largy
- Univ. Bordeaux, CNRS, INSERM, ARNA, UMR 5320, U1212, IECB, F-33600 Pessac, France
| | - Alexander König
- Univ. Bordeaux, CNRS, INSERM, ARNA, UMR 5320, U1212, IECB, F-33600 Pessac, France
| | - Anirban Ghosh
- Univ. Bordeaux, CNRS, INSERM, ARNA, UMR 5320, U1212, IECB, F-33600 Pessac, France
| | - Debasmita Ghosh
- Univ. Bordeaux, CNRS, INSERM, ARNA, UMR 5320, U1212, IECB, F-33600 Pessac, France
| | - Sanae Benabou
- Univ. Bordeaux, CNRS, INSERM, ARNA, UMR 5320, U1212, IECB, F-33600 Pessac, France
| | - Frédéric Rosu
- Univ. Bordeaux, CNRS, INSERM, IECB, UMS 3033, F-33600 Pessac, France
| | - Valérie Gabelica
- Univ. Bordeaux, CNRS, INSERM, ARNA, UMR 5320, U1212, IECB, F-33600 Pessac, France
| |
|
47
|
Cheng Y, Zhang S, Xu X, Chen SJ. Vfold2D-MC: A Physics-Based Hybrid Model for Predicting RNA Secondary Structure Folding. J Phys Chem B 2021; 125:10108-10118. [PMID: 34473508 DOI: 10.1021/acs.jpcb.1c04731] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Accurate prediction of RNA structure and folding stability has a far-reaching impact on our understanding of RNA functions. Here we develop Vfold2D-MC, a new physics-based model, to predict RNA structure and folding thermodynamics from the sequence. The model employs virtual bond-based coarse-graining of RNA backbone conformation and generates RNA conformations through Monte Carlo sampling of the bond angles and torsional angles of the virtual bonds. Using a coarse-grained statistical potential derived from the known structures, we assign each conformation with a statistical weight. The weighted average over the conformational ensemble gives the entropy and free energy parameters for the hairpin, bulge, and internal loops, and multiway junctions. From the thermodynamic parameters, we predict RNA structures, melting curves, and structural changes from the sequence. Theory-experiment comparisons indicate that Vfold2D-MC not only gives improved structure predictions but also enables the interpretation of thermodynamic results for different RNA structures, including multibranched junctions. This new model sets a promising framework to treat more complicated RNA structures, such as pseudoknotted and intramolecular kissing loops, for which experimental thermodynamic parameters are often unavailable.
Affiliation(s)
- Yi Cheng
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
| | - Sicheng Zhang
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
| | - Xiaojun Xu
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, Jiangsu 213001, China
| | - Shi-Jie Chen
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
| |
|
48
|
Abstract
Deciphering the contribution of DNA subunits to the variability of its 3D structure represents an important step toward the elucidation of DNA functions at the atomic level. In the pursuit of that goal, our previous studies revealed that the essential conformational characteristics of the most populated “canonic” BI and AI conformational families of Watson–Crick duplexes, including the sequence dependence of their 3D structure, preexist in the local energy minima of the elemental single-chain fragments, deoxydinucleoside monophosphates (dDMPs). Those computations have uncovered important sequence-dependent regularity in the superposition of neighbor bases. The present work expands our studies to new minimal fragments of DNA with Watson–Crick nucleoside pairs that differ from canonic families in the torsion angles of the sugar-phosphate backbone (SPB). To address this objective, computations have been performed on dDMPs, cdDMPs (complementary dDMPs), and minimal fragments of SPBs of respective systems by using methods of molecular and quantum mechanics. These computations reveal that the conformations of dDMPs and cdDMPs having torsion angles of SPB corresponding to the local energy minima of separate minimal units of SPB exhibit sequence-dependent characteristics representative of canonic families. In contrast, conformations of dDMP and cdDMP with SPB torsions being far from the local minima of separate SPB units exhibit more complex sequence dependence.
|
49
|
Gupta A, Kulkarni M, Mukherjee A. Accurate prediction of B-form/A-form DNA conformation propensity from primary sequence: A machine learning and free energy handshake. PATTERNS 2021; 2:100329. [PMID: 34553171 PMCID: PMC8441556 DOI: 10.1016/j.patter.2021.100329] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/01/2021] [Revised: 03/25/2021] [Accepted: 07/20/2021] [Indexed: 11/26/2022]
Abstract
DNA carries the genetic code of life, with different conformations associated with different biological functions. Predicting the conformation of DNA from its primary sequence, although desirable, is a challenging problem owing to the polymorphic nature of DNA. We have deployed a host of machine learning algorithms, including the popular state-of-the-art LightGBM (a gradient boosting model), for building prediction models. We used the nested cross-validation strategy to address the issues of “overfitting” and selection bias. This simultaneously provides an unbiased estimate of the generalization performance of a machine learning algorithm and allows us to tune the hyperparameters optimally. Furthermore, we built a secondary model based on SHAP (SHapley Additive exPlanations) that offers crucial insight into model interpretability. Our detailed model-building strategy and robust statistical validation protocols tackle the formidable challenge of working on small datasets, which is often the case in biological and medical data.
- A robust machine learning model to predict A- or B-DNA conformation
- Outcome of machine learning model is explained with free energy values
- Our approach works well under class imbalance and limited data constraints
The sequence in the genome of an organism encodes all the information of life. We combine a data-driven approach using machine learning (ML) and the results of free energy calculations to offer a fresh perspective on this long-standing problem of prediction of DNA conformation (A or B) from the sequence. We trained our ML model using sophisticated state-of-the-art algorithms such as LightGBM along with a nested cross-validation strategy to overcome the common problems associated with data bias and overfitting when constrained by limited data size. Our study will serve the broader interest of researchers who are not only seeking accurate and reliable predictive models but also want to understand the physical and chemical origins behind the predictions.
Affiliation(s)
- Abhijit Gupta
- Department of Chemistry, Indian Institute of Science Education and Research, Pune, Maharashtra 411008, India
| | - Mandar Kulkarni
- Division of Biophysical Chemistry, Lund University, Chemical Center, P.O.B. 124, 22100 Lund, Sweden
| | - Arnab Mukherjee
- Department of Chemistry, Indian Institute of Science Education and Research, Pune, Maharashtra 411008, India
| |
|
50
|
Li Y, Garcia G, Arumugaswami V, Guo F. Structure-based design of antisense oligonucleotides that inhibit SARS-CoV-2 replication. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2021:2021.08.23.457434. [PMID: 34462746 PMCID: PMC8404888 DOI: 10.1101/2021.08.23.457434] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/13/2023]
Abstract
Antisense oligonucleotides (ASOs) are an emerging class of drugs that target RNAs. Current ASO designs strictly follow the rule of Watson-Crick base pairing along target sequences. However, RNAs often fold into structures that interfere with ASO hybridization. Here we developed a structure-based ASO design method and applied it to target severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Our method makes sure that ASO binding is compatible with target structures in three-dimensional (3D) space by employing structural design templates. These 3D-ASOs recognize the shapes and hydrogen bonding patterns of targets via tertiary interactions, achieving enhanced affinity and specificity. We designed 3D-ASOs that bind to the frameshift stimulation element and transcription regulatory sequence of SARS-CoV-2 and identified lead ASOs that strongly inhibit viral replication in human cells. We further optimized the lead sequences and characterized structure-activity relationship. The 3D-ASO technology helps fight coronavirus disease-2019 and is broadly applicable to ASO drug development.
Affiliation(s)
- Yan Li
- Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, U.S.A
- Molecular Biology Interdepartmental Ph.D. Program, University of California, Los Angeles, CA 90095, U.S.A
| | - Gustavo Garcia
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, U.S.A
| | - Vaithilingaraja Arumugaswami
- Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, U.S.A
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, CA 90095, U.S.A
| | - Feng Guo
- Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, U.S.A
- Molecular Biology Institute, University of California, Los Angeles, CA 90095, U.S.A
| |
|