1
Zhu W, Ding X, Shen HB, Pan X. Identifying RNA-small Molecule Binding Sites Using Geometric Deep Learning with Language Models. J Mol Biol 2025; 437:169010. [PMID: 39961524 DOI: 10.1016/j.jmb.2025.169010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2024] [Revised: 02/10/2025] [Accepted: 02/12/2025] [Indexed: 02/28/2025]
Abstract
RNAs are emerging as promising therapeutic targets, yet identifying small molecules that bind to them remains a significant challenge in drug discovery. This underscores the crucial role of computational modeling in predicting RNA-small molecule binding sites. However, accurate and efficient computational methods for identifying these interactions are still lacking. Recently, advances in large language models (LLMs), previously successful in DNA and protein research, have spurred the development of RNA-specific LLMs. These models leverage vast unlabeled RNA sequences to autonomously learn semantic representations with the goal of enhancing downstream tasks, particularly those constrained by limited annotated data. Here, we develop RNABind, an embedding-informed geometric deep learning framework to detect RNA-small molecule binding sites from RNA structures. RNABind integrates RNA LLMs into advanced geometric deep learning networks, encoding both RNA sequence and structure information. To evaluate RNABind, we first compile the largest RNA-small molecule interaction dataset from the entire multi-chain complex structure instead of single-chain RNAs. Extensive experiments demonstrate that RNABind outperforms existing state-of-the-art methods. In addition, we conduct an extensive experimental evaluation of eight pre-trained RNA LLMs, assessing their performance on the binding site prediction task within a unified experimental protocol. In summary, RNABind provides a powerful tool for exploring RNA-small molecule binding site prediction, which paves the way for future innovations in RNA-targeted drug discovery.
Affiliation(s)
- Weimin Zhu
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Xiaohan Ding
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
2
Zhuo C, Zeng C, Liu H, Wang H, Peng Y, Zhao Y. Advances and Mechanisms of RNA-Ligand Interaction Predictions. Life (Basel) 2025; 15:104. [PMID: 39860045 PMCID: PMC11767038 DOI: 10.3390/life15010104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2024] [Revised: 01/13/2025] [Accepted: 01/15/2025] [Indexed: 01/27/2025] [Open Access]
Abstract
The diversity and complexity of RNA include sequence, secondary structure, and tertiary structure characteristics. These elements are crucial for RNA's specific recognition of other molecules. With advancements in biotechnology, RNA-ligand structures allow researchers to utilize experimental data to uncover the mechanisms of complex interactions. However, determining the structures of these complexes experimentally can be technically challenging and often results in low-resolution data. Many machine learning computational approaches have recently emerged to learn multiscale-level RNA features to predict the interactions. Predicting interactions remains an unexplored area. Therefore, studying RNA-ligand interactions is essential for understanding biological processes. In this review, we analyze the interaction characteristics of RNA-ligand complexes by examining RNA's sequence, secondary structure, and tertiary structure. Our goal is to clarify how RNA specifically recognizes ligands. Additionally, we systematically discuss advancements in computational methods for predicting interactions to guide future research directions. We aim to inspire the creation of more reliable RNA-ligand interaction prediction tools.
Affiliation(s)
- Chen Zhuo
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Chengwei Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Haoquan Liu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Huiwen Wang
  - School of Physics and Engineering, Henan University of Science and Technology, Luoyang 471023, China
- Yunhui Peng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
3
Liu H, Zhuo C, Gao J, Zeng C, Zhao Y. AI-integrated network for RNA complex structure and dynamic prediction. Biophysics Reviews 2024; 5:041304. [PMID: 39512332 PMCID: PMC11540444 DOI: 10.1063/5.0237319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Accepted: 10/15/2024] [Indexed: 11/15/2024]
Abstract
RNA complexes are essential components in many cellular processes. The functions of these complexes are linked to their tertiary structures, which are shaped by detailed interface information, such as binding sites, interface contact, and dynamic conformational changes. Network-based approaches have been widely used to analyze RNA complex structures. With their roots in graph theory, these methods have a long history of providing insight into the static and dynamic properties of RNA molecules. These approaches have been effective in identifying functional binding sites and analyzing the dynamic behavior of RNA complexes. Recently, the advent of artificial intelligence (AI) has brought transformative changes to the field. These technologies have been increasingly applied to studying RNA complex structures, providing new avenues for understanding the complex interactions within RNA complexes. By integrating AI with traditional network analysis methods, researchers can build more accurate models of RNA complex structures, predict their dynamic behaviors, and even design RNA-based inhibitors. In this review, we introduce the integration of network-based methodologies with AI techniques to enhance the understanding of RNA complex structures. We examine how these advanced computational tools can be used to model and analyze the detailed interface information and dynamic behaviors of RNA molecules. Additionally, we explore the potential future directions of how AI-integrated networks can aid in modeling and analyzing RNA complex structures.
Affiliation(s)
- Haoquan Liu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Chen Zhuo
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Jiaming Gao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Chengwei Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
4
Zhuo C, Gao J, Li A, Liu X, Zhao Y. A Machine Learning Method for RNA-Small Molecule Binding Preference Prediction. J Chem Inf Model 2024; 64:7386-7397. [PMID: 39265103 DOI: 10.1021/acs.jcim.4c01324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2024]
Abstract
The interaction between RNA and small molecules is crucial in various biological functions. Identifying molecules targeting RNA is essential for inhibitor design and RNA-related studies. However, traditional methods focus on learning RNA sequence and secondary structure features and neglect small molecule characteristics, resulting in poor performance on unknown small molecule testing. To overcome this limitation, we developed a double-layer stacking-based machine learning model called ZHMol-RLinter. This approach more effectively predicts RNA-small molecule binding preferences by learning RNA and small molecule features to capture their interaction information. ZHMol-RLinter also combines sequence and secondary structural features with structural geometric and physicochemical environment information to capture the specificity of RNA spatial conformations in recognizing small molecules. Our results demonstrate that ZHMol-RLinter has a success rate of 90.8% on the published RL98 testing set, representing a significant improvement over existing methods. Additionally, ZHMol-RLinter achieved a success rate of 77.1% on the unknown small molecule UNK96 testing set, showing substantial improvement over the existing methods. The evaluation of predicted structures confirms that ZHMol-RLinter is reliable and accurate for predicting RNA-small molecule binding preferences, even for challenging unknown small molecule testing. Predicting RNA-small molecule binding preferences can help in the understanding of RNA-small molecule interactions and promote the design of RNA-related drugs for biological and medical applications.
Affiliation(s)
- Chen Zhuo
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Jiaming Gao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Anbang Li
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Xuefeng Liu
  - College of Mathematics and Physics, Chengdu University of Technology, Chengdu 610059, China
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
5
Gao J, Liu H, Zhuo C, Zeng C, Zhao Y. Predicting Small Molecule Binding Nucleotides in RNA Structures Using RNA Surface Topography. J Chem Inf Model 2024. [PMID: 39230508 DOI: 10.1021/acs.jcim.4c01264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
RNA small molecule interactions play a crucial role in drug discovery and inhibitor design. Identifying RNA small molecule binding nucleotides is essential and requires methods that exhibit a high predictive ability to facilitate drug discovery and inhibitor design. Existing methods can predict the binding nucleotides of simple RNA structures, but it is hard to predict binding nucleotides in complex RNA structures with junctions. To address this limitation, we developed a new deep learning model based on spatial correlation, ZHmolReSTasite, which can accurately predict binding nucleotides of small and large RNA with junctions. We utilize RNA surface topography to consider the spatial correlation, characterizing nucleotides from sequence and tertiary structures to learn a high-level representation. Our method outperforms existing methods for benchmark test sets composed of simple RNA structures, achieving precision values of 72.9% on TE18 and 76.7% on RB9 test sets. For a challenging test set composed of RNA structures with junctions, our method outperforms the second best method by 11.6% in precision. Moreover, ZHmolReSTasite demonstrates robustness regarding the predicted RNA structures. In summary, ZHmolReSTasite successfully incorporates spatial correlation, outperforms previous methods on small and large RNA structures using RNA surface topography, and can provide valuable insights into RNA small molecule prediction and accelerate RNA inhibitor design.
Affiliation(s)
- Jiaming Gao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Haoquan Liu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Chen Zhuo
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Chengwei Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
6
Wang J, Quan L, Jin Z, Wu H, Ma X, Wang X, Xie J, Pan D, Chen T, Wu T, Lyu Q. MultiModRLBP: A Deep Learning Approach for Multi-Modal RNA-Small Molecule Ligand Binding Sites Prediction. IEEE J Biomed Health Inform 2024; 28:4995-5006. [PMID: 38739505 DOI: 10.1109/jbhi.2024.3400521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
This study aims to tackle the intricate challenge of predicting RNA-small molecule binding sites to explore the potential value in the field of RNA drug targets. To address this challenge, we propose the MultiModRLBP method, which integrates multi-modal features using deep learning algorithms. These features include 3D structural properties at the nucleotide base level of the RNA molecule, relational graphs based on overall RNA structure, and rich RNA semantic information. In our investigation, we gathered 851 interactions between RNA and small molecule ligands from the RNAglib dataset and RLBind training set. Unlike conventional training sets, this collection broadened its scope by including RNA complexes that have the same RNA sequence but change their respective binding sites due to structural differences or the presence of different ligands. This enhancement enables the MultiModRLBP model to more accurately capture subtle changes at the structural level, ultimately improving its ability to discern nuances among similar RNA conformations. Furthermore, we evaluated MultiModRLBP on two classic test sets, Test18 and Test3, highlighting its performance disparities on small molecules based on metal and non-metal ions. Additionally, we conducted a structural sensitivity analysis on specific complex categories, considering RNA instances with varying degrees of structural changes and whether they share the same ligands. The research results indicate that MultiModRLBP outperforms the current state-of-the-art methods on multiple classic test sets, particularly excelling in predicting binding sites for non-metal ions and instances where the binding sites are widely distributed along the sequence. MultiModRLBP can also be used as a potential tool when the RNA structure is perturbed or the RNA experimental tertiary structure is not available. Most importantly, MultiModRLBP exhibits the capability to distinguish binding characteristics of RNA that are structurally diverse yet exhibit sequence similarity. These advancements hold promise in reducing the costs associated with the development of RNA-targeted drugs.
7
Panei FP, Gkeka P, Bonomi M. Identifying small-molecules binding sites in RNA conformational ensembles with SHAMAN. Nat Commun 2024; 15:5725. [PMID: 38977675 PMCID: PMC11231146 DOI: 10.1038/s41467-024-49638-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 06/05/2024] [Indexed: 07/10/2024] [Open Access]
Abstract
The rational targeting of RNA with small molecules is hampered by our still limited understanding of RNA structural and dynamic properties. Most in silico tools for binding site identification rely on static structures and therefore cannot face the challenges posed by the dynamic nature of RNA molecules. Here, we present SHAMAN, a computational technique to identify potential small-molecule binding sites in RNA structural ensembles. SHAMAN enables exploring the conformational landscape of RNA with atomistic molecular dynamics simulations and at the same time identifying RNA pockets in an efficient way with the aid of probes and enhanced-sampling techniques. In our benchmark composed of large, structured riboswitches as well as small, flexible viral RNAs, SHAMAN successfully identifies all the experimentally resolved pockets and ranks them among the most favorite probe hotspots. Overall, SHAMAN sets a solid foundation for future drug design efforts targeting RNA with small molecules, effectively addressing the long-standing challenges in the field.
Affiliation(s)
- F P Panei
  - Integrated Drug Discovery, Molecular Design Sciences, Sanofi, Vitry-sur-Seine, France
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France
  - Sorbonne Université, Ecole Doctorale Complexité du Vivant, Paris, France
- P Gkeka
  - Integrated Drug Discovery, Molecular Design Sciences, Sanofi, Vitry-sur-Seine, France
- M Bonomi
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France
8
Zhou Y, Chen SJ. Advances in machine-learning approaches to RNA-targeted drug design. Artificial Intelligence Chemistry 2024; 2:100053. [PMID: 38434217 PMCID: PMC10904028 DOI: 10.1016/j.aichem.2024.100053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2024]
Abstract
RNA molecules play multifaceted functional and regulatory roles within cells and have garnered significant attention in recent years as promising therapeutic targets. With remarkable successes achieved by artificial intelligence (AI) in different fields such as computer vision and natural language processing, there is a growing imperative to harness AI's potential in computer-aided drug design (CADD) to discover novel drug compounds that target RNA. Although machine-learning (ML) approaches have been widely adopted in the discovery of small molecules targeting proteins, the application of ML approaches to model interactions between RNA and small molecules is still in its infancy. Compared to protein-targeted drug discovery, the major challenges in ML-based RNA-targeted drug discovery stem from the scarcity of available data resources. With the growing interest and the development of curated databases focusing on interactions between RNA and small molecules, the field anticipates rapid growth and the opening of a new avenue for disease treatment. In this review, we aim to provide an overview of recent advancements in computationally modeling RNA-small molecule interactions within the context of RNA-targeted drug discovery, with a particular emphasis on methodologies employing ML techniques.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
9
Morishita EC, Nakamura S. Recent applications of artificial intelligence in RNA-targeted small molecule drug discovery. Expert Opin Drug Discov 2024; 19:415-431. [PMID: 38321848 DOI: 10.1080/17460441.2024.2313455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024]
Abstract
INTRODUCTION Targeting RNAs with small molecules offers an alternative to the conventional protein-targeted drug discovery and can potentially address unmet and emerging medical needs. The recent rise of interest in the strategy has already resulted in large amounts of data on disease associated RNAs, as well as on small molecules that bind to such RNAs. Artificial intelligence (AI) approaches, including machine learning and deep learning, present an opportunity to speed up the discovery of RNA-targeted small molecules by improving decision-making efficiency and quality. AREAS COVERED The topics described in this review include the recent applications of AI in the identification of RNA targets, RNA structure determination, screening of chemical compound libraries, and hit-to-lead optimization. The impact and limitations of the recent AI applications are discussed, along with an outlook on the possible applications of next-generation AI tools for the discovery of novel RNA-targeted small molecule drugs. EXPERT OPINION Key areas for improvement include developing AI tools for understanding RNA dynamics and RNA - small molecule interactions. High-quality and comprehensive data still need to be generated especially on the biological activity of small molecules that target RNAs.
10
Rinaldi S, Moroni E, Rozza R, Magistrato A. Frontiers and Challenges of Computing ncRNAs Biogenesis, Function and Modulation. J Chem Theory Comput 2024; 20:993-1018. [PMID: 38287883 DOI: 10.1021/acs.jctc.3c01239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2024]
Abstract
Non-coding RNAs (ncRNAs), generated from nonprotein coding DNA sequences, constitute 98-99% of the human genome. Non-coding RNAs encompass diverse functional classes, including microRNAs, small interfering RNAs, PIWI-interacting RNAs, small nuclear RNAs, small nucleolar RNAs, and long non-coding RNAs. With critical involvement in gene expression and regulation across various biological and physiopathological contexts, such as neuronal disorders, immune responses, cardiovascular diseases, and cancer, non-coding RNAs are emerging as disease biomarkers and therapeutic targets. In this review, after providing an overview of non-coding RNAs' role in cell homeostasis, we illustrate the potential and the challenges of state-of-the-art computational methods exploited to study non-coding RNAs biogenesis, function, and modulation. This can be done by directly targeting them with small molecules or by altering their expression by targeting the cellular engines underlying their biosynthesis. Drawing from applications, also taken from our work, we showcase the significance and role of computer simulations in uncovering fundamental facets of ncRNA mechanisms and modulation. This information may set the basis to advance gene modulation tools and therapeutic strategies to address unmet medical needs.
Affiliation(s)
- Silvia Rinaldi
  - National Research Council of Italy (CNR) - Institute of Chemistry of OrganoMetallic Compounds (ICCOM), c/o Area di Ricerca CNR di Firenze Via Madonna del Piano 10, 50019 Sesto Fiorentino, Florence, Italy
- Elisabetta Moroni
  - National Research Council of Italy (CNR) - Institute of Chemical Sciences and Technologies (SCITEC), via Mario Bianco 9, 20131 Milano, Italy
- Riccardo Rozza
  - National Research Council of Italy (CNR) - Institute of Material Foundry (IOM) c/o International School for Advanced Studies (SISSA), Via Bonomea, 265, 34136 Trieste, Italy
- Alessandra Magistrato
  - National Research Council of Italy (CNR) - Institute of Material Foundry (IOM) c/o International School for Advanced Studies (SISSA), Via Bonomea, 265, 34136 Trieste, Italy
11
Liu H, Jian Y, Hou J, Zeng C, Zhao Y. RNet: a network strategy to predict RNA binding preferences. Brief Bioinform 2023; 25:bbad482. [PMID: 38145947 PMCID: PMC10749790 DOI: 10.1093/bib/bbad482] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/05/2023] [Revised: 11/15/2023] [Accepted: 12/05/2023] [Indexed: 12/27/2023] [Open Access]
Abstract
Determining the RNA binding preferences remains challenging because of the bottleneck of the binding interactions accompanied by subtle RNA flexibility. Typically, designing RNA inhibitors involves screening thousands of potential candidates for binding. Accurate binding site information can increase the number of successful hits even with few candidates. There are two main issues regarding RNA binding preference: binding site prediction and binding dynamical behavior prediction. Here, we propose one interpretable network-based approach, RNet, to acquire precise binding site and binding dynamical behavior information. RNetsite employs a machine learning-based network decomposition algorithm to predict RNA binding sites by analyzing the local and global network properties. Our research focuses on large RNAs with 3D structures without considering smaller regulatory RNAs, which are too small and dynamic. Our study shows that RNetsite outperforms existing methods, achieving precision values as high as 0.701 on TE18 and 0.788 on RB9 tests. In addition, RNetsite demonstrates remarkable robustness regarding perturbations in RNA structures. We also developed RNetdyn, a distance-based dynamical graph algorithm, to characterize the interface dynamical behavior consequences upon inhibitor binding. The simulation testing of competitive inhibitors indicates that RNetdyn outperforms the traditional method by 30%. The benchmark testing results demonstrate that RNet is highly accurate and robust. Our interpretable network algorithms can assist in predicting RNA binding preferences and accelerating RNA inhibitor design, providing valuable insights to the RNA research community.
Affiliation(s)
- Haoquan Liu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
- Yiren Jian
  - Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA
- Jinxuan Hou
  - Department of Thyroid and Breast Surgery, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
- Chen Zeng
  - Department of Physics, The George Washington University, Washington, DC 20052, USA
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
12
Wang K, Zhou R, Wu Y, Li M. RLBind: a deep learning method to predict RNA-ligand binding sites. Brief Bioinform 2023; 24:6832814. [PMID: 36398911 DOI: 10.1093/bib/bbac486] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/15/2022] [Revised: 09/28/2022] [Accepted: 10/14/2022] [Indexed: 11/19/2022] [Open Access]
Abstract
Identification of RNA-small molecule binding sites plays an essential role in RNA-targeted drug discovery and development. These small molecules are expected to be leading compounds to guide the development of new types of RNA-targeted therapeutics compared with regular therapeutics targeting proteins. RNAs can provide many potential drug targets with diverse structures and functions. However, up to now, only a few methods have been proposed. Predicting RNA-small molecule binding sites remains a big challenge. A new computational model is required to better extract the features and predict RNA-small molecule binding sites more accurately. In this paper, a deep learning model, RLBind, was proposed to predict RNA-small molecule binding sites from sequence-dependent and structure-dependent properties by combining a global RNA sequence channel and a local neighbor nucleotides channel. To the best of our knowledge, this research was the first to develop a convolutional neural network for RNA-small molecule binding sites prediction. Furthermore, RLBind can also be used as a potential tool when the RNA experimental tertiary structure is not available. The experimental results show that RLBind outperforms other state-of-the-art methods in predicting binding sites. Therefore, our study demonstrates that the combination of global information for full-length sequences and local information for limited local neighbor nucleotides in RNAs can improve the model's predictive performance for binding sites prediction. All datasets and source code are available at https://github.com/KailiWang1/RLBind.
Affiliation(s)
- Kaili Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Renyi Zhou
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yifan Wu
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
13
Möller L, Guerci L, Isert C, Atz K, Schneider G. Translating from proteins to ribonucleic acids for ligand-binding site detection. Mol Inform 2022; 41:e2200059. [PMID: 35577762 DOI: 10.1002/minf.202200059] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/11/2022] [Accepted: 05/16/2022] [Indexed: 11/10/2022]
Abstract
Identifying druggable ligand-binding sites on the surface of the macromolecular targets is an important process in structure-based drug discovery. Deep-learning models have been shown to successfully predict ligand-binding sites of proteins. As a step toward predicting binding sites in RNA and RNA-protein complexes, we employ three-dimensional convolutional neural networks. We introduce a dataset splitting approach to minimize structure-related bias in training data, and investigate the influence of protein-based neural network pre-training before fine-tuning on RNA structures. Models that were pre-trained on proteins considerably outperformed the models that were trained exclusively on RNA structures. Overall, 71% of the known RNA binding sites were correctly located within 4 Å of their true centres with a structural overlap of at least 25%.
14
Zhou Y, Jiang Y, Chen SJ. RNA-ligand molecular docking: advances and challenges. Wiley Interdisciplinary Reviews. Computational Molecular Science 2022; 12:e1571. [PMID: 37293430 PMCID: PMC10250017 DOI: 10.1002/wcms.1571] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 03/26/2021] [Accepted: 07/20/2021] [Indexed: 12/16/2022]
Abstract
With rapid advances in computer algorithms and hardware, fast and accurate virtual screening has led to a drastic acceleration in selecting potent small molecules as drug candidates. Computational modeling of RNA-small molecule interactions has become an indispensable tool for RNA-targeted drug discovery. The current models for RNA-ligand binding have mainly focused on the docking-and-scoring method. Accurate docking and scoring should tackle four crucial problems: (1) conformational flexibility of ligand, (2) conformational flexibility of RNA, (3) efficient sampling of binding sites and binding poses, and (4) accurate scoring of different binding modes. Moreover, compared with the problem of protein-ligand docking, predicting ligand binding to RNA, a negatively charged polymer, is further complicated by additional effects such as metal ion effects. Thermodynamic models based on physics-based and knowledge-based scoring functions have shown highly encouraging success in predicting ligand binding poses and binding affinities. Recently, kinetic models for ligand binding have further suggested that including dissociation kinetics (residence time) in ligand docking would result in improved performance in estimating in vivo drug efficacy. More recently, the rise of deep-learning approaches has led to new tools for predicting RNA-small molecule binding. In this review, we present an overview of the recently developed computational methods for RNA-ligand docking and their advantages and disadvantages.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Yangwei Jiang
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
|
15
|
Kozlovskii I, Popov P. Structure-based deep learning for binding site detection in nucleic acid macromolecules. NAR Genom Bioinform 2021; 3:lqab111. [PMID: 34859211 PMCID: PMC8633674 DOI: 10.1093/nargab/lqab111] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/19/2021] [Revised: 10/14/2021] [Accepted: 11/09/2021] [Indexed: 12/30/2022] Open
Abstract
Structure-based drug design (SBDD) targeting nucleic acid macromolecules, particularly RNA, is a research direction gaining momentum that has already resulted in several FDA-approved compounds. As with proteins, one of the critical components of SBDD for RNA is the correct identification of binding sites for putative drug candidates. RNAs share a common structural organization that, together with the dynamic nature of these molecules, makes it challenging to recognize binding sites for small molecules. Moreover, there is a need for structure-based approaches, as sequence information alone does not account for the conformational plasticity of nucleic acid macromolecules. Deep learning holds great promise for solving the binding site detection problem, but it requires a large amount of structural data, which is very limited for nucleic acids compared to proteins. In this study we composed a set of ∼2000 nucleic acid-small molecule structures comprising ∼2500 binding sites, which is ∼40 times larger than the previously used one, and demonstrated the first structure-based deep learning approach, BiteNetN, to detect binding sites in nucleic acid structures. BiteNetN operates on arbitrary nucleic acid complexes, shows state-of-the-art performance, and can be helpful in the analysis of different conformations and mutant variants, as we demonstrated in the HIV-1 TAR RNA and ATP-aptamer case studies.
Affiliation(s)
- Igor Kozlovskii
  - iMolecule, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
- Petr Popov
  - iMolecule, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
|
16
|
Jiang Z, Xiao SR, Liu R. Dissecting and predicting different types of binding sites in nucleic acids based on structural information. Brief Bioinform 2021; 23:6384399. [PMID: 34624074 PMCID: PMC8769709 DOI: 10.1093/bib/bbab411] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/07/2021] [Revised: 08/26/2021] [Accepted: 09/07/2021] [Indexed: 12/16/2022] Open
Abstract
The biological functions of DNA and RNA generally depend on their interactions with other molecules, such as small ligands, proteins and nucleic acids. However, our knowledge of the nucleic acid binding sites for different interaction partners is very limited, and identifying these critical binding regions is not trivial. Herein, we performed a comprehensive comparison between binding and nonbinding sites and among different categories of binding sites in these two nucleic acid classes. From the structural perspective, RNA may interact with ligands by forming binding pockets and may contact proteins and nucleic acids using protruding surfaces, while DNA may adopt regions closer to the middle of the chain to make contacts with other molecules. Based on structural information, we established a feature-based ensemble learning classifier to identify the binding sites by fully using the interplay among different machine learning algorithms, feature spaces and sample spaces. Meanwhile, we designed a template-based classifier by exploiting structural conservation. The complementarity between the two classifiers motivated us to build an integrative framework for improving prediction performance. Moreover, we used a post-processing procedure based on the random walk algorithm to further correct the integrative predictions. Our unified prediction framework yielded promising results for different binding sites and outperformed existing methods.
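The random-walk post-processing mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which variant of the random walk was used or how the nucleotide graph was built, so everything here is an assumption: the sketch uses a random walk with restart over an undirected nucleotide contact graph, and the names `random_walk_smooth`, `restart`, and `iters` are hypothetical.

```python
def random_walk_smooth(scores, adj, restart=0.5, iters=50):
    """Smooth per-nucleotide scores with a random walk with restart.

    scores: {node: initial binding score}
    adj:    {node: set of neighbouring nodes} (undirected contact graph)
    Update: p <- restart * p0 + (1 - restart) * (mass arriving from neighbours),
    where each neighbour j spreads its score evenly over its own neighbours.
    """
    p0 = dict(scores)
    p = dict(scores)
    for _ in range(iters):
        nxt = {}
        for i in adj:
            incoming = sum(p[j] / len(adj[j]) for j in adj[i])
            nxt[i] = restart * p0[i] + (1 - restart) * incoming
        p = nxt
    return p


# Toy chain 0 - 1 - 2: the middle nucleotide starts with score 0 but is
# flanked by two high-scoring neighbours, so the walk raises its score.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
initial = {0: 1.0, 1: 0.0, 2: 1.0}
smoothed = random_walk_smooth(initial, adj)
```

The intent of such a step is that an isolated low score surrounded by confident predictions is pulled up, and an isolated high score among low-scoring neighbours is pulled down.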
Affiliation(s)
- Zheng Jiang
  - College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Si-Rui Xiao
  - College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Rong Liu
  - College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
|
17
|
Su H, Peng Z, Yang J. Recognition of small molecule-RNA binding sites using RNA sequence and structure. Bioinformatics 2021; 37:36-42. [PMID: 33416863 PMCID: PMC8034527 DOI: 10.1093/bioinformatics/btaa1092] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/12/2019] [Revised: 12/12/2020] [Accepted: 12/23/2020] [Indexed: 11/22/2022] Open
Abstract
Motivation: RNA molecules have become attractive small-molecule drug targets for treating disease in recent years. Computer-aided drug design can be facilitated by detecting the RNA sites that bind small molecules. However, very limited progress has been reported for the prediction of small molecule–RNA binding sites. Results: We developed a novel method, RNAsite, to predict small molecule–RNA binding sites using sequence profile- and structure-based descriptors. RNAsite was shown to be competitive with the state-of-the-art methods on the experimental structures of two independent test sets. When predicted structure models were used, RNAsite outperformed other methods by a large margin. The possibility of improving RNAsite by geometry-based binding pocket detection was investigated. The influence of RNA structural flexibility and of the conformational changes caused by ligand binding on RNAsite was also discussed. RNAsite is anticipated to be a useful tool for the design of RNA-targeting small molecule drugs. Availability and implementation: http://yanglab.nankai.edu.cn/RNAsite. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hong Su
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
- Zhenling Peng
  - Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
- Jianyi Yang
  - School of Mathematical Sciences, Nankai University, Tianjin, 300071, China
|
18
|
Wang H, Zhao Y. RBinds: A user-friendly server for RNA binding site prediction. Comput Struct Biotechnol J 2020; 18:3762-3765. [PMID: 34136090 PMCID: PMC8164131 DOI: 10.1016/j.csbj.2020.10.043] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/17/2020] [Revised: 10/27/2020] [Accepted: 10/31/2020] [Indexed: 12/03/2022] Open
Abstract
RNA performs various biological functions by interacting with other molecules. Knowledge of RNA binding sites is essential for understanding RNA-protein and RNA-ligand complex structures and their mechanisms. However, RNA binding site prediction currently requires tedious scripting and manual handling, and a user-friendly bioinformatics tool for the task has been missing. This limitation motivated us to develop RBinds, a user-friendly web server that predicts RNA binding sites through a simple graphical user interface. Advanced features implemented in RBinds include (1) transforming the RNA structure to a network automatically; (2) analyzing the structural network properties to predict binding sites; (3) constructing an annotated force-directed network; (4) providing a visualization tool for users to scale and rotate the structure; (5) offering related tools to predict or simulate RNA structures. The RBinds web server is a reliable and user-friendly tool that facilitates RNA binding site studies without installing programs locally. RBinds is freely accessible at http://zhaoserver.com.cn/RBinds/RBinds.html.
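Features (1) and (2) in the abstract above, turning a structure into a network and ranking nucleotides by network properties, can be sketched as follows. This is a minimal illustration under stated assumptions, not the RBinds implementation: the abstract does not give the contact definition or the properties used, so the sketch assumes one representative coordinate per nucleotide, an 8 Å contact cutoff, and ranking by closeness and then degree. All function names are hypothetical.

```python
import math
from collections import deque


def build_network(coords, cutoff=8.0):
    """Link nucleotides i and j when their coordinates lie within cutoff Å.

    ASSUMPTION: one representative (x, y, z) per nucleotide and an 8 Å cutoff.
    """
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def closeness(adj):
    """Closeness centrality via breadth-first search from every node."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def rank_candidates(coords, cutoff=8.0, top=3):
    """Return nucleotide indices sorted by (closeness, degree), highest first."""
    adj = build_network(coords, cutoff)
    clo = closeness(adj)
    return sorted(adj, key=lambda i: (clo[i], len(adj[i])), reverse=True)[:top]


# Toy structure: five nucleotides on a line, 6 Å apart (a simple chain).
coords = [(6.0 * k, 0.0, 0.0) for k in range(5)]
best = rank_candidates(coords, top=1)
```

In this toy chain the central nucleotide has the highest closeness, so it is ranked first. A real structure would give a far denser network, and the thresholds used to call a site would need to come from the published method.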
Affiliation(s)
- Huiwen Wang
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
|
19
|
Wang K, Jian Y, Wang H, Zeng C, Zhao Y. RBind: computational network method to predict RNA binding sites. Bioinformatics 2019; 34:3131-3136. [PMID: 29718097 DOI: 10.1093/bioinformatics/bty345] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 12/01/2017] [Accepted: 04/24/2018] [Indexed: 12/21/2022] Open
Abstract
Motivation: Non-coding RNA molecules play essential roles by interacting with other molecules to perform various biological functions. However, it is difficult to determine RNA structures due to their flexibility. At present, the number of experimentally solved RNA-ligand and RNA-protein structures is still insufficient. Therefore, binding site prediction for non-coding RNA is required to understand their functions. Results: Current RNA binding site prediction algorithms produce many false positive nucleotides that are distant from the binding sites. Here, we present a network approach, RBind, to predict RNA binding sites. We benchmarked RBind on RNA-ligand and RNA-protein datasets. The average accuracy of 0.82 in RNA-ligand and 0.63 in RNA-protein testing showed that this network strategy predicts binding sites with reliable accuracy. Availability and implementation: The codes and datasets are available at https://zhaolab.com.cn/RBind. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kaili Wang
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China
- Yiren Jian
  - Department of Physics, The George Washington University, Washington, DC, USA
- Huiwen Wang
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China
- Chen Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China
  - Department of Physics, The George Washington University, Washington, DC, USA
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, China
|