1
Sang C, Shu J, Wang K, Xia W, Wang Y, Sun T, Xu X. The prediction of RNA-small molecule binding sites in RNA structures based on geometric deep learning. Int J Biol Macromol 2025; 310:143308. [PMID: 40268011 DOI: 10.1016/j.ijbiomac.2025.143308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2024] [Revised: 04/15/2025] [Accepted: 04/16/2025] [Indexed: 04/25/2025]
Abstract
Biological interactions between RNA and small-molecule ligands play a crucial role in determining the specific functions of RNA, such as catalysis and folding, and are essential for guiding drug design in the medical field. Accurately predicting the binding sites of ligands within RNA structures is therefore of significant importance. To address this challenge, we introduced a computational approach named RLBSIF (RNA-Ligand Binding Surface Interaction Fingerprints) based on geometric deep learning. This model utilizes surface geometric features, including shape index and distance-dependent curvature, combined with chemical features represented by atomic charge, to comprehensively characterize RNA-ligand interactions through MaSIF-based surface interaction fingerprints. Additionally, we employ the ResNet18 network to analyze these fingerprints for identifying ligand binding pockets. Trained on 440 binding pockets, RLBSIF achieves an overall pocket-level classification accuracy of 90 %. Through a full-space enumeration method, it can predict binding sites at nucleotide resolution. In two independent tests, RLBSIF outperformed competing models, demonstrating its efficacy in accurately identifying binding sites within complex molecular structures. This method shows promise for drug design and biological product development, providing valuable insights into RNA-ligand interactions and facilitating the design of novel therapeutic interventions. For access to the related source code, please visit RLBSIF on GitHub (https://github.com/ZUSTSTTLAB/RLBSIF).
Affiliation(s)
- Chunjiang Sang
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou 310008, China
- Jiasai Shu
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou 310008, China
- Kang Wang
  - School of Physics, Nanjing University, Nanjing 210093, China
- Wentao Xia
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou 310008, China
- Yan Wang
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou 310008, China
- Tingting Sun
  - Department of Physics, Zhejiang University of Science and Technology, Hangzhou 310008, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
2
Shen Y, Jiang Z, Liu R. Dynamic integration of feature- and template-based methods improves the prediction of conformational B cell epitopes. Structure 2025; 33:798-807.e4. [PMID: 39938510 DOI: 10.1016/j.str.2025.01.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Revised: 12/10/2024] [Accepted: 01/16/2025] [Indexed: 02/14/2025]
Abstract
The accurate prediction of conformational epitopes promotes our understanding of antigen-antibody interactions. All existing algorithms depend on a feature-based strategy, which limits their performance. A template-based strategy can provide complementary information, and the interplay between these two strategies could improve the prediction of epitopes. Here, we present DynaBCE, a dynamic ensemble algorithm to effectively identify conformational B cell epitopes (BCEs). Using novel handcrafted structural descriptors and embeddings from protein language models, we developed machine learning and deep learning modules based on boosting algorithms and geometric graph neural networks, respectively. Furthermore, we built a template module by leveraging known structural template information and transformer-based algorithms to capture binding signatures. Finally, we integrated the three modules using a dynamic weighting approach to maximize the strength of each module for different samples. DynaBCE achieved promising results for both native and predicted structures and outperformed previous methods as demonstrated in various evaluation scenarios.
Affiliation(s)
- Yueyue Shen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Zheng Jiang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Rong Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
3
Zhou Y, Chen SJ. Advances in machine-learning approaches to RNA-targeted drug design. Artificial Intelligence Chemistry 2024; 2:100053. [PMID: 38434217 PMCID: PMC10904028 DOI: 10.1016/j.aichem.2024.100053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2024]
Abstract
RNA molecules play multifaceted functional and regulatory roles within cells and have garnered significant attention in recent years as promising therapeutic targets. With remarkable successes achieved by artificial intelligence (AI) in different fields such as computer vision and natural language processing, there is a growing imperative to harness AI's potential in computer-aided drug design (CADD) to discover novel drug compounds that target RNA. Although machine-learning (ML) approaches have been widely adopted in the discovery of small molecules targeting proteins, the application of ML approaches to model interactions between RNA and small molecules is still in its infancy. Compared to protein-targeted drug discovery, the major challenges in ML-based RNA-targeted drug discovery stem from the scarcity of available data resources. With the growing interest and the development of curated databases focusing on interactions between RNA and small molecules, the field anticipates rapid growth and the opening of a new avenue for disease treatment. In this review, we aim to provide an overview of recent advancements in computationally modeling RNA-small molecule interactions within the context of RNA-targeted drug discovery, with a particular emphasis on methodologies employing ML techniques.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
4
Morishita EC, Nakamura S. Recent applications of artificial intelligence in RNA-targeted small molecule drug discovery. Expert Opin Drug Discov 2024; 19:415-431. [PMID: 38321848 DOI: 10.1080/17460441.2024.2313455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024]
Abstract
INTRODUCTION Targeting RNAs with small molecules offers an alternative to conventional protein-targeted drug discovery and can potentially address unmet and emerging medical needs. The recent rise of interest in the strategy has already resulted in large amounts of data on disease-associated RNAs, as well as on small molecules that bind to such RNAs. Artificial intelligence (AI) approaches, including machine learning and deep learning, present an opportunity to speed up the discovery of RNA-targeted small molecules by improving decision-making efficiency and quality.
AREAS COVERED The topics described in this review include the recent applications of AI in the identification of RNA targets, RNA structure determination, screening of chemical compound libraries, and hit-to-lead optimization. The impact and limitations of the recent AI applications are discussed, along with an outlook on the possible applications of next-generation AI tools for the discovery of novel RNA-targeted small molecule drugs.
EXPERT OPINION Key areas for improvement include developing AI tools for understanding RNA dynamics and RNA-small molecule interactions. High-quality and comprehensive data still need to be generated, especially on the biological activity of small molecules that target RNAs.
5
Zhou JX, Yang Z, Xi DH, Dai SJ, Feng ZQ, Li JY, Xu W, Wang H. Enhanced segmentation of gastrointestinal polyps from capsule endoscopy images with artifacts using ensemble learning. World J Gastroenterol 2022; 28:5931-5943. [PMID: 36405108 PMCID: PMC9669827 DOI: 10.3748/wjg.v28.i41.5931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 08/31/2022] [Accepted: 10/19/2022] [Indexed: 11/11/2022]
Abstract
BACKGROUND Endoscopy artifacts are widespread in real capsule endoscopy (CE) images but not in high-quality standard datasets.
AIM To improve the segmentation performance of polyps from CE images with artifacts based on ensemble learning.
METHODS We collected 277 polyp images with CE artifacts from 5760 h of videos from 480 patients at Guangzhou First People’s Hospital from January 2016 to December 2019. Two public high-quality standard external datasets were retrieved and used for the comparison experiments. For each dataset, we randomly segmented the data into training, validation, and testing sets for model training, selection, and testing. We compared the performance of the base models and the ensemble model in segmenting polyps from images with artifacts.
RESULTS The performance of the semantic segmentation model was affected by artifacts in the sample images, which also affected the results of polyp detection by CE using a single model. The evaluation based on real datasets with artifacts and standard datasets showed that the ensemble model of all state-of-the-art models performed better than the best corresponding base learner on the real dataset with artifacts. Compared with the corresponding optimal base learners, the intersection over union (IoU) and dice of the ensemble learning model increased to different degrees, ranging from 0.08% to 7.01% and 0.61% to 4.93%, respectively. Moreover, in the standard datasets without artifacts, most of the ensemble models were slightly better than the base learner, as demonstrated by the IoU and dice increases ranging from -0.28% to 1.20% and -0.61% to 0.76%, respectively.
CONCLUSION Ensemble learning can improve the segmentation accuracy of polyps from CE images with artifacts. Our results demonstrated an improvement in the detection rate of polyps with interference from artifacts.
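For readers unfamiliar with the two overlap metrics reported in the results above, the following is a minimal sketch of how intersection over union (IoU) and the Dice coefficient are commonly computed for a pair of binary segmentation masks. It is not the authors' evaluation code: the function names, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # hypothetical representation: rows of 0/1 pixels


def overlap_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted-positive, ground-truth-positive) pixel counts."""
    inter = pred_pos = true_pos = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            inter += int(p and t)
            pred_pos += int(p)
            true_pos += int(t)
    return inter, pred_pos, true_pos


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union: |A ∩ B| / |A ∪ B|."""
    inter, pred_pos, true_pos = overlap_counts(pred, truth)
    union = pred_pos + true_pos - inter
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient: 2 |A ∩ B| / (|A| + |B|)."""
    inter, pred_pos, true_pos = overlap_counts(pred, truth)
    total = pred_pos + true_pos
    # Same assumed convention for empty masks as in iou().
    return 2 * inter / total if total else 1.0


# Toy masks: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
print(iou(pred_mask, true_mask))   # 2 / 4
print(dice(pred_mask, true_mask))  # 4 / 6
```

Because Dice counts the intersection twice, it is never lower than IoU for the same pair of masks, which is one reason the two metrics can improve by different amounts for the same model.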
Affiliation(s)
- Jun-Xiao Zhou
  - Department of Gastroenterology and Hepatology, Guangzhou First People’s Hospital, Guangzhou 510180, Guangdong Province, China
- Zhan Yang
  - School of Information, Renmin University of China, Beijing 100872, China
- Ding-Hao Xi
  - School of Information, Renmin University of China, Beijing 100872, China
- Shou-Jun Dai
  - Department of Gastroenterology and Hepatology, Guangzhou First People’s Hospital, Guangzhou 510180, Guangdong Province, China
- Zhi-Qiang Feng
  - Department of Gastroenterology and Hepatology, Guangzhou First People’s Hospital, Guangzhou 510180, Guangdong Province, China
- Jun-Yan Li
  - Department of Gastroenterology and Hepatology, Guangzhou First People’s Hospital, Guangzhou 510180, Guangdong Province, China
- Wei Xu
  - School of Information, Renmin University of China, Beijing 100872, China
- Hong Wang
  - Department of Gastroenterology and Hepatology, Guangzhou First People’s Hospital, Guangzhou 510180, Guangdong Province, China
6
Bheemireddy S, Sandhya S, Srinivasan N, Sowdhamini R. Computational tools to study RNA-protein complexes. Front Mol Biosci 2022; 9:954926. [PMID: 36275618 PMCID: PMC9585174 DOI: 10.3389/fmolb.2022.954926] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/27/2022] [Accepted: 09/20/2022] [Indexed: 11/19/2022]
Abstract
RNA is the key player in many cellular processes such as signal transduction, replication, transport, cell division, transcription, and translation. These diverse functions are accomplished through interactions of RNA with proteins. However, protein–RNA interactions are still poorly understood in contrast to protein–protein and protein–DNA interactions. This knowledge gap can be attributed to the limited availability of protein–RNA structures along with the experimental difficulties in studying these complexes. Recent progress in computational resources has expanded the number of tools available for studying protein–RNA interactions at various molecular levels. These include tools for predicting interacting residues from primary sequences, modelling protein–RNA complexes, predicting hotspots in these complexes, and gaining insights into the dynamics of their interactions. Each of these tools has its strengths and limitations, which makes it important to select an optimal approach for the question of interest. Here we present a mini review of computational tools to study different aspects of protein–RNA interactions, with a focus on overall application, development of the field, and future perspectives.
Affiliation(s)
- Sneha Bheemireddy
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Sankaran Sandhya
  - Department of Biotechnology, Faculty of Life and Allied Health Sciences, M.S. Ramaiah University of Applied Sciences, Bengaluru, India
- Ramanathan Sowdhamini
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
  - National Centre for Biological Sciences, TIFR, GKVK Campus, Bangalore, India
  - Institute of Bioinformatics and Applied Biotechnology, Bangalore, India
- *Correspondence: Sankaran Sandhya; Ramanathan Sowdhamini