1
Xia R, Li W, Cheng Y, Xie L, Xu X. Molecular surfaces modeling: Advancements in deep learning for molecular interactions and predictions. Biochem Biophys Res Commun 2025; 763:151799. [PMID: 40239539 DOI: 10.1016/j.bbrc.2025.151799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2025] [Revised: 03/20/2025] [Accepted: 04/10/2025] [Indexed: 04/18/2025]
Abstract
Molecular surface analysis can provide a high-dimensional, rich representation of molecular properties and interactions, which is crucial for enabling powerful predictive modeling and rational molecular design across diverse scientific and technological domains. With remarkable successes achieved by artificial intelligence (AI) in different fields such as computer vision and natural language processing, there is a growing imperative to harness AI's potential in accelerating molecular discovery and innovation. The integration of AI techniques with molecular surface analysis has opened up new frontiers, allowing researchers to uncover hidden patterns, relationships, and design principles that were previously elusive. By leveraging the complementary strengths of molecular surface representations and advanced AI algorithms, scientists can now explore chemical space more efficiently, optimize molecular properties with greater precision, and drive transformative advancements in areas like drug development, materials engineering, and catalysis. In this review, we aim to provide an overview of recent advancements in the field of molecular surface analysis and its integration with AI techniques. These AI-driven approaches have led to significant advancements in various downstream tasks, including interface site prediction, protein-protein interaction prediction, surface-centric molecular generation and design.
Affiliation(s)
- Renjie Xia
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
- Wei Li
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
- Yi Cheng
  - College of Engineering, Lishui University, Lishui, 323000, China
- Liangxu Xie
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, 213001, China
2
Sandoval JE, Carullo NVN, Salisbury AJ, Day JJ, Reich NO. Mechanism of non-coding RNA regulation of DNMT3A. Epigenetics Chromatin 2025; 18:15. [PMID: 40148869 PMCID: PMC11951571 DOI: 10.1186/s13072-025-00574-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Accepted: 02/11/2025] [Indexed: 03/29/2025]
Abstract
BACKGROUND De novo DNA methylation by DNMT3A is a fundamental epigenetic modification for transcriptional regulation. Histone tails and regulatory proteins regulate DNMT3A, and the crosstalk between these epigenetic mechanisms ensures appropriate DNA methylation patterning. Based on findings showing that Fos ecRNA inhibits DNMT3A activity in neurons, we sought to characterize the contribution of this regulatory RNA in the modulation of DNMT3A in the presence of regulatory proteins and histone tails. RESULTS We show that Fos ecRNA and mRNA strongly correlate in primary cortical neurons on a single cell level and provide evidence that Fos ecRNA modulation of DNMT3A at these actively transcribed sites occurs in a sequence-independent manner. Further characterization of the Fos ecRNA-DNMT3A interaction showed that Fos-1 ecRNA binds the DNMT3A tetramer interface and clinically relevant DNMT3A substitutions that disrupt the inhibition of DNMT3A activity by Fos-1 ecRNA are restored by the formation of heterotetramers with DNMT3L. Lastly, using DNMT3L and Fos ecRNA in the presence of synthetic histone H3 tails or reconstituted polynucleosomes, we found that regulatory RNAs play dominant roles in the modulation of DNMT3A activity. CONCLUSION Our results are consistent with a model for RNA regulation of DNMT3A that involves localized production of short RNAs binding to a nonspecific site on the protein, rather than formation of localized RNA/DNA structures. We propose that regulatory RNAs play a dominant role in the regulation of DNMT3A catalytic activity at sites with increased production of regulatory RNAs.
Affiliation(s)
- Jonathan E Sandoval
  - Department of Molecular, Cellular and Developmental Biology, University of California, Santa Barbara, CA, 93106-9510, USA
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106-9510, USA
- Nancy V N Carullo
  - Department of Neurobiology, University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Aaron J Salisbury
  - Department of Neurobiology, University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Jeremy J Day
  - Department of Neurobiology, University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Norbert O Reich
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, CA, 93106-9510, USA
3
Wang Y, Yang Y, Ma Z, Wong KC, Li X. EDCNN: identification of genome-wide RNA-binding proteins using evolutionary deep convolutional neural network. Bioinformatics 2022; 38:678-686. [PMID: 34694393 DOI: 10.1093/bioinformatics/btab739] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/12/2021] [Revised: 10/14/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION RNA-binding proteins (RBPs) are a group of proteins associated with RNA regulation and metabolism, and play an essential role in mediating the maturation, transport, localization and translation of RNA. Recently, genome-wide RNA-binding event detection methods have been developed to predict RBPs. Unfortunately, existing computational methods usually suffer from limitations such as high dimensionality, data sparsity and low model performance. RESULTS Deep convolutional neural networks are well suited to high-dimensional and sparse data. To further improve their performance, we propose the evolutionary deep convolutional neural network (EDCNN), which identifies protein-RNA interactions by combining evolutionary optimization with gradient descent to enhance a deep convolutional neural network. In particular, EDCNN combines evolutionary algorithms and different gradient descent models in a complementary algorithm, where the gradient descent and evolution steps alternately optimize the RNA-binding event search. To validate the performance of EDCNN, experiments were conducted on two large-scale CLIP-seq datasets, and the results show that EDCNN outperforms other state-of-the-art methods. Furthermore, time complexity analysis, parameter analysis and motif analysis were conducted to demonstrate the effectiveness of the proposed algorithm from several perspectives. AVAILABILITY AND IMPLEMENTATION The EDCNN algorithm is available at GitHub: https://github.com/yaweiwang1232/EDCNN. Both the software and the supporting data can be downloaded from: https://figshare.com/articles/software/EDCNN/16803217. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
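The alternation between gradient descent steps and evolution steps described in this abstract can be illustrated with a toy sketch. This is not the EDCNN implementation: a simple quadratic loss stands in for the CNN training loss, and every name and parameter below (`TARGET`, `gradient_phase`, `evolution_phase`, `train`, the population size, learning rate and mutation scale) is hypothetical and chosen only for illustration.

```python
import random

# Hypothetical optimum of a toy quadratic loss; a real model would use a CNN loss.
TARGET = [1.0, -2.0, 0.5]


def loss(w):
    """Squared distance of the parameter vector w from TARGET."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))


def grad(w):
    """Analytic gradient of the toy loss."""
    return [2.0 * (wi - ti) for wi, ti in zip(w, TARGET)]


def gradient_phase(w, lr=0.1, steps=5):
    """Refine one individual with a few plain gradient descent steps."""
    for _ in range(steps):
        w = [wi - lr * gi for wi, gi in zip(w, grad(w))]
    return w


def evolution_phase(population, rng, sigma=0.1):
    """Keep the better half unchanged and refill with Gaussian-mutated copies."""
    ranked = sorted(population, key=loss)
    elites = ranked[: len(ranked) // 2]
    children = [[wi + rng.gauss(0.0, sigma) for wi in parent] for parent in elites]
    return elites + children


def train(pop_size=8, generations=10, seed=0):
    """Alternate gradient and evolution phases; return (initial best loss, best individual)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in TARGET] for _ in range(pop_size)]
    initial_best = min(loss(w) for w in population)
    for _ in range(generations):
        population = [gradient_phase(w) for w in population]
        population = evolution_phase(population, rng)
    return initial_best, min(population, key=loss)


if __name__ == "__main__":
    start, best = train()
    print(f"best loss before: {start:.4f}, after: {loss(best):.2e}")
```

Because the elites are carried over unmutated, the best loss in the population never increases during an evolution phase, while the mutated children add the exploration that gradient descent alone lacks.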
Affiliation(s)
- Yawei Wang
  - School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
- Yuning Yang
  - School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, China
- Zhiqiang Ma
  - School of Information Science and Technology, Northeast Normal University, Changchun, Jilin, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Changchun, Jilin, China
4
Carazo F, Romero JP, Rubio A. Upstream analysis of alternative splicing: a review of computational approaches to predict context-dependent splicing factors. Brief Bioinform 2020; 20:1358-1375. [PMID: 29390045 DOI: 10.1093/bib/bby005] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/24/2017] [Revised: 12/14/2017] [Indexed: 12/13/2022]
Abstract
Alternative splicing (AS) has been shown to play a pivotal role in the development of diseases, including cancer. Specifically, all the hallmarks of cancer (angiogenesis, cell immortality, avoiding immune system response, etc.) have a counterpart in aberrant splicing of key genes. Identifying the context-specific regulators of splicing provides valuable information for finding new biomarkers and for defining alternative therapeutic strategies. The computational models that identify these regulators are not trivial and require three conceptual steps: the detection of AS events, the identification of splicing factors that potentially regulate these events, and the contextualization of this information for a specific experiment. In this work, we review the different algorithmic methodologies developed for each of these tasks. The main weaknesses and strengths of the different steps of the pipeline are discussed. Finally, a case study is detailed to make the reader aware of the potential and limitations of this computational approach.
5
Pan X, Shen HB. Predicting RNA-protein binding sites and motifs through combining local and global deep convolutional neural networks. Bioinformatics 2019; 34:3427-3436. [PMID: 29722865 DOI: 10.1093/bioinformatics/bty364] [Citation(s) in RCA: 121] [Impact Index Per Article: 20.2] [Received: 09/25/2017] [Accepted: 05/01/2018] [Indexed: 12/21/2022]
Abstract
MOTIVATION RNA-binding proteins (RBPs) make up 5-10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and costly. Computational prediction of RBP binding sites using patterns learned from existing annotations is a faster alternative. From a biological point of view, the local structure context derived from local sequences is recognized by specific RBPs. However, to the best of our knowledge, deep learning models have so far employed only global representations of entire RNA sequences, and local sequence information has been ignored when constructing them. RESULTS In this study, we present a computational method, iDeepE, that predicts RNA-protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences to the same length. For the local CNN, we split an RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs on the subsequences and on the padded sequences, respectively, to learn high-level features. Finally, the outputs of the local and global CNNs are combined to improve the prediction. iDeepE outperforms state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. AVAILABILITY AND IMPLEMENTATION https://github.com/xypan1232/iDeepE. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
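The two input views described in this abstract, a padded full-length sequence for the global CNN and overlapping fixed-length subsequences for the local CNN, can be sketched as follows. This is only an illustration under stated assumptions and not the iDeepE code: the one-hot encoding, the `N` padding character, the right-padding choice and the function names `one_hot`, `pad_global` and `split_local` are all hypothetical.

```python
ALPHABET = "ACGU"


def one_hot(seq):
    """Encode an RNA string as a list of 4-element rows; unknown bases (e.g. N) stay all-zero."""
    rows = []
    for base in seq.upper().replace("T", "U"):
        row = [0.0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows


def pad_global(seq, max_len, pad_char="N"):
    """Global view: truncate or right-pad the sequence to max_len, then encode it."""
    return one_hot(seq[:max_len].ljust(max_len, pad_char))


def split_local(seq, window, stride, pad_char="N"):
    """Local view: overlapping fixed-length windows, each encoded as one channel."""
    if not 0 < stride < window:
        raise ValueError("stride must be positive and smaller than window so windows overlap")
    starts = list(range(0, max(len(seq) - window, 0) + 1, stride))
    # Add one padded window if the regular windows do not reach the end of the sequence.
    if starts[-1] + window < len(seq):
        starts.append(starts[-1] + stride)
    return [one_hot(seq[s:s + window].ljust(window, pad_char)) for s in starts]


if __name__ == "__main__":
    rna = "ACGUACGUACG"
    print(len(pad_global(rna, 16)), "positions in the global view")
    print(len(split_local(rna, window=4, stride=2)), "channels in the local view")
```

In this sketch the stride controls how much neighbouring windows overlap; the window length and stride used by iDeepE itself are not stated in the abstract, so the values above are placeholders.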
Affiliation(s)
- Xiaoyong Pan
  - Department of Medical informatics, Erasmus Medical Center, CE Rotterdam, The Netherlands
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
6
García-Cárdenas JM, Guerrero S, López-Cortés A, Armendáriz-Castillo I, Guevara-Ramírez P, Pérez-Villa A, Yumiceba V, Zambrano AK, Leone PE, Paz-y-Miño C. Post-transcriptional Regulation of Colorectal Cancer: A Focus on RNA-Binding Proteins. Front Mol Biosci 2019; 6:65. [PMID: 31440515 PMCID: PMC6693420 DOI: 10.3389/fmolb.2019.00065] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 05/08/2019] [Accepted: 07/23/2019] [Indexed: 12/24/2022]
Abstract
Colorectal cancer (CRC) is a major health problem with an estimated 1.8 million new cases worldwide. To date, most CRC studies have focused on DNA-related aberrations, leaving post-transcriptional processes under-studied. However, post-transcriptional alterations have been shown to play a significant part in the maintenance of cancer features. RNA binding proteins (RBPs) are emerging as critical regulators of every cancer hallmark, yet little is known regarding the underlying mechanisms and key downstream oncogenic targets. Currently, more than a thousand RBPs have been discovered in humans, but only a few have been implicated in the carcinogenic process, and even fewer in CRC. Identification of cancer-related RBPs is of great interest to better understand CRC biology and potentially unveil new targets for cancer therapy and prognostic biomarkers. In this work, we reviewed all RBPs that have a role in CRC, including their control by microRNAs, xenograft studies and their clinical implications.
Affiliation(s)
- César Paz-y-Miño
  - Facultad de Ciencias de la Salud Eugenio Espejo, Centro de Investigación Genética y Genómica, Universidad UTE, Quito, Ecuador
7
8
Hu W, Qin L, Li M, Pu X, Guo Y. A structural dissection of protein–RNA interactions based on different RNA base areas of interfaces. RSC Adv 2018; 8:10582-10592. [PMID: 35540439 PMCID: PMC9078961 DOI: 10.1039/c8ra00598b] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/20/2018] [Accepted: 03/05/2018] [Indexed: 11/21/2022]
Abstract
Protein–RNA interactions are very common cellular processes, but their mechanisms are not fully understood, mainly because of the complexity of RNA structures. Through a detailed investigation of the RNA structures in protein–RNA complexes, this paper shows for the first time that the RNAs in these complexes can be clearly divided into three classes (high, medium and low) according to their level of Pbase (the percentage of base area buried in the RNA interface). Based on these three RNA classes, protein–RNA interactions were analyzed in more detail from several aspects, including interface area, structure, composition and interaction force, to gain a deeper understanding of the recognition specificity of the three classes. Under this classification, the three complex classes differ significantly in almost all properties. Complexes in the high class have short, extended RNA structures and behave like protein–ssDNA interactions; their hydrogen bonds and hydrophobic interactions are strong. Complexes in the low class have mainly double-stranded RNA structures, like protein–dsDNA interactions, and electrostatic interactions occur frequently. Complexes in the medium class have the longest RNA chains and the largest average interface area, and they show no preference for any interaction force. On average, in terms of composition, secondary structure and intermolecular physicochemical properties, high and low complexes show significant feature preferences, whereas medium complexes show no highly specific features. We found that the proposed Pbase is an important parameter that can serve as a new determinant for distinguishing protein–RNA complexes. The specificity of the recognition process is easier to understand from interface features for high and low complexes than for medium complexes. Future research should therefore focus on medium complexes and analyze their structures from additional feature aspects. Overall, this study may contribute to a more detailed understanding of the mechanism of protein–RNA interactions.
Qualitative and quantitative measurements of the influence of structure and composition of RNA interfaces on protein–RNA interactions.
Affiliation(s)
- Wen Hu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Liu Qin
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Xuemei Pu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Yanzhi Guo
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China