1
Zhu E, Li X, Liu C, Pal NR. Boosting Drug-Disease Association Prediction for Drug Repositioning via Dual-Feature Extraction and Cross-Dual-Domain Decoding. J Chem Inf Model 2025. [PMID: 40278791 DOI: 10.1021/acs.jcim.5c00070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2025]
Abstract
The extraction of biomedical data has significant academic and practical value in contemporary biomedical sciences. In recent years, drug repositioning, a cost-effective strategy for drug development by discovering new indications for approved drugs, has gained increasing attention. However, many existing drug repositioning methods focus on mining information from adjacent nodes in biomedical networks without considering the potential inter-relationships between the feature spaces of drugs and diseases. This can lead to inaccurate encoding and, in turn, to biased drug-disease association information. To address this limitation, we propose a new model called Dual-Feature Drug Repurposing Neural Network (DFDRNN). DFDRNN mines two features (similarity and association) from drug-disease biomedical networks to encode drugs and diseases. A self-attention mechanism is utilized to extract neighbor feature information. DFDRNN incorporates two dual-feature extraction modules: the single-domain dual-feature extraction (SDDFE) module for extracting features within a single domain (drugs or diseases) and the cross-domain dual-feature extraction (CDDFE) module for extracting features across domains. By utilizing these modules, we ensure more appropriate encoding of drugs and diseases. A cross-dual-domain decoder is also designed to predict drug-disease associations in both domains. Our proposed DFDRNN model outperforms six state-of-the-art methods on four benchmark data sets, achieving an average AUROC of 0.946 and an average AUPR of 0.597. Case studies on three diseases show that the proposed DFDRNN model can be applied in real-world scenarios, demonstrating its significant potential in drug repositioning.
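The abstract reports both AUROC and AUPR, and the two can differ widely when known associations are rare. The following is a minimal sketch of how these two metrics are commonly computed from predicted scores; the function names and the toy labels and scores are illustrative assumptions, not taken from the paper, and average precision is used here as one common estimate of AUPR.

```python
def auroc(labels, scores):
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 2 known associations among 10 candidate pairs.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
print(auroc(labels, scores), average_precision(labels, scores))
```

A single negative ranked above a positive barely moves AUROC but visibly lowers average precision, which is one reason an AUPR far below the AUROC is not unusual on sparse association data.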
Affiliation(s)
- Enqiang Zhu
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Xiang Li
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Chanjuan Liu
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Nikhil R Pal
  - Electronics and Communication Sciences Unit, Indian Statistical Institute, Calcutta 700108, India
2
Wang Y, Ding P, Wang C, He S, Gao X, Yu B. RPI-GGCN: Prediction of RNA-Protein Interaction Based on Interpretability Gated Graph Convolution Neural Network and Co-Regularized Variational Autoencoders. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:7681-7695. [PMID: 38709606 DOI: 10.1109/tnnls.2024.3390935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
RNA-protein interactions (RPIs) play an important role in several fundamental cellular physiological processes, including cell motility, chromosome replication, transcription and translation, and signaling. Predicting RPIs can guide the exploration of cellular biological functions, disease intervention, and drug design. To this end, this study proposes the RPI-gated graph convolutional network (RPI-GGCN) method for predicting RPIs based on the gated graph convolutional neural network (GGCN) and co-regularized variational autoencoder (Co-VAE). First, different types of feature information are extracted from RNA and protein sequences by nine feature extraction methods. Second, Co-VAEs are used to eliminate the redundancy of fused features and generate optimal features. Finally, this study introduces gated cyclic units into graph convolutional networks (GCNs) to construct a model for RPI prediction, which efficiently extracts topological information and improves the model's interpretable feature learning and expression capabilities. In the fivefold cross-validation test, the RPI-GGCN method achieved prediction accuracies of 97.27%, 97.32%, 96.54%, 95.76%, and 94.98% on the RPI369, RPI488, RPI1446, RPI1807, and RPI2241 datasets, respectively. To test the generalization performance of the model, we used the model trained on RPI369 to predict the independent NPInter v3.0 dataset and achieved excellent performance on all six independent validation sets. By visualizing the RPI network graph based on the prediction results, we aim to provide a new perspective and reference for studying RPI mechanisms and exploring new RPIs. Extensive experimental results demonstrate that RPI-GGCN can provide an efficient, accurate, and stable RPI prediction method.
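The reported accuracies come from fivefold cross-validation. As a rough illustration of what that evaluation protocol involves, the sketch below splits sample indices into five folds and yields one train/test partition per fold; the function name, the seed, and the sample count are assumptions for illustration, and the abstract does not say how the authors built their folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; each sample lands in exactly one test fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]     # k near-equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f_id, fold in enumerate(folds) if f_id != i for j in fold]
        yield train, test


# Hypothetical usage with 10 samples: train on 8, evaluate on the held-out 2.
for train, test in k_fold_indices(10, k=5):
    print(len(train), len(test))
```

The per-fold accuracies would then be averaged to give a single figure per dataset.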
3
Liu Y, Wu Q, Zhou L, Liu Y, Li C, Wei Z, Peng W, Yue Y, Zhu X. Disentangled similarity graph attention heterogeneous biological memory network for predicting disease-associated miRNAs. BMC Genomics 2024; 25:1161. [PMID: 39623332 PMCID: PMC11610307 DOI: 10.1186/s12864-024-11078-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Accepted: 11/21/2024] [Indexed: 12/06/2024] Open
Abstract
BACKGROUND The association between microRNAs (miRNAs) and diseases is crucial in treating and exploring many diseases or cancers. Although wet-lab methods for predicting miRNA-disease associations (MDAs) are effective, they are often expensive and time-consuming. Significant advancements have been made using Graph Neural Network-based methods (GNN-MDAs) to address these challenges. However, these methods still face limitations, such as not considering nodes' deep-level similarity associations and hierarchical learning patterns. Additionally, current models do not retain the memory of previously learned heterogeneous historical information about miRNAs or diseases, only focusing on parameter learning without the capability to remember heterogeneous associations. RESULTS This study introduces the K-means disentangled high-level biological similarity to fully utilize potential hierarchical relationships and proposes a Graph Attention Heterogeneous Biological Memory Network architecture (DiGAMN) with memory capabilities. Extensive experiments were conducted across four datasets, comparing the DiGAMN model and its disentangling method against ten state-of-the-art non-disentangled methods and six traditional GNNs. DiGAMN excelled, achieving AUC scores of 96.35%, 96.10%, 96.01%, and 95.89% on the Data1 to Data4 datasets, respectively, surpassing all other models. These results confirm the superior performance of DiGAMN and its disentangling method. Additionally, various ablation studies were conducted to validate the contributions of different modules within the framework, and the encoding statuses and memory units of DiGAMN were visualized to explore the utility and functionality of its modules. Case studies confirmed the effectiveness of DiGAMN's predictions, identifying several new disease-associated miRNAs.
CONCLUSIONS DiGAMN introduces the use of a disentangled biological similarity approach for the first time and successfully constructs a Disentangled Graph Attention Heterogeneous Biological Memory Network model. This network can learn disentangled representations of similarity information and effectively store the potential biological entanglement information of miRNAs and diseases. By integrating disentangled similarity information with a heterogeneous attention memory network, DiGAMN enhances the model's ability to capture and utilize complex underlying biological data, significantly outperforming many existing models. The concepts used in this method also provide new perspectives for predicting miRNAs associated with diseases.
Affiliation(s)
- Yinbo Liu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
  - Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Qi Wu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Le Zhou
  - Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Yuchen Liu
  - Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
  - China University of Petroleum, Beijing, Beijing, 102249, China
- Chao Li
  - Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Zhuoyu Wei
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Wei Peng
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Yi Yue
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Xiaolei Zhu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, 230036, China
4
Tian Z, Yu Y, Ni F, Zou Q. Drug-target interaction prediction with collaborative contrastive learning and adaptive self-paced sampling strategy. BMC Biol 2024; 22:216. [PMID: 39334132 PMCID: PMC11437672 DOI: 10.1186/s12915-024-02012-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Accepted: 09/06/2024] [Indexed: 09/30/2024] Open
Abstract
BACKGROUND Drug-target interaction (DTI) prediction plays a pivotal role in drug discovery and drug repositioning, enabling the identification of potential drug candidates. However, most previous approaches do not fully utilize the complementary relationships among multiple biological networks, which limits their ability to learn more consistent representations. Additionally, the selection strategy of negative samples significantly affects the performance of contrastive learning methods. RESULTS In this study, we propose CCL-ASPS, a novel deep learning model that incorporates Collaborative Contrastive Learning (CCL) and Adaptive Self-Paced Sampling strategy (ASPS) for drug-target interaction prediction. CCL-ASPS leverages multiple networks to learn the fused embeddings of drugs and targets, ensuring their consistent representations from individual networks. Furthermore, ASPS dynamically selects more informative negative sample pairs for contrastive learning. Experimental results on the established dataset demonstrate that CCL-ASPS achieves significant improvements compared to current state-of-the-art methods. Moreover, ablation experiments confirm the contributions of the proposed CCL and ASPS strategies. CONCLUSIONS By integrating Collaborative Contrastive Learning and Adaptive Self-Paced Sampling, the proposed CCL-ASPS effectively addresses the limitations of previous methods. This study demonstrates that CCL-ASPS achieves notable improvements in DTI predictive performance compared to current state-of-the-art approaches. The case study and cold start experiments further illustrate the capability of CCL-ASPS to effectively predict previously unknown DTIs, potentially facilitating the identification of new drug-target interactions.
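To make the role of negative-sample selection in contrastive learning more concrete, here is a minimal sketch of a generic InfoNCE-style loss combined with a simple "pick the most similar negatives" rule. This is a stand-in technique chosen for illustration, not the ASPS strategy itself, whose details the abstract does not give; the function names, the temperature value, and the toy vectors are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss: low when the positive is closer than every negative."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def hardest_negatives(anchor, candidates, k):
    """Keep the k candidates most similar to the anchor (the most informative ones)."""
    return sorted(candidates, key=lambda c: cosine(anchor, c), reverse=True)[:k]


# Hypothetical 2-D embeddings.
anchor, positive = [1.0, 0.0], [1.0, 0.1]
candidates = [[0.9, 0.2], [0.0, 1.0], [-1.0, 0.0]]
hard = hardest_negatives(anchor, candidates, k=1)
print(info_nce(anchor, positive, hard), info_nce(anchor, positive, [[-1.0, 0.0]]))
```

An easy negative contributes almost nothing to the loss, while a hard one produces a much larger training signal, which is the general motivation for selecting negatives rather than sampling them uniformly.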
Affiliation(s)
- Zhen Tian
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, Henan, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Yue Yu
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, Henan, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Fengming Ni
  - Department of Gastroenterology, The First Hospital of Jilin University, Changchun, 130021, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
5
Wen S, Liu Y, Yang G, Chen W, Wu H, Zhu X, Wang Y. A method for miRNA diffusion association prediction using machine learning decoding of multi-level heterogeneous graph Transformer encoded representations. Sci Rep 2024; 14:20490. [PMID: 39227405 PMCID: PMC11371806 DOI: 10.1038/s41598-024-68897-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 07/29/2024] [Indexed: 09/05/2024] Open
Abstract
MicroRNAs (miRNAs) are a key class of endogenous non-coding RNAs that play a pivotal role in regulating diseases. Accurately predicting the intricate relationships between miRNAs and diseases carries profound implications for disease diagnosis, treatment, and prevention. However, these prediction tasks are highly challenging due to the complexity of the underlying relationships. While numerous effective prediction models exist for validating these associations, they often encounter information distortion due to limitations in efficiently retaining information during the encoding-decoding process. Inspired by the multi-layer Heterogeneous Graph Transformer and the XGBoost machine learning classifier, this study introduces a novel computational approach based on a multi-layer heterogeneous encoder-machine learning decoder structure for miRNA-disease association prediction (MHXGMDA). First, we employ the multi-view similarity matrices as the input encoding for MHXGMDA. Subsequently, we utilize the multi-layer heterogeneous encoder to learn the embeddings of miRNAs and diseases, aiming to capture the maximum amount of relevant features. Finally, the information from all layers is concatenated to serve as input to the machine learning classifier, ensuring maximal preservation of encoding details. We conducted a comprehensive comparison of seven different classifier models and ultimately selected the XGBoost algorithm as the decoder. This algorithm leverages miRNA embedding features and disease embedding features to decode and predict the association scores between miRNAs and diseases. We applied MHXGMDA to predict human miRNA-disease associations on two benchmark datasets. Experimental findings demonstrate that our approach surpasses several leading methods in terms of both the area under the receiver operating characteristic curve and the area under the precision-recall curve.
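The step in which "the information from all layers is concatenated" to feed the classifier can be sketched as follows. This is an illustrative assumption about the data layout (one embedding table per encoder layer, indexed by node), with hypothetical names and toy values; the downstream classifier, XGBoost in the paper, is left out.

```python
def pair_features(mirna_layers, disease_layers, mirna_id, disease_id):
    """Concatenate one miRNA's and one disease's embeddings from every encoder layer."""
    features = []
    for layer in mirna_layers:          # layer: list of per-miRNA embedding vectors
        features.extend(layer[mirna_id])
    for layer in disease_layers:        # layer: list of per-disease embedding vectors
        features.extend(layer[disease_id])
    return features


# Hypothetical embeddings: 2 encoder layers, 2 miRNAs, 1 disease.
mirna_layers = [[[0.1, 0.2], [0.3, 0.4]], [[0.5], [0.6]]]
disease_layers = [[[1.0, 2.0]], [[3.0]]]
print(pair_features(mirna_layers, disease_layers, mirna_id=1, disease_id=0))
```

Each resulting vector would be one training row for the decoder, labeled by whether that miRNA-disease pair is a known association.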
Affiliation(s)
- SiJian Wen
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- YinBo Liu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Guang Yang
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- WenXi Chen
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- HaiTao Wu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- XiaoLei Zhu
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- YongMei Wang
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
  - Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Hefei, 230036, China