1
Park JH, Cho YR. DRAW+: network-based computational drug repositioning with attention walking and noise filtering. Health Inf Sci Syst 2025; 13:14. [PMID: 39764174 PMCID: PMC11700073 DOI: 10.1007/s13755-024-00326-2] [Received: 09/29/2024] [Accepted: 12/11/2024] [Indexed: 02/02/2025] [Open access]
Abstract
Purpose: Drug repositioning, a strategy that repurposes already-approved drugs for novel therapeutic applications, provides a faster and more cost-effective alternative to traditional drug discovery. Network-based models have been adopted by many computational methodologies, especially those that use graph neural networks to predict drug-disease associations. However, these techniques frequently overlook the quality of the input network, which is a critical factor for achieving accurate predictions.
Methods: We present a novel network-based framework for drug repositioning, named DRAW+, which incorporates noise filtering and feature extraction using graph neural networks and attention mechanisms. The proposed model first constructs a heterogeneous network that integrates the drug-disease association network with the similarity networks of drugs and diseases, which are upgraded through reduced-rank singular value decomposition. Next, a subgraph surrounding the targeted drug-disease node pair is extracted, allowing the model to focus on local structures. Graph neural networks are then applied to extract structural representation, followed by attention walking to capture key features of the subgraph. Finally, a multi-layer perceptron classifies the subgraph as positive or negative, which indicates the presence of the link between the target node pair.
Results: Experimental validation across three benchmark datasets showed that DRAW+ outperformed seven state-of-the-art methods, achieving the highest average AUROC and AUPRC, 0.963 and 0.564, respectively. Moreover, DRAW+ demonstrated its robustness by achieving the best performance across two additional datasets, further confirming its generalizability and effectiveness in diverse settings.
Conclusions: The proposed network-based computational approach, DRAW+, demonstrates exceptional accuracy and robustness, confirming its effectiveness in drug repositioning tasks.
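The noise-filtering step described in this abstract, a reduced-rank singular value decomposition of a similarity network, can be illustrated with a minimal sketch. The abstract does not state the rank DRAW+ uses or any pre- or post-processing, so the function name `low_rank_denoise`, the rank of 2, and the synthetic matrix below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def low_rank_denoise(sim: np.ndarray, rank: int) -> np.ndarray:
    """Return the best rank-`rank` approximation of a similarity matrix.

    Hypothetical helper: keeps the `rank` largest singular values and
    discards the rest, which is where small-magnitude noise tends to live.
    """
    u, s, vt = np.linalg.svd(sim, full_matrices=False)
    # Scale the kept left singular vectors by their singular values,
    # then project back with the kept right singular vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Synthetic example (assumed data): a rank-2 similarity matrix plus noise.
rng = np.random.default_rng(0)
base = rng.random((6, 2))
clean = base @ base.T
noise = 0.05 * rng.standard_normal(clean.shape)
noisy = clean + (noise + noise.T) / 2.0  # keep the matrix symmetric

denoised = low_rank_denoise(noisy, rank=2)
print(np.linalg.norm(noisy - denoised))  # energy removed by the truncation
```

By the Eckart-Young theorem, the Frobenius norm of the removed part equals the root sum of squares of the discarded singular values, so the choice of rank directly controls how much of the input is treated as noise.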
Affiliation(s)
- Jong-Hoon Park
  - Division of Software, Yonsei University, Mirae Campus, Yeonsedae-gil 1, Wonju-si, Gangwon-do 26493, Korea
- Young-Rae Cho
  - Division of Software, Yonsei University, Mirae Campus, Yeonsedae-gil 1, Wonju-si, Gangwon-do 26493, Korea
  - Division of Digital Healthcare, Yonsei University, Mirae Campus, Yeonsedae-gil 1, Wonju-si, Gangwon-do 26493, Korea
2
Wang G, Chen H, Wang H, Fu Y, Shi C, Cao C, Hu X. Heterogeneous Graph Contrastive Learning with Graph Diffusion for Drug Repositioning. J Chem Inf Model 2025. [PMID: 40377926 DOI: 10.1021/acs.jcim.5c00435] [Indexed: 05/18/2025]
Abstract
Drug repositioning, which identifies novel therapeutic applications for existing drugs, offers a cost-effective alternative to traditional drug development. However, effectively capturing the complex relationships between drugs and diseases remains challenging. We present HGCL-DR, a novel heterogeneous graph contrastive learning framework for drug repositioning that effectively integrates global and local feature representations through three key components. First, we introduce an improved heterogeneous graph contrastive learning approach to model drug-disease relationships. Second, for local feature extraction, we employ a bidirectional graph convolutional network with a subgraph generation strategy in the bipartite drug-disease association graph, while utilizing a graph diffusion process to capture long-range dependencies in drug-drug and disease-disease relation graphs. Third, for global feature extraction, we leverage contrastive learning in the heterogeneous graph to enhance embedding consistency across different feature spaces. Extensive experiments on four benchmark data sets using 10-fold cross-validation demonstrate that HGCL-DR consistently outperforms state-of-the-art baselines in AUPR, AUROC, and F1-score. Ablation studies confirm the significance of each proposed component, while case studies on Alzheimer's disease and breast neoplasms validate HGCL-DR's practical utility in identifying novel drug candidates. These results establish HGCL-DR as an effective approach for computational drug repositioning.
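To make the idea of a graph diffusion process capturing long-range dependencies concrete, the following is a minimal sketch using personalized PageRank diffusion in closed form. The abstract does not say which diffusion kernel HGCL-DR uses, so the choice of personalized PageRank, the function name `ppr_diffusion`, the teleport probability `alpha`, and the toy path graph are all assumptions.

```python
import numpy as np


def ppr_diffusion(adj: np.ndarray, alpha: float = 0.15) -> np.ndarray:
    """Closed-form personalized PageRank diffusion of an undirected graph.

    Hypothetical helper: computes alpha * (I - (1 - alpha) * T)^-1, where T
    is the column-stochastic transition matrix of the graph.
    """
    deg = adj.sum(axis=0)
    deg[deg == 0] = 1.0  # avoid division by zero for isolated nodes
    trans = adj / deg    # divides column j by deg[j]
    n = adj.shape[0]
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * trans)


# Toy similarity graph (assumed): a path 0 - 1 - 2 - 3.
adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)
diffused = ppr_diffusion(adj)
print(diffused[0, 3])  # non-zero weight between nodes that share no edge
```

Nodes 0 and 3 are three hops apart in the input graph, yet the diffused matrix assigns them a positive weight, which is the sense in which diffusion exposes long-range structure to a downstream model.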
Affiliation(s)
- Guishen Wang
  - School of Computer Science and Engineering, Changchun University of Technology, North Yuanda Street No. 3000, Changchun 130012, Jilin, China
- Honghan Chen
  - School of Computer Science and Engineering, Changchun University of Technology, North Yuanda Street No. 3000, Changchun 130012, Jilin, China
- Handan Wang
  - School of Computer Science and Engineering, Changchun University of Technology, North Yuanda Street No. 3000, Changchun 130012, Jilin, China
- Yuyouqiang Fu
  - School of Computer Science and Engineering, Changchun University of Technology, North Yuanda Street No. 3000, Changchun 130012, Jilin, China
- Caiye Shi
  - School of Computer Science and Engineering, Changchun University of Technology, North Yuanda Street No. 3000, Changchun 130012, Jilin, China
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Longmian Avenue No. 101, Nanjing 211166, Jiangsu, China
- Xiaowen Hu
  - School of Biomedical Engineering and Informatics, Nanjing Medical University, Longmian Avenue No. 101, Nanjing 211166, Jiangsu, China
3
He F, Duan L, Xing G, Chang X, Zhou H, Yu M. AMFGNN: an adaptive multi-view fusion graph neural network model for drug prediction. Front Pharmacol 2025; 16:1543966. [PMID: 40356971 PMCID: PMC12066569 DOI: 10.3389/fphar.2025.1543966] [Received: 12/12/2024] [Accepted: 04/15/2025] [Indexed: 05/15/2025] [Open access]
Abstract
Introduction: Drug development is a complex and lengthy process, and drug-disease association prediction aims to significantly improve research efficiency and success rates by precisely identifying potential associations. However, existing methods for drug-disease association prediction still face limitations in feature representation, feature integration, and generalization capabilities.
Methods: To address these challenges, we propose a novel model named AMFGNN (Adaptive Multi-View Fusion Graph Neural Network). This model leverages an adaptive graph neural network and a graph attention network to extract drug features and disease features, respectively. These features are then used as the initial representations of nodes in the drug-disease association network to enable efficient information fusion. Additionally, the model incorporates a contrastive learning mechanism, which enhances the similarity and differentiation between drugs and diseases through cross-view contrastive learning, thereby improving the accuracy of association prediction. Furthermore, a Kolmogorov-Arnold network is employed to perform weighted fusion of various final features, optimizing prediction performance.
Results: AMFGNN demonstrates a significant advantage in predictive performance, achieving an average AUC value of 0.9453, which reflects the model's high accuracy in prediction.
Discussion: Cross-validation results across multiple datasets indicate that AMFGNN outperforms seven advanced drug-disease association prediction methods. Additionally, case studies on hepatoblastoma, asthma and Alzheimer's disease further confirm the model's effectiveness and potential value in real-world applications.
Affiliation(s)
- Fang He
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - Department of Child Growth and Development Clinic, The Seventh Medical Center of PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
- Lian Duan
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
- Guodong Xing
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
- Xiaojing Chang
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
- Huixia Zhou
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
- Mengnan Yu
  - Faculty of Pediatrics, The Chinese PLA General Hospital, Beijing, China
  - National Engineering Laboratory for Birth Defects Prevention and Control of Key Technology, Beijing, China
  - Beijing Key Laboratory of Pediatric Organ Failure, Beijing, China
  - Department of Pediatric Surgery, The Seventh Medical Center of PLA General Hospital, Beijing, China
4
Zhu E, Li X, Liu C, Pal NR. Boosting Drug-Disease Association Prediction for Drug Repositioning via Dual-Feature Extraction and Cross-Dual-Domain Decoding. J Chem Inf Model 2025. [PMID: 40278791 DOI: 10.1021/acs.jcim.5c00070] [Indexed: 04/26/2025]
Abstract
The extraction of biomedical data has significant academic and practical value in contemporary biomedical sciences. In recent years, drug repositioning, a cost-effective strategy for drug development by discovering new indications for approved drugs, has gained increasing attention. However, many existing drug repositioning methods focus on mining information from adjacent nodes in biomedical networks without considering the potential inter-relationships between the feature spaces of drugs and diseases. This can lead to inaccurate encoding, resulting in biased mined drug-disease association information. To address this limitation, we propose a new model called Dual-Feature Drug Repurposing Neural Network (DFDRNN). DFDRNN allows the mining of two features (similarity and association) from the drug-disease biomedical networks to encode drugs and diseases. A self-attention mechanism is utilized to extract neighbor feature information. It incorporates two dual-feature extraction modules: the single-domain dual-feature extraction (SDDFE) module for extracting features within a single domain (drugs or diseases) and the cross-domain dual-feature extraction (CDDFE) module for extracting features across domains. By utilizing these modules, we ensure more appropriate encoding of drugs and diseases. A cross-dual-domain decoder is also designed to predict drug-disease associations in both domains. Our proposed DFDRNN model outperforms six state-of-the-art methods on four benchmark data sets, achieving an average AUROC of 0.946 and an average AUPR of 0.597. Case studies on three diseases show that the proposed DFDRNN model can be applied in real-world scenarios, demonstrating its significant potential in drug repositioning.
Affiliation(s)
- Enqiang Zhu
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Xiang Li
  - Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Chanjuan Liu
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Nikhil R Pal
  - Electronics and Communication Sciences Unit, Indian Statistical Institute, Calcutta 700108, India
5
Gan Y, Li S, Xu G, Yan C, Zou G. Multidependency Graph Convolutional Networks and Contrastive Learning for Drug Repositioning. J Chem Inf Model 2025; 65:3090-3103. [PMID: 40071716 DOI: 10.1021/acs.jcim.4c02424] [Indexed: 03/25/2025]
Abstract
The goal of drug repositioning is to expedite the drug development process by finding novel therapeutic applications for approved drugs. Using multifeature learning, different computational drug repositioning techniques have recently been introduced to predict possible drug-disease relationships. Nevertheless, current graph-based methods tend to model drug-disease interaction relationships without considering the semantic influence of node-specific side information on graphs. These approaches also suffer from the noise and sparsity inherent in the data. To address these limitations, we propose MDGCN, a novel drug repositioning method that incorporates multidependency graph convolutional networks and contrastive learning. Based on drug and disease similarity matrices and the drug-disease relationships matrix, this approach constructs multidependency graphs. It subsequently employs graph convolutional networks to spread side information between various graphs in each layer. Meanwhile, the weak supervision of drug-disease connections is effectively addressed by introducing cross-view and cross-layer contrastive learning to align node embedding across various views. Extensive experiments show that MDGCN performs better in drug-disease association prediction than seven advanced methods, offering strong support for investigating novel therapeutic indications for medications of interest.
Affiliation(s)
- Yanglan Gan
  - School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Shengnan Li
  - School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Guangwei Xu
  - School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Cairong Yan
  - School of Computer Science and Technology, Donghua University, Shanghai 201620, China
- Guobing Zou
  - School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
6
Jia X, Sun X, Wang K, Li M. DRGCL: Drug Repositioning via Semantic-Enriched Graph Contrastive Learning. IEEE J Biomed Health Inform 2025; 29:1656-1667. [PMID: 38437145 DOI: 10.1109/jbhi.2024.3372527] [Indexed: 03/06/2024]
Abstract
Drug repositioning greatly reduces drug development costs and time by discovering new indications for existing drugs. With the development of technology and large-scale biological databases, computational drug repositioning, which can narrow down repositioning candidates, has attracted increasing attention. Recently, graph neural networks (GNNs) have been widely used and have achieved promising results in drug repositioning. However, existing GNN-based methods usually focus on modeling the complex drug-disease association graph but ignore the semantic information on the graph, which may lead to inconsistency between the global topology information and the local semantic information in the learned features. To alleviate this challenge, we propose a novel drug repositioning model based on graph contrastive learning, termed DRGCL. First, we treat the known drug-disease associations as the topology graph. Second, we select the top-k similar neighbors from drug/disease similarity information to construct the semantic graph rather than use the traditional data augmentation strategy, thereby maximally retaining rich semantic information. Finally, we enforce embedding consistency across the different embedding spaces by graph contrastive learning to enhance the topology and semantic features on the graph. We have evaluated DRGCL on four benchmark datasets, and the experimental results show that the proposed DRGCL is superior to the state-of-the-art methods. In particular, the average result of DRGCL is 11.92% higher than that of the second-best method in terms of AUPRC. The case studies further demonstrate the reliability of DRGCL.
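The semantic-graph construction described here, keeping only each node's most similar neighbors, can be sketched in a few lines. The abstract gives neither the neighbor count nor the tie-breaking rule, so the function name `top_k_neighbors`, the value `k=2`, the tie-break by node index, and the toy similarity matrix are assumptions for illustration only.

```python
def top_k_neighbors(sim, k):
    """For each node, keep the k most similar other nodes.

    Hypothetical helper: `sim` is a square list-of-lists similarity matrix;
    self-similarity is ignored and ties are broken by the smaller node index.
    """
    edges = {}
    for i, row in enumerate(sim):
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        # Sort by descending similarity, then ascending index for stable ties.
        candidates.sort(key=lambda pair: (-pair[0], pair[1]))
        edges[i] = [j for _, j in candidates[:k]]
    return edges


# Toy drug-drug similarity matrix (assumed values).
sim = [
    [1.0, 0.8, 0.1, 0.3],
    [0.8, 1.0, 0.2, 0.6],
    [0.1, 0.2, 1.0, 0.7],
    [0.3, 0.6, 0.7, 1.0],
]
graph = top_k_neighbors(sim, k=2)
print(graph)  # {0: [1, 3], 1: [0, 3], 2: [3, 1], 3: [2, 1]}
```

Note that the resulting graph is directed: node 1 is among node 2's two nearest neighbors, but node 2 is not among node 1's. Whether to symmetrize it is a design choice the abstract does not address.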
7
Tang X, Zhou C, Lu C, Meng Y, Xu J, Hu X, Tian G, Yang J. Enhancing Drug Repositioning Through Local Interactive Learning With Bilinear Attention Networks. IEEE J Biomed Health Inform 2025; 29:1644-1655. [PMID: 37988217 DOI: 10.1109/jbhi.2023.3335275] [Indexed: 11/23/2023]
Abstract
Drug repositioning has emerged as a promising strategy for identifying new therapeutic applications for existing drugs. In this study, we present DRGBCN, a novel computational method that integrates heterogeneous information through a deep bilinear attention network to infer potential drugs for specific diseases. DRGBCN involves constructing a comprehensive drug-disease network by incorporating multiple similarity networks for drugs and diseases. Firstly, we introduce a layer attention mechanism to effectively learn the embeddings of graph convolutional layers from these networks. Subsequently, a bilinear attention network is constructed to capture pairwise local interactions between drugs and diseases. This combined approach enhances the accuracy and reliability of predictions. Finally, a multi-layer perceptron module is employed to evaluate potential drugs. Through extensive experiments on three publicly available datasets, DRGBCN demonstrates better performance over baseline methods in 10-fold cross-validation, achieving an average area under the receiver operating characteristic curve (AUROC) of 0.9399. Furthermore, case studies on bladder cancer and acute lymphoblastic leukemia confirm the practical application of DRGBCN in real-world drug repositioning scenarios. Importantly, our experimental results from the drug-disease network analysis reveal the successful clustering of similar drugs within the same community, providing valuable insights into drug-disease interactions. In conclusion, DRGBCN holds significant promise for uncovering new therapeutic applications of existing drugs, thereby contributing to the advancement of precision medicine.
8
Liu Q, Chen Z, Wang B, Pan B, Zhang Z, Shen M, Zhao W, Zhang T, Li S, Liu L. Leveraging Network Target Theory for Efficient Prediction of Drug-Disease Interactions: A Transfer Learning Approach. Adv Sci (Weinh) 2025; 12:e2409130. [PMID: 39874191 PMCID: PMC11923905 DOI: 10.1002/advs.202409130] [Received: 08/04/2024] [Revised: 12/22/2024] [Indexed: 01/30/2025]
Abstract
Efficient virtual screening methods can expedite drug discovery and facilitate the development of innovative therapeutics. This study presents a novel transfer learning model based on network target theory, integrating deep learning techniques with diverse biological molecular networks to predict drug-disease interactions. By incorporating network techniques that leverage vast existing knowledge, the approach enables the extraction of more precise and informative drug features, resulting in the identification of 88,161 drug-disease interactions involving 7,940 drugs and 2,986 diseases. Furthermore, this model effectively addresses the challenge of balancing large-scale positive and negative samples, leading to improved performance across various evaluation metrics such as an Area under curve (AUC) of 0.9298 and an F1 score of 0.6316. Moreover, the algorithm accurately predicts drug combinations and achieves an F1 score of 0.7746 after fine-tuning. Additionally, it identifies two previously unexplored synergistic drug combinations for distinct cancer types in disease-specific biological network environments. These findings are further validated through in vitro cytotoxicity assays, demonstrating the potential of the model to enhance drug development and identify effective treatment regimens for specific diseases.
Affiliation(s)
- Qingyuan Liu
  - Department of Molecular Pharmacology, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, Tianjin's Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
  - Institute for TCM-X, Department of Automation, Tsinghua University, Beijing 100084, China
- Zizhen Chen
  - Department of Molecular Pharmacology, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, Tianjin's Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
- Boyang Wang
  - Institute for TCM-X, Department of Automation, Tsinghua University, Beijing 100084, China
- Boyu Pan
  - Department of Molecular Pharmacology, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, Tianjin's Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
- Zhuoyu Zhang
  - Institute for TCM-X, Department of Automation, Tsinghua University, Beijing 100084, China
- Miaomiao Shen
  - Department of Molecular Pharmacology, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, Tianjin's Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
- Weibo Zhao
  - Institute for TCM-X, Department of Automation, Tsinghua University, Beijing 100084, China
- Tingyu Zhang
  - Institute for TCM-X, Department of Automation, Tsinghua University, Beijing 100084, China
- Shao Li
  - Institute for TCM-X, Department of Automation, Tsinghua University, Beijing 100084, China
  - Henan Academy of Sciences, Henan 450046, China
- Liren Liu
  - Department of Molecular Pharmacology, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, Tianjin's Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
9
Réda C, Vie JJ, Wolkenhauer O. Joint embedding-classifier learning for interpretable collaborative filtering. BMC Bioinformatics 2025; 26:26. [PMID: 39844056 PMCID: PMC11755841 DOI: 10.1186/s12859-024-06026-8] [Received: 10/10/2024] [Accepted: 12/27/2024] [Indexed: 01/24/2025] [Open access]
Abstract
Background: Interpretability is a topical question in recommender systems, especially in healthcare applications. An interpretable classifier quantifies the importance of each input feature for the predicted item-user association in a non-ambiguous fashion.
Results: We introduce the novel Joint Embedding Learning-classifier for improved Interpretability (JELI). By combining the training of a structured collaborative-filtering classifier and an embedding learning task, JELI predicts new user-item associations based on jointly learned item and user embeddings while providing feature-wise importance scores. Therefore, JELI flexibly allows the introduction of priors on the connections between users, items, and features. In particular, JELI simultaneously (a) learns feature, item, and user embeddings; (b) predicts new item-user associations; (c) provides importance scores for each feature. Moreover, JELI instantiates a generic approach to training recommender systems by encoding generic graph-regularization constraints.
Conclusions: First, we show that the joint training approach yields a gain in the predictive power of the downstream classifier. Second, JELI can recover feature-association dependencies. Finally, JELI induces a restriction in the number of parameters compared to baselines in synthetic and drug-repurposing data sets.
Affiliation(s)
- Clémence Réda
  - Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
- Olaf Wolkenhauer
  - Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
  - Leibniz-Institute for Food Systems Biology, 85354, Freising, Germany
  - Stellenbosch Institute of Advanced Study, Wallenberg Research Centre, Stellenbosch, 7602, South Africa
10
Réda C, Vie JJ, Wolkenhauer O. Comprehensive evaluation of pure and hybrid collaborative filtering in drug repurposing. Sci Rep 2025; 15:2711. [PMID: 39837888 PMCID: PMC11751339 DOI: 10.1038/s41598-025-85927-x] [Received: 06/27/2024] [Accepted: 01/07/2025] [Indexed: 01/23/2025] [Open access]
Abstract
Drug development is known to be a costly and time-consuming process, which is prone to high failure rates. Drug repurposing allows drug discovery by reusing already approved compounds. The outcomes of past clinical trials can be used to predict novel drug-disease associations by leveraging drug- and disease-related similarities. To tackle this classification problem, collaborative filtering with implicit feedback (and potentially additional data on drugs and diseases) has become popular. It can handle large imbalances between negative and positive known associations and known and unknown associations. However, properly evaluating the improvement over the state of the art is challenging, as there is no consensus approach to compare models. We propose a reproducible methodology for comparing collaborative filtering-based drug repurposing. We illustrate this method by comparing 11 models from the literature on eight diverse drug repurposing datasets. Based on this benchmark, we derive guidelines to ensure a fair and comprehensive evaluation of the performance of those models. In particular, an uncontrolled bias on unknown associations might lead to severe data leakage and a misestimation of the model's true performance. Moreover, in drug repurposing, the ability of a model to extrapolate beyond its training distribution is crucial and should also be assessed. Finally, we identified a subcategory of collaborative filtering that seems efficient and robust to distribution shifts. Benchmarks constitute an essential step towards increased reproducibility and more accessible development of competitive drug repurposing methods.
Affiliation(s)
- Clémence Réda
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, 18051, Germany
- Olaf Wolkenhauer
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, 18051, Germany
  - Leibniz-Institute for Food Systems Biology, Freising, 85354, Germany
  - Stellenbosch Institute of Advanced Study, Wallenberg Research Centre, Stellenbosch, 7602, South Africa
11
Tang X, Hou Y, Meng Y, Wang Z, Lu C, Lv J, Hu X, Xu J, Yang J. CDPMF-DDA: contrastive deep probabilistic matrix factorization for drug-disease association prediction. BMC Bioinformatics 2025; 26:5. [PMID: 39773275 PMCID: PMC11708303 DOI: 10.1186/s12859-024-06032-w] [Received: 10/30/2024] [Accepted: 12/27/2024] [Indexed: 01/11/2025] [Open access]
Abstract
The process of new drug development is complex, whereas drug-disease association (DDA) prediction aims to identify new therapeutic uses for existing medications. However, existing graph contrastive learning approaches typically rely on single-view contrastive learning, which struggles to fully capture drug-disease relationships. To address this, we introduce a novel multi-view contrastive learning framework, named CDPMF-DDA, which enhances the model's ability to capture drug-disease associations by incorporating diverse information representations from different views. First, we decompose the original drug-disease association matrix into drug and disease feature matrices, which are then used to reconstruct the drug-disease association network, as well as the drug-drug and disease-disease similarity networks. This process effectively reduces noise in the data, establishing a reliable foundation for the networks produced. Next, we generate multiple contrastive views from both the original and generated networks. These views effectively capture hidden feature associations, significantly enhancing the model's ability to represent complex relationships. Extensive cross-validation experiments on three standard datasets show that CDPMF-DDA achieves an average AUC of 0.9475 and an AUPR of 0.5009, outperforming existing models. Additionally, case studies on Alzheimer's disease and epilepsy further validate the model's effectiveness, demonstrating its high accuracy and robustness in drug-disease association prediction. Based on a multi-view contrastive learning framework, CDPMF-DDA is capable of integrating multi-source information and effectively capturing complex drug-disease associations, making it a powerful tool for drug repositioning and the discovery of new therapeutic strategies.
Affiliation(s)
- Xianfang Tang
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Yawen Hou
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Yajie Meng
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Zhaojing Wang
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Changcheng Lu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Juan Lv
  - College of Traditional Chinese Medicine, Changsha Medical University, Changsha, 410000, China
- Xinrong Hu
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Junlin Xu
  - School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, Hubei, China
12
Van Norden M, Mangione W, Falls Z, Samudrala R. Strategies for robust, accurate, and generalizable benchmarking of drug discovery platforms. bioRxiv 2024:2024.12.10.627863. [PMID: 39764006 PMCID: PMC11702551 DOI: 10.1101/2024.12.10.627863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2025]
Abstract
Benchmarking is an important step in the improvement, assessment, and comparison of the performance of drug discovery platforms and technologies. We revised the existing benchmarking protocols in our Computational Analysis of Novel Drug Opportunities (CANDO) multiscale therapeutic discovery platform to improve utility and performance. We optimized multiple parameters used in drug candidate prediction and assessment with these updated benchmarking protocols. CANDO ranked 7.4% of known drugs in the top 10 compounds for their respective diseases/indications based on drug-indication associations/mappings obtained from the Comparative Toxicogenomics Database (CTD) using these optimized parameters. This increased to 12.1% when drug-indication mappings were obtained from the Therapeutic Targets Database. Performance on an indication was weakly correlated (Spearman correlation coefficient >0.3) with indication size (number of drugs associated with an indication) and moderately correlated (correlation coefficient >0.5) with compound chemical similarity. There was also moderate correlation between our new and original benchmarking protocols when assessing performance per indication using each protocol. Benchmarking results were also dependent on the source of the drug-indication mapping used: a higher proportion of indication-associated drugs were recalled in the top 100 compounds when using the Therapeutic Targets Database (TTD), which only includes FDA-approved drug-indication associations (in contrast to the CTD, which includes associations drawn from the literature). We also created compbench, a publicly available head-to-head benchmarking protocol that allows consistent assessment and comparison of different drug discovery platforms. 
Using this protocol, we compared two pipelines for drug repurposing within CANDO; our primary pipeline outperformed another similarity-based pipeline still in development that clusters signatures based on their associated Gene Ontology terms. Our study sets a precedent for the complete, comprehensive, and comparable benchmarking of drug discovery platforms, resulting in more accurate drug candidate predictions.
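The headline metric in this abstract (the share of known drugs ranked in the top 10 compounds for their indication) can be sketched as a top-k recall. This is a simplified illustration, not the CANDO benchmarking protocol itself; the function names and the toy rankings are hypothetical.

```python
def top_k_recall(ranked_compounds, known_drugs, k=10):
    """Fraction of an indication's known drugs found among the first k ranked compounds."""
    known = set(known_drugs)
    if not known:
        return 0.0
    return len(set(ranked_compounds[:k]) & known) / len(known)


def mean_top_k_recall(rankings, associations, k=10):
    """Average the per-indication recall over indications that have known drugs."""
    scores = [
        top_k_recall(rankings[indication], drugs, k)
        for indication, drugs in associations.items()
        if drugs and indication in rankings
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical rankings (best first) and drug-indication mappings
rankings = {
    "indication_a": ["d1", "d2", "d3", "d4"],
    "indication_b": ["d3", "d1", "d2", "d4"],
}
associations = {"indication_a": ["d1", "d4"], "indication_b": ["d2"]}
score = mean_top_k_recall(rankings, associations, k=2)
```

Because the result depends on which drug-indication mapping is supplied as `associations`, the same rankings can score differently under different mapping sources, which is consistent with the CTD versus TTD difference reported above.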
Affiliation(s)
- Melissa Van Norden
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- William Mangione
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- Zackary Falls
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- Ram Samudrala
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
13
Liu T, Wang S, Zhang Y, Li Y, Liu Y, Huang S. TIWMFLP: Two-Tier Interactive Weighted Matrix Factorization and Label Propagation Based on Similarity Matrix Fusion for Drug-Disease Association Prediction. J Chem Inf Model 2024; 64:8641-8654. [PMID: 39486090 DOI: 10.1021/acs.jcim.4c01589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2024]
Abstract
Accurately identifying new therapeutic uses for drugs is crucial for advancing pharmaceutical research and development. Matrix factorization is often used in association prediction due to its simplicity and high interpretability. However, existing matrix factorization models do not enable real-time interaction between molecular feature matrices and similarity matrices, nor do they consider the geometric structure of the matrices. Additionally, efficiently integrating multisource data remains a significant challenge. To address these issues, we propose a two-tier interactive weighted matrix factorization and label propagation model based on similarity matrix fusion (TIWMFLP) to assist in personalized treatment. First, we calculate the Gaussian and Laplace kernel similarities for drugs and diseases using known drug-disease associations. We then introduce a new multisource similarity fusion method, called similarity matrix fusion (SMF), to integrate these drug/disease similarities. SMF not only considers the different contributions represented by each neighbor but also incorporates drug-disease association information to enhance the contextual topological relationships and potential features of each drug/disease node in the network. Second, we innovatively developed a two-tier interactive weighted matrix factorization (TIWMF) method to process three biological networks. This method realizes for the first time the real-time interaction between the drug/disease feature matrix and its similarity matrix, allowing for a better capture of the complex relationships between drugs and diseases. Additionally, the weighted matrix of the drug/disease similarity matrix is introduced to preserve the underlying structure of the similarity matrix. Finally, the label propagation algorithm makes predictions based on the three updated biological networks. 
Experimental outcomes reveal that TIWMFLP consistently surpasses state-of-the-art models on four drug-disease data sets, two small molecule-miRNA data sets, and one miRNA-disease data set.
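The first step of this abstract, computing a Gaussian kernel similarity from known associations, can be sketched as follows. The bandwidth normalisation by the mean squared profile norm is a commonly used convention and is an assumption here, not a detail confirmed by the abstract; the function name is hypothetical.

```python
import math


def gaussian_kernel_similarity(profiles, bandwidth=1.0):
    """Pairwise Gaussian kernel similarity between association profiles.

    Each profile is one row of the drug-disease matrix (or one column, for diseases).
    Assumption: gamma = bandwidth / mean squared norm of the profiles.
    """
    norms = [sum(x * x for x in row) for row in profiles]
    mean_norm = sum(norms) / len(norms)
    gamma = bandwidth / mean_norm if mean_norm > 0 else bandwidth
    n = len(profiles)
    similarity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            squared_distance = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            similarity[i][j] = math.exp(-gamma * squared_distance)
    return similarity


# Toy example: three drugs described by their associations with three diseases
drug_profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
drug_similarity = gaussian_kernel_similarity(drug_profiles)
```

Drugs with identical association profiles receive a similarity of 1, and the value decays as the profiles diverge.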
Affiliation(s)
- Tiyao Liu
  - College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Shudong Wang
  - College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Yuanyuan Zhang
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266525, China
- Yunyin Li
  - College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Yingye Liu
  - College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
- Shiyuan Huang
  - College of Computer Science and Technology, Qingdao Institute of Software, China University of Petroleum, Qingdao 266580, China
14
Dao NA, Le MH, Dang XT. Label Transfer for Drug Disease Association in Three Meta-Paths. Evol Bioinform Online 2024; 20:11769343241272414. [PMID: 39279816 PMCID: PMC11401013 DOI: 10.1177/11769343241272414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 07/15/2024] [Indexed: 09/18/2024] Open
Abstract
The identification of potential interactions and relationships between diseases and drugs is significant in public health care and drug discovery. Determining drug-disease interactions experimentally is very expensive in both time and money, and many drug-disease associations remain undiscovered. Therefore, the development of computational methods to explore the relationship between drugs and diseases is essential. Many computational methods for predicting drug-disease associations have been developed that learn from known interactions to predict potential interactions of unknown drug-disease pairs. In this paper, we propose 3 new main groups of meta-paths based on the heterogeneous biological network of drug-protein-disease objects. For each meta-path, we design a machine learning model, and these models are then combined into an integrated learning method. We evaluated our approach on 3 standard datasets: DrugBank, OMIM, and Gottlieb's dataset. Experimental results demonstrate that the proposed method outperforms some recent methods, such as EMP-SVD, LRSSL, MBiRW, MPG-DDA, and SCMFDD, among others, on some measures such as AUC, AUPR, and F1-score.
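A meta-path feature on a drug-protein-disease network can be sketched as a path count obtained from a matrix product. The abstract does not list the three meta-path groups it proposes, so the drug-protein-disease path below is only one plausible example, and the function name and toy matrices are hypothetical.

```python
def count_meta_paths(drug_protein, protein_disease):
    """Count drug -> protein -> disease paths.

    Entry [d][s] is the number of proteins linked to both drug d and disease s,
    which equals the product of the two adjacency matrices.
    """
    n_diseases = len(protein_disease[0])
    return [
        [
            sum(row[p] * protein_disease[p][s] for p in range(len(row)))
            for s in range(n_diseases)
        ]
        for row in drug_protein
    ]


# Toy network: 2 drugs, 3 proteins, 2 diseases
drug_protein = [[1, 1, 0], [0, 0, 1]]
protein_disease = [[1, 0], [1, 1], [0, 1]]
path_counts = count_meta_paths(drug_protein, protein_disease)
```

Counts of this kind could serve as input features for a per-meta-path model, with the per-path predictions combined afterwards, as the abstract outlines.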
15
Wang Z, Wei Z. PT-KGNN: A framework for pre-training biomedical knowledge graphs with graph neural networks. Comput Biol Med 2024; 178:108768. [PMID: 38936076 DOI: 10.1016/j.compbiomed.2024.108768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 05/23/2024] [Accepted: 06/15/2024] [Indexed: 06/29/2024]
Abstract
Biomedical knowledge graphs (KGs) serve as comprehensive data repositories that contain rich information about nodes and edges, providing modeling capabilities for complex relationships among biological entities. Many approaches either learn node features through traditional machine learning methods, or leverage graph neural networks (GNNs) to directly learn features of target nodes in the biomedical KGs and utilize them for downstream tasks. Motivated by the pre-training technique in natural language processing (NLP), we propose a framework named PT-KGNN (Pre-Training the biomedical KG with GNNs) to learn embeddings of nodes in a broader context by applying GNNs on the biomedical KG. We design several experiments to evaluate the effectiveness of our proposed framework and the impact of the scale of KGs. Task results consistently improve as the scale of the biomedical KG used for pre-training increases. Pre-training on large-scale biomedical KGs significantly enhances the drug-drug interaction (DDI) and drug-disease association (DDA) prediction performance on the independent dataset. The embeddings derived from a larger biomedical KG have demonstrated superior performance compared to those obtained from a smaller KG. By applying pre-training techniques on biomedical KGs, rich semantic and structural information can be learned, leading to enhanced performance on downstream tasks. It is evident that pre-training techniques hold tremendous potential and wide-ranging applications in bioinformatics.
Affiliation(s)
- Zhenxing Wang
  - School of Data Science, Fudan University, 220 Handan Rd., Shanghai, 200433, China.
- Zhongyu Wei
  - School of Data Science, Fudan University, 220 Handan Rd., Shanghai, 200433, China.
16
Ghandikota SK, Jegga AG. Application of artificial intelligence and machine learning in drug repurposing. Prog Mol Biol Transl Sci 2024; 205:171-211. [PMID: 38789178 DOI: 10.1016/bs.pmbts.2024.03.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
The purpose of drug repurposing is to leverage previously approved drugs for a particular disease indication and apply them to another disease. It can be seen as a faster and more cost-effective approach to drug discovery and a powerful tool for achieving precision medicine. In addition, drug repurposing can be used to identify therapeutic candidates for rare diseases and phenotypic conditions with limited information on disease biology. Machine learning and artificial intelligence (AI) methodologies have enabled the construction of effective, data-driven repurposing pipelines by integrating and analyzing large-scale biomedical data. Recent technological advances, especially in heterogeneous network mining and natural language processing, have opened up exciting new opportunities and analytical strategies for drug repurposing. In this review, we first introduce the challenges in repurposing approaches and highlight some success stories, including those during the COVID-19 pandemic. Next, we review some existing computational frameworks in the literature, organized on the basis of the type of biomedical input data analyzed and the computational algorithms involved. In conclusion, we outline some exciting new directions that drug repurposing research may take, as pioneered by the generative AI revolution.
Affiliation(s)
- Sudhir K Ghandikota
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Anil G Jegga
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States.
17
Huang MS, Han JC, Lin PY, You YT, Tsai RTH, Hsu WL. Surveying biomedical relation extraction: a critical examination of current datasets and the proposal of a new resource. Brief Bioinform 2024; 25:bbae132. [PMID: 38609331 PMCID: PMC11014787 DOI: 10.1093/bib/bbae132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 11/06/2023] [Accepted: 03/02/2023] [Indexed: 04/14/2024] Open
Abstract
Natural language processing (NLP) has become an essential technique in various fields, offering a wide range of possibilities for analyzing data and developing diverse NLP tasks. In the biomedical domain, understanding the complex relationships between compounds and proteins is critical, especially in the context of signal transduction and biochemical pathways. Among these relationships, protein-protein interactions (PPIs) are of particular interest, given their potential to trigger a variety of biological reactions. To improve the ability to predict PPI events, we propose the protein event detection dataset (PEDD), which comprises 6823 abstracts, 39 488 sentences and 182 937 gene pairs. Our PEDD dataset has been utilized in the AI CUP Biomedical Paper Analysis competition, where systems are challenged to predict 12 different relation types. In this paper, we review the state-of-the-art relation extraction research and provide an overview of the PEDD's compilation process. Furthermore, we present the results of the PPI extraction competition and evaluate several language models' performances on the PEDD. This paper's outcomes will provide a valuable roadmap for future studies on protein event detection in NLP. By addressing this critical challenge, we hope to enable breakthroughs in drug discovery and enhance our understanding of the molecular mechanisms underlying various diseases.
Affiliation(s)
- Ming-Siang Huang
  - Intelligent Agent Systems Laboratory, Department of Computer Science and Information Engineering, Asia University, New Taipei City, Taiwan
  - National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan
  - Department of Computer Science and Information Engineering, College of Information and Electrical Engineering, Asia University, Taichung, Taiwan
- Jen-Chieh Han
  - Intelligent Information Service Research Laboratory, Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Pei-Yen Lin
  - Intelligent Agent Systems Laboratory, Department of Computer Science and Information Engineering, Asia University, New Taipei City, Taiwan
- Yu-Ting You
  - Intelligent Agent Systems Laboratory, Department of Computer Science and Information Engineering, Asia University, New Taipei City, Taiwan
- Richard Tzong-Han Tsai
  - Intelligent Information Service Research Laboratory, Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
  - Center for Geographic Information Science, Research Center for Humanities and Social Sciences, Academia Sinica, Taipei, Taiwan
- Wen-Lian Hsu
  - Intelligent Agent Systems Laboratory, Department of Computer Science and Information Engineering, Asia University, New Taipei City, Taiwan
  - Department of Computer Science and Information Engineering, College of Information and Electrical Engineering, Asia University, Taichung, Taiwan
18
Li Y, Yang Y, Tong Z, Wang Y, Mi Q, Bai M, Liang G, Li B, Shu K. A comparative benchmarking and evaluation framework for heterogeneous network-based drug repositioning methods. Brief Bioinform 2024; 25:bbae172. [PMID: 38647153 PMCID: PMC11033846 DOI: 10.1093/bib/bbae172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 02/25/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024] Open
Abstract
Computational drug repositioning, which involves identifying new indications for existing drugs, is an increasingly attractive research area due to its advantages in reducing both overall cost and development time. As a result, a growing number of computational drug repositioning methods have emerged. Heterogeneous network-based drug repositioning methods have been shown to outperform other approaches. However, there is a dearth of systematic evaluation studies of these methods, encompassing performance, scalability and usability, as well as a standardized process for evaluating new methods. Additionally, previous studies have only compared several methods, with conflicting results. In this context, we conducted a systematic benchmarking study of 28 heterogeneous network-based drug repositioning methods on 11 existing datasets. We developed a comprehensive framework to evaluate their performance, scalability and usability. Our study revealed that methods such as HGIMC, ITRPCA and BNNR exhibit the best overall performance, as they rely on matrix completion or factorization. HINGRL, MLMC, ITRPCA and HGIMC demonstrate the best performance, while NMFDR, GROBMC and SCPMF display superior scalability. For usability, HGIMC, DRHGCN and BNNR are the top performers. Building on these findings, we developed an online tool called HN-DREP (http://hn-drep.lyhbio.com/) to facilitate researchers in viewing all the detailed evaluation results and selecting the appropriate method. HN-DREP also provides an external drug repositioning prediction service for a specific disease or drug by integrating predictions from all methods. Furthermore, we have released a Snakemake workflow named HN-DRES (https://github.com/lyhbio/HN-DRES) to facilitate benchmarking and support the extension of new methods into the field.
Affiliation(s)
- Yinghong Li
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yinqi Yang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Zhuohao Tong
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yu Wang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Qin Mi
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Mingze Bai
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Guizhao Liang
  - Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, P. R. China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, P. R. China
- Kunxian Shu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
19
Lei S, Lei X, Chen M, Pan Y. Drug Repositioning Based on Deep Sparse Autoencoder and Drug-Disease Similarity. Interdiscip Sci 2024; 16:160-175. [PMID: 38103130 DOI: 10.1007/s12539-023-00593-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 11/03/2023] [Accepted: 11/06/2023] [Indexed: 12/17/2023]
Abstract
Drug repositioning is critical to drug development. Previous drug repositioning methods mainly constructed drug-disease heterogeneous networks to extract drug-disease features. However, these methods face difficulty when structurally simple models are used to deal with complex heterogeneous networks. Therefore, in this study, we introduce a drug repositioning method named DRDSA. The method utilizes a deep sparse autoencoder and integrates drug-disease similarities. First, we constructed a drug-disease feature network by incorporating information from drug chemical structure, disease semantic data, and existing known drug-disease associations. Then, we learned the low-dimensional representation of the feature network using a deep sparse autoencoder. Finally, we utilized a deep neural network to make predictions on new drug-disease associations based on the feature representation. The experimental results show that our proposed method achieved the best results on all four benchmark datasets, especially on the CTD dataset, where AUC and AUPR reached 0.9619 and 0.9676, respectively, outperforming other baseline methods. In the case study, we predicted the top ten antiviral drugs for COVID-19. Six of these predictions were subsequently validated by other literature sources.
Affiliation(s)
- Song Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
- Ming Chen
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Yi Pan
  - Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
  - Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Shenzhen, 518055, China.
20
He S, Yun L, Yi H. Fusing graph transformer with multi-aggregate GCN for enhanced drug-disease associations prediction. BMC Bioinformatics 2024; 25:79. [PMID: 38378479 PMCID: PMC10877759 DOI: 10.1186/s12859-024-05705-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 02/14/2024] [Indexed: 02/22/2024] Open
Abstract
BACKGROUND Identification of potential drug-disease associations is important for both the discovery of new indications for drugs and for the reduction of unknown adverse drug reactions. Exploring the potential links between drugs and diseases is crucial for advancing biomedical research and improving healthcare. While advanced computational techniques play a vital role in revealing the connections between drugs and diseases, current research still faces challenges in the process of mining potential relationships between drugs and diseases using heterogeneous network data. RESULTS In this study, we propose a learning framework, termed WMAGT, that fuses Graph Transformer Networks and a multi-aggregate graph convolutional network to learn efficient heterogeneous information graph representations for drug-disease association prediction. This method extensively harnesses the capabilities of a robust graph transformer, effectively modeling the local and global interactions of nodes by integrating a graph convolutional network and a graph transformer with self-attention mechanisms in its encoder. We first integrate drug-drug, drug-disease, and disease-disease networks to construct a heterogeneous information graph. The multi-aggregate graph convolutional network and graph transformer are then used in conjunction with a neural collaborative filtering module to integrate information from different domains into a highly effective feature representation. CONCLUSIONS Rigorous cross-validation and ablation studies examined the robustness and effectiveness of the proposed method. Experimental results demonstrate that WMAGT outperforms other state-of-the-art methods in accurate drug-disease association prediction, which is beneficial for drug repositioning and drug safety research.
Affiliation(s)
- Shihui He
  - School of Information Science and Technology, Yunnan Normal University, Kunming, 650500, China
  - Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education, Kunming, 650500, China
- Lijun Yun
  - School of Information Science and Technology, Yunnan Normal University, Kunming, 650500, China.
  - Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education, Kunming, 650500, China.
- Haicheng Yi
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710129, China.
21
Sun X, Jia X, Lu Z, Tang J, Li M. Drug repositioning with adaptive graph convolutional networks. Bioinformatics 2024; 40:btad748. [PMID: 38070161 PMCID: PMC10761094 DOI: 10.1093/bioinformatics/btad748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Revised: 11/27/2023] [Accepted: 12/08/2023] [Indexed: 01/04/2024] Open
Abstract
MOTIVATION Drug repositioning is an effective strategy to identify new indications for existing drugs, providing the quickest possible transition from bench to bedside. With the rapid development of deep learning, graph convolutional networks (GCNs) have been widely adopted for drug repositioning tasks. However, prior GCN-based methods have limitations in deeply integrating node features and topological structures, which may hinder the capability of GCNs. RESULTS In this study, we propose an adaptive GCN approach, termed AdaDR, for drug repositioning by deeply integrating node features and topological structures. Distinct from conventional graph convolution networks, AdaDR models interactive information between them with an adaptive graph convolution operation, which enhances the expressiveness of the model. Concretely, AdaDR simultaneously extracts embeddings from node features and topological structures and then uses the attention mechanism to learn adaptive importance weights of the embeddings. Experimental results show that AdaDR achieves better performance than multiple baselines for drug repositioning. Moreover, in the case study, exploratory analyses are offered for finding novel drug-disease associations. AVAILABILITY AND IMPLEMENTATION The source code of AdaDR is available at: https://github.com/xinliangSun/AdaDR.
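The attention step in this abstract, weighting a feature-based embedding against a topology-based embedding, can be sketched as a softmax-weighted sum. In AdaDR the importance scores are learned; here they are passed in as plain numbers, and the function name `attention_fuse` is hypothetical rather than taken from the AdaDR repository.

```python
import math


def attention_fuse(embeddings, scores):
    """Combine several embeddings of one node using softmax-normalised weights.

    embeddings: list of equal-length vectors (e.g. feature view, topology view)
    scores: one unnormalised importance score per embedding (assumed given)
    """
    max_score = max(scores)  # subtract the maximum for numerical stability
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    dim = len(embeddings[0])
    fused = [sum(w * emb[d] for w, emb in zip(weights, embeddings)) for d in range(dim)]
    return fused, weights


# Two views of the same node with equal importance scores
fused, weights = attention_fuse([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With equal scores the fused vector is the plain average of the two views; a larger score for one view shifts the result toward that view.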
Affiliation(s)
- Xinliang Sun
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Xiao Jia
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Zhangli Lu
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Jing Tang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, FI00014 Helsinki, Finland
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
22
Son J, Kim D. Applying network link prediction in drug discovery: an overview of the literature. Expert Opin Drug Discov 2024; 19:43-56. [PMID: 37794688 DOI: 10.1080/17460441.2023.2267020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 10/02/2023] [Indexed: 10/06/2023]
Abstract
INTRODUCTION Network representation can give a holistic view of relationships for biomedical entities through network topology. Link prediction estimates the probability of link formation between the pair of unconnected nodes. In the drug discovery process, the link prediction method not only enables the detection of connectivity patterns but also predicts the effects of one biomedical entity to multiple entities simultaneously and vice versa, which is useful for many applications. AREAS COVERED The authors provide a comprehensive overview of network link prediction in drug discovery. Link prediction methodologies such as similarity-based approaches, embedding-based approaches, probabilistic model-based approaches, and preprocessing methods are summarized with examples. In addition to describing their properties and limitations, the authors discuss the applications of link prediction in drug discovery based on the relationship between biomedical concepts. EXPERT OPINION Link prediction is a powerful method to infer the existence of novel relationships in drug discovery. However, link prediction has been hampered by the sparsity of data and the lack of negative links in biomedical networks. With preprocessing to balance positive and negative samples and the collection of more data, the authors believe it is possible to develop more reliable link prediction methods that can become invaluable tools for successful drug discovery.
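As a concrete instance of the similarity-based approaches this review mentions, the sketch below scores unconnected node pairs by the Jaccard index of their neighbour sets. The Jaccard index is a standard choice picked here for illustration; the review covers many other scores, and the toy network and function name are hypothetical.

```python
def jaccard_scores(adjacency):
    """Score every unconnected node pair by the overlap of their neighbour sets.

    adjacency: dict mapping each node to the set of its neighbours (undirected).
    Returns a dict {(u, v): score} for pairs without an existing link.
    """
    nodes = sorted(adjacency)
    scores = {}
    for index, u in enumerate(nodes):
        for v in nodes[index + 1:]:
            if v in adjacency[u]:
                continue  # already linked, nothing to predict
            union = adjacency[u] | adjacency[v]
            common = adjacency[u] & adjacency[v]
            scores[(u, v)] = len(common) / len(union) if union else 0.0
    return scores


# Toy undirected network
network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
predicted = jaccard_scores(network)
```

Every unlinked pair is treated as a candidate here, which reflects the lack of confirmed negative links that the expert opinion section identifies as a limitation.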
Affiliation(s)
- Jeongtae Son
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Dongsup Kim
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
23
Meng Y, Wang Y, Xu J, Lu C, Tang X, Peng T, Zhang B, Tian G, Yang J. Drug repositioning based on weighted local information augmented graph neural network. Brief Bioinform 2023; 25:bbad431. [PMID: 38019732 PMCID: PMC10686358 DOI: 10.1093/bib/bbad431] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 08/26/2023] [Revised: 10/13/2023] [Accepted: 11/05/2023] [Indexed: 12/01/2023] Open
Abstract
Drug repositioning, the strategy of redirecting existing drugs to new therapeutic purposes, is pivotal in accelerating drug discovery. While many studies have engaged in modeling complex drug-disease associations, they often overlook the relevance between different node embeddings. Consequently, we propose a novel weighted local information augmented graph neural network model, termed DRAGNN, for drug repositioning. Specifically, DRAGNN firstly incorporates a graph attention mechanism to dynamically allocate attention coefficients to drug and disease heterogeneous nodes, enhancing the effectiveness of target node information collection. To prevent excessive embedding of information in a limited vector space, we omit self-node information aggregation, thereby emphasizing valuable heterogeneous and homogeneous information. Additionally, average pooling in neighbor information aggregation is introduced to enhance local information while maintaining simplicity. A multi-layer perceptron is then employed to generate the final association predictions. The model's effectiveness for drug repositioning is supported by a 10-times 10-fold cross-validation on three benchmark datasets. Further validation is provided through analysis of the predicted associations using multiple authoritative data sources, molecular docking experiments and drug-disease network analysis, laying a solid foundation for future drug discovery.
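Two of the design choices in this abstract, omitting self-node information and using average pooling over neighbours, can be sketched in a few lines. The attention coefficients that DRAGNN also applies are left out, so this is a simplified illustration with hypothetical names and toy features, not the model itself.

```python
def aggregate_neighbors(features, neighbors):
    """Replace each node's vector with the mean of its neighbours' vectors.

    The node's own features are deliberately excluded from the average.
    features: dict node -> feature vector; neighbors: dict node -> list of nodes.
    """
    dim = len(next(iter(features.values())))
    aggregated = {}
    for node, linked in neighbors.items():
        others = [n for n in linked if n != node]  # drop self-loops
        if not others:
            aggregated[node] = [0.0] * dim
            continue
        aggregated[node] = [
            sum(features[n][d] for n in others) / len(others) for d in range(dim)
        ]
    return aggregated


# Toy heterogeneous neighbourhood: one drug linked to two diseases
features = {"drug1": [1.0, 0.0], "disease1": [0.0, 2.0], "disease2": [2.0, 4.0]}
neighbors = {
    "drug1": ["disease1", "disease2"],
    "disease1": ["drug1"],
    "disease2": ["drug1"],
}
updated = aggregate_neighbors(features, neighbors)
```

Leaving the node's own vector out means the updated representation is built only from neighbourhood information, which matches the stated aim of not overloading a limited vector space.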
Affiliation(s)
- Yajie Meng
  - Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
- Yi Wang
  - Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
- Junlin Xu
  - College of Computer Science and Electronic Engineering, Hunan University, Lushan Road (S), Yuelu District, Changsha, Hunan Province 410082, China
- Changcheng Lu
  - College of Computer Science and Electronic Engineering, Hunan University, Lushan Road (S), Yuelu District, Changsha, Hunan Province 410082, China
- Xianfang Tang
  - Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
- Tao Peng
  - Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
- Bengong Zhang
  - Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
- Geng Tian
  - Geneis Beijing Co., Ltd, No. 31, New North Road, Laiguanying, Chaoyang District, Beijing 100102, China
- Jialiang Yang
  - Geneis Beijing Co., Ltd, No. 31, New North Road, Laiguanying, Chaoyang District, Beijing 100102, China
|
24
|
Wang S, Li J, Wang D, Xu D, Jin J, Wang Y. Predicting Drug-Disease Associations Through Similarity Network Fusion and Multi-View Feature Projection Representation. IEEE J Biomed Health Inform 2023; 27:5165-5176. [PMID: 37527303 DOI: 10.1109/jbhi.2023.3300717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Predicting drug-disease associations (DDAs) through computational methods has become a prevalent trend in drug development because of their high efficiency and low cost. Existing methods usually focus on constructing heterogeneous networks by collecting multiple data resources to improve prediction ability. However, potential association possibilities of numerous unconfirmed drug-related or disease-related pairs are not sufficiently considered. In this article, we propose a novel computational model to predict new DDAs. First, a heterogeneous network is constructed, including four types of nodes (drugs, targets, cell lines, diseases) and three types of edges (associations, association scores, similarities). Second, an updating and merging-based similarity network fusion method, termed UM-SF, is presented to fuse various similarity networks with diverse weights. Finally, an intermediate layer-mediated multi-view feature projection representation method, termed IM-FP, is proposed to calculate the predicted DDA scores. This method uses multiple association scores to construct multi-view drug features, then projects them into disease space through the intermediate layer, where an intermediate layer similarity constraint is designed to learn the projection matrices. Results of comparative experiments reveal the effectiveness of our innovations. Comparisons with other state-of-the-art models by the 10-fold cross-validation experiment indicate our model's advantage on AUROC and AUPR metrics. Moreover, our proposed model successfully predicted 107 novel high-ranked DDAs.
|
25
|
Xuan P, Xu K, Cui H, Nakaguchi T, Zhang T. Graph generative and adversarial strategy-enhanced node feature learning and self-calibrated pairwise attribute encoding for prediction of drug-related side effects. Front Pharmacol 2023; 14:1257842. [PMID: 37731739 PMCID: PMC10507253 DOI: 10.3389/fphar.2023.1257842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 08/17/2023] [Indexed: 09/22/2023] Open
Abstract
Background: Inferring drug-related side effects is beneficial for reducing drug development cost and time. Current computational prediction methods have concentrated on graph reasoning over heterogeneous graphs comprising the drug and side effect nodes. However, the various topologies and node attributes within multiple drug-side effect heterogeneous graphs have not been completely exploited. Methods: We proposed a new drug-side effect association prediction method, GGSC, to deeply integrate the diverse topologies and attributes from multiple heterogeneous graphs and the self-calibration attributes of each drug-side effect node pair. First, we created two heterogeneous graphs comprising the drug and side effect nodes and their related similarity and association connections. Since each heterogeneous graph has its specific topology and node attributes, a node feature learning strategy was designed and the learning for each graph was enhanced from a graph generative and adversarial perspective. We constructed a generator based on a graph convolutional autoencoder to encode the topological structure and node attributes from the whole heterogeneous graph and then generate the node features embedding the graph topology. A discriminator based on multilayer perceptron was designed to distinguish the generated topological features from the original ones. We also designed representation-level attention to discriminate the contributions of topological representations from multiple heterogeneous graphs and adaptively fused them. Finally, we constructed a self-calibration module based on convolutional neural networks to guide pairwise attribute learning through the features of the small latent space. Results: The comparison experiment results showed that GGSC had higher prediction performance than several state-of-the-art prediction methods. The ablation experiments demonstrated the effectiveness of topological enhancement learning, representation-level attention, and self-calibrated pairwise attribute learning. In addition, case studies over five drugs demonstrated GGSC's ability in discovering the potential drug-related side effect candidates. Conclusion: We proposed a drug-side effect association prediction method, and the method is beneficial for screening the reliable association candidates for the biologists to discover the actual associations.
Affiliation(s)
- Ping Xuan
  - Department of Computer Science, School of Engineering, Shantou University, Shantou, China
- Kai Xu
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, VI, Australia
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba, Japan
- Tiangang Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
  - School of Mathematical Science, Heilongjiang University, Harbin, China
|
26
|
Ai C, Yang H, Ding Y, Tang J, Guo F. Low Rank Matrix Factorization Algorithm Based on Multi-Graph Regularization for Detecting Drug-Disease Association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3033-3043. [PMID: 37159322 DOI: 10.1109/tcbb.2023.3274587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Detecting potential associations between drugs and diseases plays an indispensable role in drug development and has become a research hotspot in recent years. Compared with traditional methods, some computational approaches are fast and inexpensive, which greatly accelerates the prediction of drug-disease associations. In this study, we propose a novel similarity-based method of low-rank matrix decomposition based on multi-graph regularization. On the basis of low-rank matrix factorization with L2 regularization, the multi-graph regularization constraint is constructed by combining a variety of similarity matrices from drugs and diseases, respectively. In the experiments, we analyze how different combinations of similarities affect performance and find that combining all the similarity information in the drug space is unnecessary: a subset of the similarity information is enough to achieve the desired performance. Our method is then compared with other existing models on three data sets (Fdataset, Cdataset and LRSSLdataset) and shows a clear advantage in AUPR. In addition, a case study shows the model's superior ability to predict potential disease-related drugs. Finally, we compare our model with several methods on six real-world datasets, where it also performs well.
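As a rough illustration of what a multi-graph regularization term looks like, the sketch below computes the standard graph-Laplacian penalty for one similarity matrix and a weighted sum over several. The abstract does not give the exact objective, so the weighted-sum combination and the names `graph_regularizer` and `multi_graph_penalty` are assumptions.

```python
def graph_regularizer(similarity, factors):
    """Return 0.5 * sum_ij S_ij * ||u_i - u_j||^2.

    For a symmetric similarity matrix S with Laplacian L = D - S this equals
    tr(U^T L U), the usual graph regularization term: rows of U that belong to
    similar items are pulled toward each other.
    """
    total = 0.0
    n = len(factors)
    for i in range(n):
        for j in range(n):
            squared_distance = sum((a - b) ** 2 for a, b in zip(factors[i], factors[j]))
            total += similarity[i][j] * squared_distance
    return 0.5 * total


def multi_graph_penalty(similarities, weights, factors):
    """Weighted sum of graph regularizers, one per similarity matrix (assumed form)."""
    return sum(w * graph_regularizer(s, factors) for w, s in zip(weights, similarities))
```

Setting a weight to zero drops the corresponding similarity source, which is one simple way to test the abstract's observation that not every drug similarity is needed.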
|
27
|
Zhu X, Lu W. Multi-Label Classification With Dual Tail-Node Augmentation for Drug Repositioning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3068-3079. [PMID: 37418410 DOI: 10.1109/tcbb.2023.3292883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/09/2023]
Abstract
Due to the lengthy and costly process of new drug discovery, increasing attention has been paid to drug repositioning, i.e., identifying new drug-disease associations. Current machine learning methods for drug repositioning mainly leverage matrix factorization or graph neural networks, and have achieved impressive performance. However, they often suffer from insufficient training labels of inter-domain associations while ignoring the intra-domain associations. Moreover, they often neglect the importance of tail nodes that have few known associations, which limits their effectiveness in drug repositioning. In this paper, we propose a novel multi-label classification model with dual Tail-Node Augmentation for Drug Repositioning (TNA-DR). We incorporate disease-disease similarity and drug-drug similarity information into a k-nearest neighbor (kNN) augmentation module and a contrastive augmentation module, respectively, which effectively complements the weak supervision of drug-disease associations. Furthermore, before employing the two augmentation modules, we filter the nodes by their degrees, so that the two modules are only applied to tail nodes. We conduct 10-fold cross-validation experiments on four different real-world datasets, and our model achieves state-of-the-art performance on all four datasets. We also demonstrate our model's capability of identifying drug candidates for new diseases and discovering potential new links between existing drugs and diseases.
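The degree filter followed by kNN lookup can be sketched as below. This only illustrates the filtering step described in the abstract; the degree threshold, the value of k, and the function names are hypothetical, and the paper's actual augmentation modules are not reproduced.

```python
def tail_nodes(assoc, max_degree=1):
    """Indices of rows with at most `max_degree` known associations."""
    return [i for i, row in enumerate(assoc) if sum(row) <= max_degree]


def k_nearest(sim, node, k):
    """The k most similar other nodes, most similar first."""
    others = [j for j in range(len(sim)) if j != node]
    return sorted(others, key=lambda j: sim[node][j], reverse=True)[:k]


def knn_candidates_for_tails(assoc, sim, k=2, max_degree=1):
    """Map each tail node to its k nearest neighbors; well-connected nodes are skipped."""
    return {i: k_nearest(sim, i, k) for i in tail_nodes(assoc, max_degree)}
```

Here each row of `assoc` is one disease (or drug) and `sim` is the matching similarity matrix; neighbors found this way could then lend their known associations to the sparsely labeled node.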
|
28
|
Wang Y, Liu JX, Wang J, Shang J, Gao YL. A Graph Representation Approach Based on Light Gradient Boosting Machine for Predicting Drug-Disease Associations. J Comput Biol 2023; 30:937-947. [PMID: 37486669 DOI: 10.1089/cmb.2023.0078] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/25/2023] Open
Abstract
Determining the association between drug and disease is important in drug development. However, existing approaches for drug-disease association (DDA) prediction are too homogeneous in terms of feature extraction. Here, a novel graph representation approach based on light gradient boosting machine (GRLGB) is proposed for prediction of DDAs. After proteins were introduced into a heterogeneous network, node features were extracted from two perspectives: network topology and biological knowledge. Finally, the GRLGB classifier was applied to predict potential DDAs. GRLGB achieved satisfactory results on Bdataset and Fdataset through 10-fold cross-validation. To further assess the reliability of GRLGB, case studies involving anxiety disorders and clozapine were conducted. The results suggest that GRLGB can identify novel DDAs.
Affiliation(s)
- Ying Wang
  - School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Juan Wang
  - School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Junliang Shang
  - School of Computer Science, Qufu Normal University, Rizhao, Shandong, China
- Ying-Lian Gao
  - Qufu Normal University Library, Qufu Normal University, Rizhao, Shandong, China
|
29
|
Wang Y, Gao YL, Wang J, Li F, Liu JX. MSGCA: Drug-Disease Associations Prediction Based on Multi-Similarities Graph Convolutional Autoencoder. IEEE J Biomed Health Inform 2023; 27:3686-3694. [PMID: 37163398 DOI: 10.1109/jbhi.2023.3272154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Identifying drug-disease associations (DDAs) is critical to drug development. Traditional methods to determine DDAs are expensive and inefficient. Therefore, it is imperative to develop more accurate and effective methods for DDA prediction. Most current DDA prediction methods use the original DDA matrix directly. However, the original DDA matrix is sparse, which greatly degrades the prediction results. Hence, a prediction method based on a multi-similarities graph convolutional autoencoder (MSGCA) is proposed for DDA prediction. First, MSGCA integrates multiple drug similarities and disease similarities using the centered kernel alignment-based multiple kernel learning (CKA-MKL) algorithm to form a new drug similarity and a new disease similarity, respectively. Second, the new drug and disease similarities are improved by linear neighborhood, and the DDA matrix is reconstructed by weighted K nearest neighbor profiles. Next, the reconstructed DDAs and the improved drug and disease similarities are integrated into a heterogeneous network. Finally, a graph convolutional autoencoder with an attention mechanism is used to predict DDAs. Compared with existing methods, MSGCA shows superior results on three datasets. Case studies further demonstrate the reliability of MSGCA.
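The matrix reconstruction step can be pictured with a simplified sketch of weighted K nearest neighbor profiles. It handles only the row (drug) side and uses an assumed decay factor; the full procedure used in the paper, including the disease side and how the two are merged, is not specified in the abstract, so treat this as an illustration rather than the published algorithm.

```python
def wknn_row_profiles(assoc, sim, k=2, decay=0.5):
    """Fill sparse rows of an association matrix from their K most similar rows.

    Each neighbor's row is weighted by decay**rank * similarity, the weighted sum
    is normalized by the neighbors' total similarity, and the original entry is
    kept whenever it is larger than the estimate.
    """
    n = len(assoc)
    filled = []
    for i in range(n):
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )[:k]
        weights = [(decay ** rank) * sim[i][j] for rank, j in enumerate(neighbors)]
        norm = sum(sim[i][j] for j in neighbors)
        row = []
        for col in range(len(assoc[i])):
            if norm > 0:
                estimate = sum(w * assoc[j][col] for w, j in zip(weights, neighbors)) / norm
            else:
                estimate = 0.0
            row.append(max(assoc[i][col], estimate))
        filled.append(row)
    return filled
```

A drug with no known associations thus inherits a soft profile from its closest neighbors instead of staying an all-zero row.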
|
30
|
Gao Z, Ma H, Zhang X, Wang Y, Wu Z. Similarity measures-based graph co-contrastive learning for drug-disease association prediction. Bioinformatics 2023; 39:btad357. [PMID: 37261859 PMCID: PMC10275904 DOI: 10.1093/bioinformatics/btad357] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/13/2022] [Revised: 03/14/2023] [Accepted: 05/31/2023] [Indexed: 06/02/2023] Open
Abstract
MOTIVATION An imperative step in drug discovery is the prediction of drug-disease associations (DDAs), which tries to uncover potential therapeutic possibilities for already validated drugs. It is costly and time-consuming to predict DDAs using wet experiments. Graph Neural Networks, as an emerging technique, have shown superior capacity for DDA prediction. However, existing Graph Neural Network-based DDA prediction methods suffer from sparse supervised signals. As graph contrastive learning has proven effective at mitigating sparse supervised signals, we seek to leverage it to enhance the prediction of DDAs. Unfortunately, most conventional graph contrastive learning-based models corrupt the raw data graph to augment data, which is unsuitable for DDA prediction. Meanwhile, these methods cannot model the interactions between nodes effectively, thereby reducing the accuracy of association predictions. RESULTS A model called Similarity Measures-based Graph Co-contrastive Learning (SMGCL) is proposed to identify potential drug candidates for diseases. For learning embeddings from complicated network topologies, SMGCL includes three essential processes: (i) three views are constructed based on similarities between drugs and diseases and DDA information; (ii) two graph encoders are applied over the three views, so as to model both local and global topologies simultaneously; and (iii) a graph co-contrastive learning method is introduced, which co-trains the representations of nodes to maximize the agreement between them, thus generating high-quality prediction results. Contrastive learning serves as an auxiliary task for improving DDA predictions. Evaluated by cross-validation, SMGCL achieves strong overall performance. Further evidence of SMGCL's practicality is provided by a case study of Alzheimer's disease. AVAILABILITY AND IMPLEMENTATION https://github.com/Jcmorz/SMGCL.
Affiliation(s)
- Zihao Gao
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
- Huifang Ma
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
  - Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, No.1 Jinji Road, Guilin, 541004, China
- Xiaohui Zhang
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
- Yike Wang
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
- Zheyu Wu
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
|
31
|
Azuma I, Mizuno T, Kusuhara H. NRBdMF: A Recommendation Algorithm for Predicting Drug Effects Considering Directionality. J Chem Inf Model 2023; 63:474-483. [PMID: 36635231 DOI: 10.1021/acs.jcim.2c01210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2023]
Abstract
Predicting the novel effects of drugs based on information about approved drugs can be regarded as a recommendation system. Matrix factorization is one of the most used recommendation systems, and various algorithms have been devised for it. A literature survey and summary of existing algorithms for predicting drug effects demonstrated that most such methods, including neighborhood regularized logistic matrix factorization, which was the best performer in benchmark tests, used a binary matrix that considers only the presence or absence of interactions. However, drug effects are known to have two opposite aspects, such as side effects and therapeutic effects. In the present study, we proposed using neighborhood regularized bidirectional matrix factorization (NRBdMF) to predict drug effects by incorporating bidirectionality, which is a characteristic property of drug effects. We used this proposed method for predicting side effects using a matrix that considered the bidirectionality of drug effects, in which known side effects were assigned a positive (+1) label and known treatment effects were assigned a negative (-1) label. The NRBdMF model, which utilizes drug bidirectional information, achieved enrichment of side effects at the top and indications at the bottom of the prediction list. This first attempt to consider the bidirectional nature of drug effects using NRBdMF showed that it reduced false positives and produced a highly interpretable output.
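The signed labeling described in this abstract can be shown with a small sketch that builds the input matrix: +1 for known side effects, -1 for known treatment effects, 0 for unknown pairs. The function name, the argument layout, and the rule that a side-effect label wins when a pair appears in both lists are assumptions; the factorization itself is not shown.

```python
def build_signed_matrix(drugs, effects, side_effects, indications):
    """Build a drug x effect matrix with +1 (side effect), -1 (indication), 0 (unknown).

    `side_effects` and `indications` are iterables of (drug, effect) pairs.
    Assumption: if a pair appears in both, the side-effect label is kept.
    """
    row_of = {drug: i for i, drug in enumerate(drugs)}
    col_of = {effect: j for j, effect in enumerate(effects)}
    matrix = [[0] * len(effects) for _ in drugs]
    for drug, effect in indications:
        matrix[row_of[drug]][col_of[effect]] = -1
    for drug, effect in side_effects:
        matrix[row_of[drug]][col_of[effect]] = 1
    return matrix
```

For example, with hypothetical drugs `drug_a` and `drug_b` and effects `nausea` and `hypertension`, a drug that treats hypertension and causes nausea gets the row `[1, -1]`, whereas a binary matrix would record both entries as 1.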
Affiliation(s)
- Iori Azuma
  - Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Tadahaya Mizuno
  - Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroyuki Kusuhara
  - Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
|
32
|
Jamali AA, Kusalik A, Wu FX. NMTF-DTI: A Nonnegative Matrix Tri-factorization Approach With Multiple Kernel Fusion for Drug-Target Interaction Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:586-594. [PMID: 34914594 DOI: 10.1109/tcbb.2021.3135978] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 05/04/2023]
Abstract
Prediction of drug-target interactions (DTIs) plays a significant role in drug development and drug discovery. Although this task requires a large investment in terms of time and cost, especially when it is performed experimentally, the results are not necessarily significant. Computational DTI prediction is a shortcut to reduce the risks of experimental methods. In this study, we propose an effective approach of nonnegative matrix tri-factorization, referred to as NMTF-DTI, to predict the interaction scores between drugs and targets. NMTF-DTI utilizes multiple kernels (similarity measures) for drugs and targets and Laplacian regularization to boost the prediction performance. The performance of NMTF-DTI is evaluated via cross-validation and is compared with existing DTI prediction methods in terms of the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision and recall curve (AUPR). We evaluate our method on four gold standard datasets, comparing to other state-of-the-art methods. Cross-validation and a separate, manually created dataset are used to set parameters. The results show that NMTF-DTI outperforms other competing methods. Moreover, the results of a case study also confirm the superiority of NMTF-DTI.
|
33
|
Lei S, Lei X, Liu L. Drug repositioning based on heterogeneous networks and variational graph autoencoders. Front Pharmacol 2022; 13:1056605. [PMID: 36618933 PMCID: PMC9812491 DOI: 10.3389/fphar.2022.1056605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 12/13/2022] [Indexed: 12/24/2022] Open
Abstract
Predicting new therapeutic effects (drug repositioning) of existing drugs plays an important role in drug development. However, traditional wet experimental prediction methods are usually time-consuming and costly. The emergence of a growing number of artificial intelligence-based drug repositioning methods in the past 2 years has facilitated drug development. In this study, we propose a drug repositioning method, VGAEDR, based on a heterogeneous network of multiple drug attributes and a variational graph autoencoder. First, a drug-disease heterogeneous network is established based on three drug attributes, disease semantic information, and known drug-disease associations. Second, low-dimensional feature representations for heterogeneous networks are learned through a variational graph autoencoder module and a multi-layer convolutional module. Finally, the feature representation is fed to a fully connected layer and a Softmax layer to predict new drug-disease associations. Comparative experiments with other baseline methods on three datasets demonstrate the excellent performance of VGAEDR. In the case study, we predicted the top 10 possible anti-COVID-19 drugs from the existing drug and disease data, and six of them were supported by other published literature.
|
34
|
Song Y, Cui H, Zhang T, Yang T, Li X, Xuan P. Prediction of Drug-Related Diseases Through Integrating Pairwise Attributes and Neighbor Topological Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2963-2974. [PMID: 34133286 DOI: 10.1109/tcbb.2021.3089692] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Identifying new disease indications for approved drugs can help reduce the cost and time of drug development. Most recent methods focus on exploiting the various information related to drugs and diseases to predict candidate drug-disease associations. However, previous methods failed to deeply integrate the neighborhood topological structure and the node attributes of a drug-disease node pair of interest. We propose a new prediction method, ANPred, to learn and integrate pairwise attribute information and neighbor topology information from the similarities and associations related to drugs and diseases. First, a bi-layer heterogeneous network with intra-layer and inter-layer connections is established to combine the drug similarities, the disease similarities, and the drug-disease associations. Second, the embedding of a drug-disease pair is constructed by integrating multiple biological premises about drugs and diseases. A learning framework based on multi-layer convolutional neural networks is designed to learn the attribute representation of the pair of drug and disease nodes from its embedding. Sequences of neighbor nodes are formed by random walks on the heterogeneous network. A framework based on a fully-connected autoencoder and a skip-gram module is constructed to learn the neighbor topological representations of nodes. The cross-validation results indicate that the performance of ANPred is superior to several state-of-the-art methods. The case studies on 5 drugs further confirm the ability of ANPred to discover potential drug-disease association candidates.
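The neighbor-sequence step can be pictured with a minimal random-walk sketch over an adjacency list. A uniform, unweighted walk is an assumption (the paper may weight transitions by similarity), and the node names in the usage note are hypothetical.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Return a node sequence of up to `length` nodes starting at `start`.

    `adjacency` maps each node to a list of its neighbors in the heterogeneous
    network; `rng` is a `random.Random` instance so walks are reproducible.
    The walk stops early if it reaches a node with no neighbors.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk
```

Calling `random_walk(adjacency, "drug_a", 6, random.Random(0))` on a small graph with drug and disease nodes yields one sequence; many such sequences per node would then be fed to a skip-gram style model to learn topological representations.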
|
35
|
Yella JK, Jegga AG. MGATRx: Discovering Drug Repositioning Candidates Using Multi-View Graph Attention. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2596-2604. [PMID: 34014830 PMCID: PMC10038065 DOI: 10.1109/tcbb.2021.3082466] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
In-silico drug repositioning, or predicting new indications for approved or late-stage clinical trial drugs, is a resourceful and time-efficient strategy in drug discovery. However, inferring novel candidate drugs for a disease is challenging, given the heterogeneity and sparseness of the underlying biological entities and their relationships (e.g., disease/drug annotations). By integrating drug-centric and disease-centric annotations as multi-views, we propose a multi-view graph attention network for indication discovery (MGATRx). Unlike most current similarity-based methods, we employ a graph attention network on the heterogeneous drug and disease data to learn the representation of nodes and identify associations. MGATRx outperformed four other state-of-the-art methods used for computational drug repositioning. Further, several of our predicted novel indications are either currently being investigated or are supported by literature evidence, demonstrating the overall translational utility of MGATRx.
|
36
|
Zhong C, Ai J, Yang Y, Ma F, Sun W. Small Molecular Drug Screening Based on Clinical Therapeutic Effect. Molecules 2022; 27:4807. [PMID: 35956770 PMCID: PMC9369618 DOI: 10.3390/molecules27154807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Revised: 07/22/2022] [Accepted: 07/25/2022] [Indexed: 11/16/2022] Open
Abstract
Virtual screening can significantly save experimental time and costs for early drug discovery. Drug multi-classification can speed up virtual screening and quickly predict the most likely class for a drug. In this study, 1019 drug molecules with actual therapeutic effects are collected from multiple databases and documents, and molecular sets are grouped according to therapeutic effect and mechanism of action. Molecular descriptors and molecular fingerprints are obtained through SMILES to quantify molecular structures. After using the Kennard–Stone method to divide the data set, a better combination can be obtained by comparing the combined results of five classification algorithms and a fusion method. Furthermore, for a specific data set, the model with the best performance is used to predict the validation data set. On the test set, prediction accuracy can reach 0.862 and the kappa coefficient can reach 0.808. The highest classification accuracy on the validation set is 0.873. A more reliable molecular set was identified, which could be used to predict potential attributes of unknown drug compounds and even to discover new uses for old drugs. We hope this research can provide a reference for virtual screening of multiple classes of drugs at the same time in the future.
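For readers unfamiliar with the Kennard–Stone method mentioned above, the following is a minimal sketch of the classic selection rule: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected sample is farthest away. Euclidean distance on raw feature vectors is an assumption; the study's descriptors and split ratio are not given in the abstract.

```python
import math


def kennard_stone(points, n_select):
    """Return indices of `n_select` points chosen by the Kennard-Stone rule."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")

    def dist(a, b):
        return math.dist(points[a], points[b])

    # Seed with the pair of points that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(*pair),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Pick the point whose closest selected point is farthest away.
        candidate = max(remaining, key=lambda i: min(dist(i, s) for s in selected))
        selected.append(candidate)
        remaining.remove(candidate)
    return selected
```

The selected indices typically form the training set, so that it spans the descriptor space, and the unselected samples become the test set.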
Affiliation(s)
- Wei Sun
  - Correspondence: ; Tel.: +86-10-64445826
|
37
|
Identification of Potential Parkinson's Disease Drugs Based on Multi-Source Data Fusion and Convolutional Neural Network. Molecules 2022; 27:4780. [PMID: 35897954 PMCID: PMC9369596 DOI: 10.3390/molecules27154780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 07/20/2022] [Accepted: 07/22/2022] [Indexed: 11/20/2022]
Abstract
Parkinson’s disease (PD) is a serious neurodegenerative disease. Most current treatments can only alleviate symptoms, but not stop the progress of the disease. Therefore, it is crucial to find medicines to completely cure PD. Finding new indications of existing drugs through drug repositioning can not only reduce risk and cost, but also improve research and development efficiency. A drug repurposing method was proposed to identify potential Parkinson’s disease-related drugs based on multi-source data integration and a convolutional neural network. Multi-source data were used to construct similarity networks, and topology information was utilized to characterize drugs and PD-associated proteins. Then, the diffusion component analysis method was employed to reduce the feature dimension. Finally, a convolutional neural network model was constructed to identify potential associations between existing drugs and LProts (PD-associated proteins). Based on 10-fold cross-validation, the developed method achieved an accuracy of 91.57%, specificity of 87.24%, sensitivity of 95.27%, Matthews correlation coefficient of 0.8304, area under the receiver operating characteristic curve of 0.9731 and area under the precision–recall curve of 0.9727. Compared with state-of-the-art approaches, the current method demonstrates superiority in some aspects, such as sensitivity, accuracy, and robustness. In addition, molecular docking of some of the predicted potential PD therapeutics further indicated that they can exert their efficacy by acting on known targets of PD, and that they may be potential PD therapeutic drugs for further experimental research. It is anticipated that the current method may be considered a powerful tool for drug repurposing and pathological mechanism studies.
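The threshold-based metrics quoted in this abstract all derive from the four confusion-matrix counts. The sketch below shows the standard definitions; the counts in the usage note are made up for illustration and are not the study's data.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

With hypothetical counts `tp=90, tn=80, fp=20, fn=10`, accuracy is 0.85, sensitivity 0.90, specificity 0.80 and MCC about 0.70. A sensitivity higher than the specificity, as reported in the abstract, means false positives were more common than false negatives relative to class size.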
|
38
|
Feng H, Xiang Y, Wang X, Xue W, Yue Z. MTAGCN: predicting miRNA-target associations in Camellia sinensis var. assamica through graph convolution neural network. BMC Bioinformatics 2022; 23:271. [PMID: 35820798 PMCID: PMC9275082 DOI: 10.1186/s12859-022-04819-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Accepted: 07/01/2022] [Indexed: 11/10/2022] Open
Abstract
Background MicroRNAs (miRNAs) play a central role in diverse biological processes of Camellia sinensis var. assamica (CSA), including growth, development and stress response, through their associations with target mRNAs. However, although experimental methods for identifying CSA miRNA-target associations are costly and time-consuming, few computational methods have been developed to tackle the CSA miRNA-target association prediction problem. Results In this paper, we constructed a heterogeneous network for CSA miRNAs and targets by integrating rich biological information, including a miRNA similarity network, a target similarity network, and a miRNA-target association network. We then proposed a deep learning framework of graph convolution networks with a layer attention mechanism, named MTAGCN. In particular, MTAGCN uses the attention mechanism to combine the embeddings of multiple graph convolution layers and employs the integrated embedding to score unobserved CSA miRNA-target associations. Discussion Comprehensive experimental results on two tasks (a balanced task and an unbalanced task) demonstrated that our proposed model achieved better performance than classic machine learning and existing graph convolution network-based methods. The analysis of these results could offer valuable information for understanding complex CSA miRNA-target association mechanisms and would contribute to precision plant breeding. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04819-3.
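The layer attention idea described in this abstract, combining the embeddings produced by several graph convolution layers into one integrated embedding and then scoring a node pair, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the attention scores are fixed numbers rather than learned parameters, the pair score is a plain dot product, and the names `combine_layers` and `pair_score` are hypothetical. The abstract does not specify MTAGCN's actual parameterization.

```python
import math


def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layers(layer_embeddings, layer_scores):
    """Attention-weighted sum of per-layer node embeddings.

    layer_embeddings: list of matrices (one per layer), each n_nodes x dim.
    layer_scores: one raw attention score per layer (assumed fixed here;
    in a trained model these would be learned).
    """
    weights = softmax(layer_scores)
    n_nodes = len(layer_embeddings[0])
    dim = len(layer_embeddings[0][0])
    combined = [[0.0] * dim for _ in range(n_nodes)]
    for weight, embedding in zip(weights, layer_embeddings):
        for i in range(n_nodes):
            for j in range(dim):
                combined[i][j] += weight * embedding[i][j]
    return combined


def dot(u, v):
    """Dot-product score between two embedding vectors (an assumed scorer)."""
    return sum(a * b for a, b in zip(u, v))


# Toy input: node 0 stands for a miRNA, node 1 for a target; two layers.
layer_1 = [[1.0, 0.0], [0.0, 1.0]]
layer_2 = [[0.0, 1.0], [0.0, 1.0]]
integrated = combine_layers([layer_1, layer_2], layer_scores=[0.0, 0.0])
pair_score = dot(integrated[0], integrated[1])
print(pair_score)
```

With equal raw scores the two layers receive equal weight, so the integrated embedding is simply their average; unequal scores would shift the weighting toward one layer.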
Affiliation(s)
- Haisong Feng
  - School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Ying Xiang
  - School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Xiaosong Wang
  - School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Wei Xue
  - School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Zhenyu Yue
  - School of Information and Computer, Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, 230036, Anhui, China
|
39
|
Integration of Neighbor Topologies Based on Meta-Paths and Node Attributes for Predicting Drug-Related Diseases. Int J Mol Sci 2022; 23:3870. [PMID: 35409235 PMCID: PMC8999005 DOI: 10.3390/ijms23073870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2022] [Revised: 03/15/2022] [Accepted: 03/15/2022] [Indexed: 02/04/2023] Open
Abstract
Identifying new disease indications for existing drugs can help facilitate drug development and reduce development cost. The previous drug–disease association prediction methods focused on data about drugs and diseases from multiple sources. However, they did not deeply integrate the neighbor topological information of drug and disease nodes from various meta-path perspectives. We propose a prediction method called NAPred to encode and integrate meta-path-level neighbor topologies, multiple kinds of drug attributes, and drug-related and disease-related similarities and associations. The multiple kinds of similarities between drugs reflect the degrees of similarity between two drugs from different perspectives. Therefore, we constructed three drug–disease heterogeneous networks according to these drug similarities, respectively. A learning framework based on fully connected neural networks and a convolutional neural network with an attention mechanism is proposed to learn information of the neighbor nodes of a pair of drug and disease nodes. The multiple neighbor sets composed of different kinds of nodes were formed respectively based on meta-paths with different semantics and different scales. We established the attention mechanisms at the neighbor-scale level and at the neighbor topology level to learn enhanced neighbor feature representations and enhanced neighbor topological representations. A convolutional-autoencoder-based module is proposed to encode the attributes of the drug–disease pair in three heterogeneous networks. Extensive experimental results indicated that NAPred outperformed several state-of-the-art methods for drug–disease association prediction, and the improved recall rates demonstrated that NAPred was able to retrieve more actual drug–disease associations from the top-ranked candidates. Case studies on five drugs further demonstrated the ability of NAPred to identify potential drug-related disease candidates.
|
40
|
Deep Learning-Assisted Repurposing of Plant Compounds for Treating Vascular Calcification: An In Silico Study with Experimental Validation. Oxid Med Cell Longev 2022; 2022:4378413. [PMID: 35035662 PMCID: PMC8754599 DOI: 10.1155/2022/4378413] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/06/2021] [Revised: 10/24/2021] [Accepted: 11/13/2021] [Indexed: 12/13/2022]
Abstract
Background Vascular calcification (VC) constitutes subclinical vascular burden and increases cardiovascular mortality. Effective therapeutics for VC remain to be found. We aimed to use a deep learning-based strategy to screen and uncover plant compounds that could potentially be repurposed for managing VC. Methods We integrated drugome, interactome, and diseasome information from the Comparative Toxicogenomics Database (CTD), DrugBank, PubChem, Gene Ontology (GO), and BioGRID to analyze drug-disease associations. Deep representation learning was performed using a high-level description of the local network architecture and features of the entities, followed by learning the global embeddings of nodes derived from a heterogeneous network using a graph neural network architecture; a random forest classifier was then established for prediction. Predicted results were tested for validity in an in vitro VC model based on the probability scores. Results We collected 6,790 compounds with available Simplified Molecular-Input Line-Entry System (SMILES) data, 11,958 GO terms, 7,238 diseases, and 25,482 proteins. We then derived local embedding vectors using an end-to-end transformer network and the node2vec algorithm, and learned global embedding vectors from the heterogeneous network via the graph neural network. Our algorithm distinguished well between potential compounds, assigning higher prediction scores to compound categories with higher potential and lower scores to other categories. Probability score-dependent selection revealed that antioxidants such as sulforaphane and daidzein were potentially effective compounds against VC, while catechin had a low probability. All three compounds were validated in vitro. Conclusions Our findings exemplify the utility of deep learning in identifying promising VC-treating plant compounds. Our model can be a quick and comprehensive computational screening tool to assist in the early drug discovery process.
|
41
|
Gao CQ, Zhou YK, Xin XH, Min H, Du PF. DDA-SKF: Predicting Drug-Disease Associations Using Similarity Kernel Fusion. Front Pharmacol 2022; 12:784171. [PMID: 35095495 PMCID: PMC8792612 DOI: 10.3389/fphar.2021.784171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2021] [Accepted: 12/20/2021] [Indexed: 12/13/2022] Open
Abstract
Drug repositioning provides a promising and efficient strategy to discover potential associations between drugs and diseases. Many systematic computational drug-repositioning methods have been introduced, which are based on various similarities of drugs and diseases. In this work, we propose a new computational model, DDA-SKF (drug-disease associations prediction using similarity kernels fusion), which can predict novel drug indications by utilizing similarity kernel fusion (SKF) and Laplacian regularized least squares (LapRLS) algorithms. DDA-SKF integrates multiple similarities of drugs and diseases. The prediction performance of DDA-SKF is better than, or at least comparable to, that of all state-of-the-art methods. DDA-SKF can work without sufficient similarity information between drug indications, which allows us to predict new purposes for orphan drugs. The source code and benchmarking datasets are deposited in a GitHub repository (https://github.com/GCQ2119216031/DDA-SKF).
Affiliation(s)
- Pu-Feng Du
  - College of Intelligence and Computing, Tianjin University, Tianjin, China
|
42
|
Hu P, Huang YA, Mei J, Leung H, Chen ZH, Kuang ZM, You ZH, Hu L. Learning from low-rank multimodal representations for predicting disease-drug associations. BMC Med Inform Decis Mak 2021; 21:308. [PMID: 34736437 PMCID: PMC8567544 DOI: 10.1186/s12911-021-01648-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/20/2021] [Accepted: 10/06/2021] [Indexed: 12/15/2022] Open
Abstract
Background Disease-drug associations provide essential information for drug discovery and disease treatment. Many disease-drug associations remain unobserved or unknown, and trials to confirm these associations are time-consuming and expensive. To better understand and explore these valuable associations, it would be useful to develop computational methods for predicting unobserved disease-drug associations. With the advent of various datasets describing diseases and drugs, it has become more feasible to build a model describing the potential correlation between disease and drugs.
Results In this work, we propose a new prediction method, called LMFDA, which works in several stages. First, it studies the drug chemical structure, disease MeSH descriptors, disease-related phenotypic terms, and drug-drug interactions. On this basis, similarity networks from different sources are constructed to enrich the representation of drugs and diseases. Based on the fused disease similarity network and drug similarity network, LMFDA calculates the association score of each pair of diseases and drugs in the database. This method achieves good performance on Fdataset and Cdataset, with AUROCs of 91.6% and 92.1%, respectively, higher than many existing computational models. Conclusions The novelty of LMFDA lies in the introduction of multimodal fusion using low-rank tensors to fuse multiple similarity networks, combined with matrix completion techniques to predict potential associations. We have demonstrated that LMFDA displays excellent network integration ability for accurate disease-drug association inference and achieves substantial improvement over advanced approaches. Overall, experimental results on two real-world network datasets demonstrate that LMFDA is able to deliver excellent detection performance. Results also suggest that enriching similarity networks with as much domain knowledge as possible is a promising direction for drug repositioning.
Affiliation(s)
- Pengwei Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Yu-An Huang
  - The Hong Kong Polytechnic University, Hong Kong SAR, China
- Henry Leung
  - Electrical and Computer Engineering, University of Calgary, Calgary, Canada
- Zhan-Heng Chen
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Ze-Min Kuang
  - Beijing Anzhen Hospital of Capital Medical University, Beijing, China
- Zhu-Hong You
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Lun Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
|
43
|
Xuan P, Gao L, Sheng N, Zhang T, Nakaguchi T. Graph Convolutional Autoencoder and Fully-Connected Autoencoder with Attention Mechanism Based Method for Predicting Drug-Disease Associations. IEEE J Biomed Health Inform 2021; 25:1793-1804. [PMID: 33216722 DOI: 10.1109/jbhi.2020.3039502] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 11/09/2022]
Abstract
Predicting novel uses for approved drugs helps in reducing the costs of drug development and facilitates the development process. Most of previous methods focused on the multi-source data related to drugs and diseases to predict the candidate associations between drugs and diseases. There are multiple kinds of similarities between drugs, and these similarities reflect how similar two drugs are from the different views, whereas most of the previous methods failed to deeply integrate these similarities. In addition, the topology structures of the multiple drug-disease heterogeneous networks constructed by using the different kinds of drug similarities are not fully exploited. We therefore propose GFPred, a method based on a graph convolutional autoencoder and a fully-connected autoencoder with an attention mechanism, to predict drug-related diseases. GFPred integrates drug-disease associations, disease similarities, three kinds of drug similarities and attributes of the drug nodes. Three drug-disease heterogeneous networks are constructed based on the different kinds of drug similarities. We construct a graph convolutional autoencoder module, and integrate the attributes of the drug and disease nodes in each network to learn the topology representations of each drug node and disease node. As the different kinds of drug attributes contribute differently to the prediction of drug-disease associations, we construct an attribute-level attention mechanism. A fully-connected autoencoder module is established to learn the attribute representations of the drug and disease nodes. Finally, the original features of the drug-disease node pairs are also important auxiliary information for their association prediction. A combined strategy based on a convolutional neural network is proposed to fully integrate the topology representations, the attribute representations, and the original features of the drug-disease pairs. 
The ablation studies showed the contributions of data related to three types of drug attributes. Comparison with other methods confirmed that GFPred achieved better performance than several state-of-the-art prediction methods. In particular, case studies confirmed that GFPred is able to retrieve more actual drug-disease associations in the top k part of the prediction results. It is helpful for biologists to discover real associations by wet-lab experiments.
|
44
|
Wang J, Wang W, Yan C, Luo J, Zhang G. Predicting Drug-Disease Association Based on Ensemble Strategy. Front Genet 2021; 12:666575. [PMID: 34012464 PMCID: PMC8128144 DOI: 10.3389/fgene.2021.666575] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/10/2021] [Accepted: 03/23/2021] [Indexed: 12/29/2022] Open
Abstract
Drug repositioning is used to find new uses for existing drugs, effectively shortening the drug research and development cycle and reducing costs and risks. A new model of drug repositioning based on ensemble learning is proposed. This work develops a novel computational drug repositioning approach called CMAF to discover potential drug-disease associations. First, for new drugs and diseases or unknown drug-disease pairs, based on their known neighbor information, an association probability can be obtained by implementing the weighted K nearest known neighbors (WKNKN) method and improving the drug-disease association information. Then, a new drug similarity network and new disease similarity network can be constructed. Three prediction models are applied and ensembled to enable the final association of drug-disease pairs based on improved drug-disease association information and the constructed similarity network. The experimental results demonstrate that the developed approach outperforms recent state-of-the-art prediction models. Case studies further confirm the predictive ability of the proposed method. Our proposed method can effectively improve the prediction results.
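The weighted K nearest known neighbors (WKNKN) step mentioned in this abstract fills in an association matrix from the profiles of similar entities. The sketch below follows a commonly cited formulation of WKNKN, in which each neighbor's profile is weighted by its similarity and a rank-based decay; whether CMAF uses exactly this variant is an assumption. The parameter names `k` and `decay`, the function names, and the toy matrices are all hypothetical, and here every other row counts as a candidate neighbor.

```python
def neighbor_estimate(profiles, sim, k, decay):
    """Estimate each row profile from its k most similar other rows."""
    n_items, width = len(profiles), len(profiles[0])
    estimate = [[0.0] * width for _ in range(n_items)]
    for i in range(n_items):
        # Rank the other items by descending similarity and keep the top k.
        neighbors = sorted(
            (j for j in range(n_items) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )[:k]
        norm = sum(sim[i][j] for j in neighbors)
        if norm == 0:
            continue
        for rank, j in enumerate(neighbors):
            weight = (decay ** rank) * sim[i][j]  # closer neighbors count more
            for c in range(width):
                estimate[i][c] += weight * profiles[j][c] / norm
    return estimate


def wknkn(assoc, sim_drug, sim_disease, k=1, decay=0.5):
    """Fill a drug x disease matrix with neighbor-based association likelihoods."""
    from_drugs = neighbor_estimate(assoc, sim_drug, k, decay)
    transposed = [list(col) for col in zip(*assoc)]
    from_diseases = neighbor_estimate(transposed, sim_disease, k, decay)
    n_drugs, n_diseases = len(assoc), len(assoc[0])
    # Average the two estimates and never lower a known association.
    return [
        [
            max(assoc[d][s], (from_drugs[d][s] + from_diseases[s][d]) / 2)
            for s in range(n_diseases)
        ]
        for d in range(n_drugs)
    ]


# Toy data: 3 drugs x 2 diseases; drug 1 has no known association.
assoc = [[1, 0], [0, 0], [0, 1]]
sim_drug = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
sim_disease = [[1.0, 0.5], [0.5, 1.0]]
filled = wknkn(assoc, sim_drug, sim_disease, k=1, decay=0.5)
print(filled)
```

In the toy data, drug 1 inherits a nonzero likelihood for disease 0 from its most similar neighbor, drug 0, while the known associations stay at 1.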
Affiliation(s)
- Jianlin Wang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Wenxiu Wang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Chaokun Yan
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Junwei Luo
  - College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
- Ge Zhang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
|
45
|
Zhao BW, You ZH, Wong L, Zhang P, Li HY, Wang L. MGRL: Predicting Drug-Disease Associations Based on Multi-Graph Representation Learning. Front Genet 2021; 12:657182. [PMID: 34054920 PMCID: PMC8153989 DOI: 10.3389/fgene.2021.657182] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/22/2021] [Accepted: 03/15/2021] [Indexed: 11/13/2022] Open
Abstract
Drug repositioning is an application-based solution that mines existing drugs to find new targets, quickly discovers new drug-disease associations, and reduces the risk of drug discovery in traditional medicine and biology. Therefore, it is of great significance to design a computational model with high efficiency and accuracy. In this paper, we propose a novel computational method, MGRL, to predict drug-disease associations based on multi-graph representation learning. More specifically, MGRL first uses a graph convolution network to learn the graph representation of drugs and diseases from their self-attributes. Then, a graph embedding algorithm is used to represent the relationships between drugs and diseases. Finally, the two kinds of graph representation learning features are fed into a random forest classifier for training. To the best of our knowledge, this is the first work to construct a multi-graph to extract the characteristics of drugs and diseases to predict drug-disease associations. The experiments show that MGRL achieves an AUC of 0.8506 based on five-fold cross-validation, which is significantly better than other existing methods. Case study results show the reliability of the proposed method, which is of great significance for practical applications.
Affiliation(s)
- Bo-Wei Zhao
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Zhu-Hong You
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Leon Wong
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
- Ping Zhang
  - The School of Computer Sciences, BaoJi University of Arts and Sciences, Baoji, China
- Hao-Yuan Li
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
- Lei Wang
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Ürümqi, China
|
46
|
Cancer Subtype Recognition Based on Laplacian Rank Constrained Multiview Clustering. Genes (Basel) 2021; 12:526. [PMID: 33916856 PMCID: PMC8065670 DOI: 10.3390/genes12040526] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/03/2021] [Revised: 03/28/2021] [Accepted: 03/31/2021] [Indexed: 12/13/2022] Open
Abstract
Integrating multigenomic data to recognize cancer subtypes is an important task in bioinformatics. In recent years, some multiview clustering algorithms have been proposed and applied to identify cancer subtypes. However, these clustering algorithms ignore that each data type contributes differently to the clustering results during the fusion process, and they require additional clustering steps to generate the final labels. In this paper, a new one-step method for cancer subtype recognition based on a graph learning framework is designed, called Laplacian Rank Constrained Multiview Clustering (LRCMC). LRCMC first forms a graph for each single biological data type to reveal the relationship between data points and uses an affinity matrix to encode the graph structure. Then, it adds weights to measure the contribution of each graph and finally merges these individual graphs into a consensus graph. In addition, LRCMC constructs adaptive neighbors to adjust the similarity of sample points, and it uses a rank constraint on the Laplacian matrix to ensure that each graph structure has the same connected components. Experiments on several benchmark datasets and The Cancer Genome Atlas (TCGA) datasets have demonstrated the effectiveness of the proposed algorithm compared with state-of-the-art methods.
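The rank constraint mentioned in this abstract rests on a standard fact from spectral graph theory: for a graph on n nodes with non-negative edge weights, the Laplacian L = D - W has rank n - c exactly when the graph has c connected components. The sketch below only illustrates that fact on a toy graph; it is not the LRCMC optimization itself, and the function names and example weights are hypothetical. Exact rational arithmetic is used so the rank is not affected by rounding.

```python
from fractions import Fraction


def laplacian(weights):
    """Graph Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def matrix_rank(matrix):
    """Rank via Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def connected_components(weights):
    """Number of connected components, read off the Laplacian rank."""
    return len(weights) - matrix_rank(laplacian(weights))


# Toy graph: edges 0-1 and 2-3 form two separate clusters.
two_clusters = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
# Adding edge 1-2 joins them into a single component.
one_cluster = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(connected_components(two_clusters), connected_components(one_cluster))
```

Constraining the Laplacian rank to n - c is therefore one way to force a learned graph to split into exactly c groups, which is why such methods can assign cluster labels without a separate clustering step.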
|
47
|
Ma Y, Li Q, Hu N, Li L. SeBioGraph: Semi-supervised Deep Learning for the Graph via Sustainable Knowledge Transfer. Front Neurorobot 2021; 15:665055. [PMID: 33867966 PMCID: PMC8047129 DOI: 10.3389/fnbot.2021.665055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/07/2021] [Accepted: 03/09/2021] [Indexed: 11/17/2022] Open
Abstract
Semi-supervised deep learning for the biomedical graph and advanced manufacturing graph is rapidly becoming an important topic in both academia and industry. Many existing studies focus on semi-supervised link prediction and node classification, as well as the application of these methods in sustainable development and advanced manufacturing. To date, most manufacturing graph neural networks are mainly evaluated on social and information networks, which improve the quality of network representation by integrating neighbor node descriptions. However, previous methods have not yet been comprehensively studied on biomedical networks. Traditional techniques fail to achieve satisfying results, especially when labeled nodes are few in number. In this paper, a new semi-supervised deep learning method for the biomedical graph via sustainable knowledge transfer, called SeBioGraph, is proposed. In SeBioGraph, both node embedding and graph-specific prototype embedding are used to characterize a transferable metric space. By incorporating prior knowledge learned from auxiliary graphs, SeBioGraph further improves performance on the target graph. Experimental results on two-class node classification tasks and three-class link prediction tasks demonstrate that SeBioGraph achieves state-of-the-art results. Finally, the method is thoroughly evaluated.
Affiliation(s)
- Yugang Ma
  - School of Architecture and Urban Planning, Chongqing University, Chongqing, China
- Qing Li
  - School of Computer Science, Northwestern Polytechnical University, Shaanxi, China
- Nan Hu
  - School of Management Science and Real Estate, Chongqing University, Chongqing, China
- Lili Li
  - China Construction Science & Technology Group Co., Ltd. Shenzhen, China
  - College of Civil and Environmental Engineering, Harbin Institute of Technology, Harbin, China
|
48
|
Jarada TN, Rokne JG, Alhajj R. SNF-NN: computational method to predict drug-disease interactions using similarity network fusion and neural networks. BMC Bioinformatics 2021; 22:28. [PMID: 33482713 PMCID: PMC7821180 DOI: 10.1186/s12859-020-03950-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/12/2020] [Accepted: 12/22/2020] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Drug repositioning is an emerging approach in pharmaceutical research for identifying novel therapeutic potentials for approved drugs and discovering therapies for untreated diseases. Due to its time and cost efficiency, drug repositioning plays an instrumental role in optimizing the drug development process compared to the traditional de novo drug discovery process. Advances in genomics, together with the enormous growth of large-scale publicly available data and the availability of high-performance computing capabilities, have further motivated the development of computational drug repositioning approaches. More recently, the rise of machine learning techniques, together with the availability of powerful computers, has made computational drug repositioning an area of intense activity. RESULTS In this study, a novel framework based on deep learning, SNF-NN, is presented, in which novel drug-disease interactions are predicted using drug-related similarity information, disease-related similarity information, and known drug-disease interactions. Heterogeneous similarity information related to drugs and diseases is fed to the proposed framework in order to predict novel drug-disease interactions. SNF-NN uses similarity selection, similarity network fusion, and a highly tuned novel neural network model to predict new drug-disease interactions. The robustness of SNF-NN is evaluated by comparing its performance with nine baseline machine learning methods. The proposed framework outperforms all baseline methods ([Formula: see text] = 0.867, and [Formula: see text]=0.876) using stratified 10-fold cross-validation. To further demonstrate the reliability and robustness of SNF-NN, two datasets are used to fairly validate the proposed framework's performance against seven recent state-of-the-art methods for drug-disease interaction prediction. 
SNF-NN achieves remarkable performance in stratified 10-fold cross-validation with [Formula: see text] ranging from 0.879 to 0.931 and [Formula: see text] from 0.856 to 0.903. Moreover, the efficiency of SNF-NN is verified by validating predicted unknown drug-disease interactions against clinical trials and published studies. CONCLUSION In conclusion, computational drug repositioning research can significantly benefit from integrating similarity measures in heterogeneous networks and deep learning models for predicting novel drug-disease interactions. The data and implementation of SNF-NN are available at http://pages.cpsc.ucalgary.ca/ tnjarada/snf-nn.php .
Affiliation(s)
- Tamer N Jarada
  - Department of Computer Science, University of Calgary, Calgary, AB, Canada
- Jon G Rokne
  - Department of Computer Science, University of Calgary, Calgary, AB, Canada
- Reda Alhajj
  - Department of Computer Science, University of Calgary, Calgary, AB, Canada
  - Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey
  - Department of Health Informatics, University of Southern Denmark, Odense, Denmark
|
49
|
Shi W, Chen X, Deng L. A Review of Recent Developments and Progress in Computational Drug Repositioning. Curr Pharm Des 2021; 26:3059-3068. [PMID: 31951162 DOI: 10.2174/1381612826666200116145559] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/04/2019] [Accepted: 01/09/2020] [Indexed: 12/27/2022]
Abstract
Computational drug repositioning is an efficient approach towards discovering new indications for existing drugs. In recent years, with the accumulation of online health-related information and the extensive use of biomedical databases, computational drug repositioning approaches have achieved significant progress in drug discovery. In this review, we summarize recent advancements in drug repositioning. First, we describe the available data sources that are conducive to identifying novel indications. Furthermore, we provide a summary of the commonly used computing approaches. For each method, we briefly describe the techniques, case studies, and evaluation criteria. Finally, we discuss the limitations of the existing computing approaches.
Affiliation(s)
- Wanwan Shi
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Xuegong Chen
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha, China
|
50
|
Li J, Zhang S, Wan Y, Zhao Y, Shi J, Zhou Y, Cui Q. MISIM v2.0: a web server for inferring microRNA functional similarity based on microRNA-disease associations. Nucleic Acids Res 2020; 47:W536-W541. [PMID: 31069374 PMCID: PMC6602518 DOI: 10.1093/nar/gkz328] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 01/27/2019] [Revised: 04/14/2019] [Accepted: 04/25/2019] [Indexed: 01/11/2023] Open
Abstract
MicroRNAs (miRNAs) are one class of important small non-coding RNA molecules and play critical roles in health and disease. Therefore, it is important and necessary to evaluate the functional relationship of miRNAs and then predict novel miRNA-disease associations. For this purpose, here we developed the updated web server MISIM (miRNA similarity) v2.0. Besides a 3-fold increase in data content compared with MISIM v1.0, MISIM v2.0 improved the original MISIM algorithm by implementing both positive and negative miRNA-disease associations. That is, the MISIM v2.0 scores could be positive or negative, whereas MISIM v1.0 only produced positive scores. Moreover, MISIM v2.0 achieved an algorithm for novel miRNA-disease prediction based on MISIM v2.0 scores. Finally, MISIM v2.0 provided network visualization and functional enrichment analysis for functionally paired miRNAs. The MISIM v2.0 web server is freely accessible at http://www.lirmed.com/misim/.
Affiliation(s)
- Jianwei Li
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- Shan Zhang
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Yanping Wan
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Yingshu Zhao
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Jiangcheng Shi
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- Yuan Zhou
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
- Qinghua Cui
  - Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
  - Sanbo Brain Institute, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China
|