1
Jiang X, Wen L, Li W, Que D, Ming L. DTGHAT: multi-molecule heterogeneous graph transformer based on multi-molecule graph for drug-target identification. Front Pharmacol 2025; 16:1596216. [PMID: 40356956 PMCID: PMC12066497 DOI: 10.3389/fphar.2025.1596216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2025] [Accepted: 04/14/2025] [Indexed: 05/15/2025] [Open Access]
Abstract
Introduction: Drug target identification is a fundamental step in drug discovery and plays a pivotal role in the development of new therapies. Existing computational methods focus on the direct interactions between drugs and targets, often ignoring the complex interrelationships between drugs, targets and various biomolecules in the human system. Method: To address this limitation, we propose a novel prediction model named DTGHAT (Drug and Target Association Prediction using Heterogeneous Graph Attention Transformer based on Molecular Heterogeneous). DTGHAT utilizes a graph attention transformer to identify novel targets from 15 heterogeneous drug-gene-disease networks characterized by chemical, genomic, phenotypic, and cellular networks. Result: In a 5-fold cross-validation study, DTGHAT achieved an area under the receiver operating characteristic curve (AUC) of 0.9634, which is at least 4% higher than current state-of-the-art methods. Characterization ablation experiments highlight the importance of integrating biomolecular data from multiple sources in revealing drug-target interactions. In addition, a case study on cancer drugs further validates DTGHAT's effectiveness in identifying novel drug targets. DTGHAT is freely available at: https://github.com/stella-007/DTGHAT.git.
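For readers unfamiliar with the AUC metric quoted in this abstract, the following is a minimal sketch of how an AUC can be computed from labels and prediction scores. It is not DTGHAT's evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = known drug-target pair, 0 = unlabeled pair.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In a 5-fold cross-validation setting, a function like this would typically be applied to the held-out fold in each round and the five values averaged.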
Affiliation(s)
- Xinchen Jiang
  - The National Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
  - Hunan Provincial Key Laboratory of Neurorestoratology, The Second Affiliated Hospital of Hunan Normal University, Changsha, China
- Lu Wen
  - Hunan Provincial Key Laboratory of Neurorestoratology, The Second Affiliated Hospital of Hunan Normal University, Changsha, China
  - Department of Ophthalmology, 921 Hospital of Joint Logistics Support Force People’s Liberation Army of China (The Second Affiliated Hospital of Hunan Normal University), Changsha, China
- Wenshui Li
  - The National Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
  - Hunan Provincial Key Laboratory of Neurorestoratology, The Second Affiliated Hospital of Hunan Normal University, Changsha, China
- Deng Que
  - The National Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
  - Hunan Provincial Key Laboratory of Neurorestoratology, The Second Affiliated Hospital of Hunan Normal University, Changsha, China
  - Department of Neurology, 921 Hospital of Joint Logistics Support Force People’s Liberation Army of China (The Second Affiliated Hospital of Hunan Normal University), Changsha, China
- Lu Ming
  - The National Local Joint Engineering Laboratory of Animal Peptide Drug Development, College of Life Sciences, Hunan Normal University, Changsha, China
  - Hunan Provincial Key Laboratory of Neurorestoratology, The Second Affiliated Hospital of Hunan Normal University, Changsha, China
2
Luo H, Yang H, Zhang G, Wang J, Luo J, Yan C. KGRDR: a deep learning model based on knowledge graph and graph regularized integration for drug repositioning. Front Pharmacol 2025; 16:1525029. [PMID: 40008124 PMCID: PMC11850324 DOI: 10.3389/fphar.2025.1525029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2024] [Accepted: 01/13/2025] [Indexed: 02/27/2025] [Open Access]
Abstract
Computational drug repositioning, serving as an effective alternative to traditional drug discovery, plays a key role in optimizing drug development. This approach can accelerate the development of new therapeutic options while reducing costs and mitigating risks. In this study, we propose a novel deep learning-based framework, KGRDR, which combines multi-similarity integration and knowledge graph learning to predict potential drug-disease interactions. Specifically, a graph regularized approach is applied to integrate multiple drug and disease similarity information, which can effectively eliminate noisy data and obtain integrated similarity features of drugs and diseases. Then, topological feature representations of drugs and diseases are learned from constructed biomedical knowledge graphs (KGs) that encompass known drug-related and disease-related interactions. Next, the similarity features and topological features are fused using an attention-based feature fusion method. Finally, drug-disease associations are predicted using a graph convolutional network. Experimental results demonstrate that KGRDR achieves better performance than state-of-the-art drug-disease prediction methods. Moreover, case study results further validate the effectiveness of KGRDR in predicting novel drug-disease interactions.
Affiliation(s)
- Huimin Luo
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, China
- Hui Yang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, China
- Ge Zhang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, China
- Jianlin Wang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, China
- Junwei Luo
  - College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
- Chaokun Yan
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, China
  - Academy for Advanced Interdisciplinary Studies, Henan University, Zhengzhou, China
3
Picard M, Leclercq M, Bodein A, Scott-Boyer MP, Perin O, Droit A. Improving drug repositioning with negative data labeling using large language models. J Cheminform 2025; 17:16. [PMID: 39905466 DOI: 10.1186/s13321-025-00962-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Accepted: 01/20/2025] [Indexed: 02/06/2025] [Open Access]
Abstract
INTRODUCTION: Drug repositioning offers numerous advantages, such as faster development timelines, reduced costs, and lower failure rates in drug development. Supervised machine learning is commonly used to score drug candidates but is hindered by the lack of reliable negative data (drugs that fail due to inefficacy or toxicity), which are difficult to access; this lowers prediction accuracy and generalization. Positive-Unlabeled (PU) learning has been used to overcome this issue by either randomly sampling unlabeled drugs or identifying probable negatives, but it still suffers from misclassification or oversimplified decision boundaries. RESULTS: We proposed a novel strategy using Large Language Models (GPT-4) to analyze all clinical trials on prostate cancer and systematically identify true negatives. This approach showed remarkable improvement in predictive accuracy on independent test sets, with a Matthews Correlation Coefficient of 0.76 (± 0.33) compared to 0.55 (± 0.15) and 0.48 (± 0.18) for two commonly used PU learning approaches. Using our labeling strategy, we created a training set of 26 positive and 54 experimentally validated negative drugs. We then applied a machine learning ensemble to this new dataset to assess the repurposing potential of the remaining 11,043 drugs in the DrugBank database. This analysis identified 980 potential candidates for prostate cancer. A detailed review of the top 30 revealed 9 promising drugs targeting various mechanisms such as genomic instability, p53 regulation, or TMPRSS2-ERG fusion. CONCLUSION: By expanding our negative data labeling approach to all diseases within the ClinicalTrials.gov database, our method could greatly advance supervised drug repositioning, offering a more accurate and data-driven path for discovering new treatments.
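The Matthews Correlation Coefficient (MCC) used as the headline metric in this abstract is computed from the four cells of a binary confusion matrix. The sketch below shows the standard formula; the function name `mcc` and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a small test set of 12 drugs.
example_mcc = mcc(tp=6, tn=3, fp=1, fn=2)
```

Unlike accuracy, the MCC stays informative when positives and negatives are imbalanced, which is relevant for a training set with 26 positives and 54 negatives.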
Affiliation(s)
- Milan Picard
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Mickael Leclercq
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Antoine Bodein
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Marie Pier Scott-Boyer
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Olivier Perin
  - Digital Transformation and Innovation Department, L'Oréal Advanced Research, Aulnay-Sous-Bois, France
- Arnaud Droit
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
4
Dao NA, Le MH, Dang XT. Label Transfer for Drug Disease Association in Three Meta-Paths. Evol Bioinform Online 2024; 20:11769343241272414. [PMID: 39279816 PMCID: PMC11401013 DOI: 10.1177/11769343241272414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 07/15/2024] [Indexed: 09/18/2024] [Open Access]
Abstract
The identification of potential interactions and relationships between diseases and drugs is significant in public health care and drug discovery. Determining drug-disease interactions experimentally is very expensive in both time and money, and many potential drug-disease associations remain undiscovered. Therefore, the development of computational methods to explore the relationship between drugs and diseases is essential. Many computational methods for predicting drug-disease associations have been developed that learn from known interactions to infer potential interactions of unknown drug-disease pairs. In this paper, we propose 3 new main groups of meta-paths based on the heterogeneous biological network of drug-protein-disease objects. For each meta-path, we design a machine learning model, and these models are then combined into an integrated learning method. We evaluated our approach on 3 standard datasets: DrugBank, OMIM, and Gottlieb's dataset. Experimental results demonstrate that the proposed method is better than some recent methods, such as EMP-SVD, LRSSL, MBiRW, MPG-DDA, SCMFDD, and others, on some measures such as AUC, AUPR, and F1-score.
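To make the notion of a meta-path concrete, the sketch below counts drug-protein-disease paths by multiplying two adjacency matrices. This is a generic illustration only: the abstract does not specify its three meta-path groups, and the matrices, the helper `matmul`, and the variable names are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two rectangular lists of lists."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


# Hypothetical toy network: 2 drugs, 3 proteins, 2 diseases.
drug_protein = [
    [1, 1, 0],  # drug 0 targets proteins 0 and 1
    [0, 1, 1],  # drug 1 targets proteins 1 and 2
]
protein_disease = [
    [1, 0],  # protein 0 is linked to disease 0
    [1, 1],  # protein 1 is linked to diseases 0 and 1
    [0, 1],  # protein 2 is linked to disease 1
]

# Entry [d][s] counts the drug-protein-disease paths from drug d to disease s.
path_counts = matmul(drug_protein, protein_disease)
```

Path counts of this kind are one common way to turn a meta-path into a feature for a downstream classifier; whether the paper uses counts or another encoding is not stated in the abstract.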
5
Zhu Y, Ning C, Zhang N, Wang M, Zhang Y. GSRF-DTI: a framework for drug-target interaction prediction based on a drug-target pair network and representation learning on a large graph. BMC Biol 2024; 22:156. [PMID: 39020316 PMCID: PMC11256582 DOI: 10.1186/s12915-024-01949-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 07/01/2024] [Indexed: 07/19/2024] [Open Access]
Abstract
BACKGROUND: Identification of potential drug-target interactions (DTIs) with high accuracy is a key step in drug discovery and repositioning, especially concerning specific drug targets. Traditional experimental methods for identifying the DTIs are arduous, time-intensive, and financially burdensome. In addition, robust computational methods have been developed for predicting the DTIs and are widely applied in drug discovery research. However, advancing more precise algorithms for predicting DTIs is essential to meet the stringent standards demanded by drug discovery. RESULTS: We proposed a novel method called GSRF-DTI, which integrates networks with a deep learning algorithm to identify DTIs. Firstly, GSRF-DTI learned the embedding representation of drugs and targets by integrating multiple drug association information and target association information, respectively. Then, GSRF-DTI considered the influence of drug-target pair (DTP) association on DTI prediction to construct a drug-target pair network (DTP-NET). Next, we utilized GraphSAGE on DTP-NET to learn the potential features of the network and applied random forest (RF) to predict the DTIs. Furthermore, we conducted ablation experiments to validate the necessity of integrating different types of network features for identifying DTIs. It is worth noting that GSRF-DTI proposed three novel DTIs. CONCLUSIONS: GSRF-DTI not only considered the influence of the interaction relationship between drug and target but also considered the impact of DTP association relationship on DTI prediction. We initially use GraphSAGE to aggregate the neighbor information of nodes for better identification. Experimental analysis on Luo's dataset and the newly constructed dataset revealed that the GSRF-DTI framework outperformed several state-of-the-art methods significantly.
Affiliation(s)
- Yongdi Zhu
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Chunhui Ning
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
- Mingyi Wang
  - Department of Central Lab, Weihai Municipal Hospital, Weihai, Shandong, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, China
6
Li N, Yang Z, Wang J, Lin H. Drug-target interaction prediction using knowledge graph embedding. iScience 2024; 27:109393. [PMID: 38952679 PMCID: PMC11215290 DOI: 10.1016/j.isci.2024.109393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 01/16/2024] [Accepted: 02/28/2024] [Indexed: 07/03/2024] [Open Access]
Abstract
The prediction of drug-target interactions (DTIs) is a critical phase in the sustainable drug development process, especially when the research focus is to capitalize on the repositioning of existing drugs. Computational approaches to predicting DTIs can provide important insights into drug mechanisms of action. However, current methods for predicting DTIs based on the structural information of the knowledge graph may suffer from the sparseness and incompleteness of the knowledge graph and neglect the latent type information of the knowledge graph. In this paper, we propose TTModel, a knowledge graph embedding model for DTI prediction. By exploiting biomedical text and type information, TTModel can learn latent text semantics and type information to improve the performance of representation learning. Comprehensive experiments on two public datasets demonstrate that our model outperforms the state-of-the-art methods significantly on the task of DTI prediction.
Affiliation(s)
- Nan Li
  - College of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Zhihao Yang
  - College of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Jian Wang
  - College of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Hongfei Lin
  - College of Computer Science and Technology, Dalian University of Technology, Dalian, China
7
Li J, Yang X, Guan Y, Pan Z. Prediction of Drug–Target Interaction Using Dual-Network Integrated Logistic Matrix Factorization and Knowledge Graph Embedding. Molecules 2022; 27:5131. [PMID: 36014371 PMCID: PMC9412517 DOI: 10.3390/molecules27165131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2022] [Revised: 08/02/2022] [Accepted: 08/05/2022] [Indexed: 11/16/2022] [Open Access]
Abstract
Nowadays, drug–target interaction (DTI) prediction is a fundamental part of drug repositioning. However, on the one hand, DTI prediction models usually consider drug or target information while ignoring prior knowledge between drugs and targets. On the other hand, models incorporating prior knowledge cannot predict interactions for under-studied drugs and targets. Hence, this article proposes a novel dual-network integrated logistic matrix factorization DTI prediction scheme (Ro-DNILMF) via a knowledge graph embedding approach. This model adds prior knowledge as input data into the prediction model and inherits the advantages of the DNILMF model, which can predict under-studied drug–target interactions. Firstly, a knowledge graph embedding model based on relational rotation (RotatE) is trained to construct the interaction adjacency matrix and integrate prior knowledge. Secondly, a dual-network integrated logistic matrix factorization prediction model (DNILMF) is used to predict new drugs and targets. Finally, several experiments conducted on public datasets demonstrate that the proposed method outperforms the single baseline model and some mainstream methods on efficiency.
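The relational-rotation idea behind RotatE, mentioned in this abstract, can be sketched in a few lines: each relation rotates the head entity's complex-valued embedding element-wise, and a triple is plausible when the rotated head lands close to the tail. The code below is a minimal illustration of that distance, not the Ro-DNILMF implementation; the function name `rotate_distance` and the toy embeddings are assumptions.

```python
import cmath
import math


def rotate_distance(head, phases, tail):
    """Sum of moduli of (head rotated by the relation) minus tail.

    head, tail: lists of complex numbers (entity embeddings).
    phases: list of rotation angles in radians (relation embedding);
            each angle defines a unit-modulus complex rotation.
    A smaller distance means a more plausible (head, relation, tail) triple.
    """
    return sum(
        abs(h * cmath.exp(1j * theta) - t)
        for h, theta, t in zip(head, phases, tail)
    )


# Hypothetical 2-dimensional embeddings for one drug, one relation, one target.
drug = [1 + 0j, 0 + 1j]
interacts_with = [math.pi / 2, math.pi]  # rotate by 90 and 180 degrees
target = [0 + 1j, 0 - 1j]

distance = rotate_distance(drug, interacts_with, target)
```

Here the rotated drug embedding coincides with the target embedding, so the distance is numerically zero; a mismatched target would give a larger value.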
Affiliation(s)
- Jiaxin Li
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- Xixin Yang
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
  - School of Automation, Qingdao University, Qingdao 266017, China
  - Corresponding author
- Yuanlin Guan
  - Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
  - School of Mechanical & Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
- Zhenkuan Pan
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
8
Zhao L, Zhu Y, Wang J, Wen N, Wang C, Cheng L. A brief review of protein-ligand interaction prediction. Comput Struct Biotechnol J 2022; 20:2831-2838. [PMID: 35765652 PMCID: PMC9189993 DOI: 10.1016/j.csbj.2022.06.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/14/2022] [Revised: 05/30/2022] [Accepted: 06/01/2022] [Indexed: 01/21/2023] [Open Access]
Abstract
The task of identifying protein–ligand interactions (PLIs) plays a prominent role in the field of drug discovery. However, it is infeasible to identify potential PLIs via costly and laborious in vitro experiments. There is a need to develop computational PLI prediction approaches to speed up the drug discovery process. In this review, we briefly introduce various computation-based PLI prediction approaches. We discuss these approaches, in particular machine learning-based methods, with illustrations of different emphases based on mainstream trends. Moreover, we analyze three research dynamics that can be further explored in future studies.
Affiliation(s)
- Lingling Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Yan Zhu
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Junjie Wang
  - Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Naifeng Wen
  - School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
  - Corresponding author
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, China
  - Corresponding author
9
Shang Y, Gao L, Zou Q, Yu L. Prediction of drug-target interactions based on multi-layer network representation learning. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.12.068] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 02/06/2023]
10
Rui M, Pang H, Ji W, Wang S, Yu X, Wang L, Feng C. Development of simultaneous interaction prediction approach (SiPA) for the expansion of interaction network of traditional Chinese medicine. Chin Med 2020; 15:90. [PMID: 32863859 PMCID: PMC7448979 DOI: 10.1186/s13020-020-00369-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/19/2020] [Accepted: 08/19/2020] [Indexed: 11/21/2022] [Open Access]
Abstract
Background: Due to the lack of sufficient interaction data among compositions, targets and diseases, it is difficult to construct a complete network of Traditional Chinese Medicine (TCM) that comprehensively reflects active compositions and their synergistic network in terms of specific diseases. Therefore, mapping the full spectrum of interactions between compounds and their targets is of central importance when a network pharmacology approach is used to explore the therapeutic potential of TCM. Methods: To address this challenge, we developed a large-scale simultaneous interaction prediction approach (SiPA) that integrates an interaction-network-based simple inference model (SIM), focusing on ‘logical relevance’ between compounds, proteins or diseases, with a compound-target correlation space based interaction prediction model (CTCS-IPM) built on canonical correlation analysis (CCA) to estimate the position of compounds (or targets) in the compound-protein correlated space. SiPA was then applied to discover reliable multiple interactions for interaction network expansion of a TCM, compound Salvia miltiorrhiza. By means of network analysis, potential active compounds and their related network synergy underlying cardiovascular diseases were compared between the expanded and original interaction networks. Some of the new interactions were validated with existing experimental evidence and molecular docking. Results: Evaluated on a known test dataset, the combined approach made highly accurate predictions, showing good prediction performance for the SIM and a high recall rate of 85.2% for the CTCS-IPM. In total, 710 pairs of new compound-target interactions, 24 pairs of new compound-cardiovascular disease interactions and 294 pairs of new cardiovascular disease-protein interactions were predicted for compound Salvia miltiorrhiza. Results of network analysis suggested that the network expansion could dramatically improve the completeness and effectiveness of the network. Validation against the literature and by molecular docking indicated that the inferred interactions had good reliability. Conclusions: We provided a practical and efficient way for large-scale inference of multiple interactions of TCM ingredients, which was not limited by the lack of negative samples, sample size and target 3D structures. SiPA could help researchers more accurately prioritize the effective compounds and more completely explore the network synergy of TCM for treating specific diseases, indicating a potential way to effectively identify candidate compounds (or targets) in drug discovery.
Affiliation(s)
- Mengjie Rui
  - School of Pharmacy, Jiangsu University, Zhenjiang, 212013 People's Republic of China
- Hui Pang
  - School of Pharmacy, Jiangsu University, Zhenjiang, 212013 People's Republic of China
- Wei Ji
  - School of Pharmacy, Jiangsu University, Zhenjiang, 212013 People's Republic of China
- Siqi Wang
  - School of Pharmacy, Jiangsu University, Zhenjiang, 212013 People's Republic of China
- Xuefei Yu
  - School of Pharmacy, Jiangsu University, Zhenjiang, 212013 People's Republic of China
- Lilong Wang
  - School of Pharmacy, Jiangsu University, Zhenjiang, 212013 People's Republic of China
- Chunlai Feng
  - School of Pharmacy, Jiangsu University, Zhenjiang, 212013 People's Republic of China
11
Mohamed SK, Nounu A. Predicting The Effects of Chemical-Protein Interactions On Proteins Using Tensor Factorisation. AMIA Jt Summits Transl Sci Proc 2020; 2020:430-439. [PMID: 32477664 PMCID: PMC7233103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Understanding the different effects of chemical substances on human proteins is fundamental for designing new drugs. It is also important for elucidating the different mechanisms of action of drugs that can cause side-effects. In this context, computational methods for predicting chemical-protein interactions can provide valuable insights into the relation between therapeutic chemical substances and proteins. Their predictions can therefore help in multiple tasks such as drug repurposing and identifying new drug side-effects. Despite their useful predictions, these methods are unable to predict the different implications of chemical-protein interactions, such as changes in protein expression or abundance. Therefore, in this work, we study the modelling of the effects of chemical-protein interactions on protein activity using computational approaches. We propose using 3D tensors to model chemicals, their target proteins and the effects associated with their interactions. We then use multi-part embedding tensor factorisation to predict the different effects of chemicals on human proteins. We assess the predictive accuracy of our proposed method using a benchmark dataset that we built. We then show by computational experimental evaluation that our approach outperforms other tensor factorisation methods in the task of predicting effects of chemicals on human proteins.
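As a rough illustration of the 3D-tensor view described in this abstract, the sketch below stores (chemical, protein, effect) facts in a binary tensor and scores a candidate triple with a plain CP-style trilinear product. Note that this swaps in ordinary CP scoring for simplicity; it is not the multi-part embedding factorisation the authors propose, and all names, indices and vectors are hypothetical.

```python
def build_tensor(triples, n_chem, n_prot, n_effect):
    """Binary 3D tensor with 1 at [chemical][protein][effect] for each known triple."""
    tensor = [
        [[0] * n_effect for _ in range(n_prot)]
        for _ in range(n_chem)
    ]
    for chem, prot, effect in triples:
        tensor[chem][prot][effect] = 1
    return tensor


def cp_score(chem_vec, prot_vec, effect_vec):
    """CP-style trilinear score: sum over dimensions of the element-wise product."""
    return sum(c * p * e for c, p, e in zip(chem_vec, prot_vec, effect_vec))


# Hypothetical fact: chemical 0 has effect 2 (say, "decreases expression") on protein 1.
known = [(0, 1, 2)]
tensor = build_tensor(known, n_chem=2, n_prot=2, n_effect=3)

# Hypothetical 2-dimensional embeddings; in practice these would be learned
# so that high scores reproduce the 1-entries of the tensor.
score = cp_score([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])
```

A factorisation model fits one embedding per chemical, protein and effect so that scores like this are high for observed triples and low elsewhere; unseen cells with high scores become predictions.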
Affiliation(s)
- Sameh K Mohamed
  - Data Science Institute, NUI Galway, Galway, Ireland
  - Insight Centre for Data Analytics, NUI Galway, Galway, Ireland
- Aayah Nounu
  - Insight Centre for Data Analytics, NUI Galway, Galway, Ireland
12
Nováček V, Mohamed SK. Predicting Polypharmacy Side-effects Using Knowledge Graph Embeddings. AMIA Jt Summits Transl Sci Proc 2020; 2020:449-458. [PMID: 32477666 PMCID: PMC7233093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Polypharmacy is the use of drug combinations and is commonly used for treating complex and terminal diseases. Despite its effectiveness in many cases, it poses high risks of adverse side effects. Polypharmacy side-effects occur due to unwanted interactions of combined drugs, and they can cause severe complications in patients, increasing the risk of morbidity and mortality. The use of polypharmacy is currently in its early stages; thus, knowledge of its probable side-effects is limited. This has encouraged multiple works to investigate machine learning techniques to efficiently and reliably predict adverse effects of drug combinations. In this context, the Decagon model is known to provide state-of-the-art results. It models polypharmacy side-effect data as a knowledge graph and formulates finding possible adverse effects as a link prediction task over the knowledge graph. The link prediction is solved using an embedding model based on graph convolutions. Despite its effectiveness, the Decagon approach still suffers from a high rate of false positives. In this work, we propose a new knowledge graph embedding technique that uses multi-part embedding vectors to predict polypharmacy side-effects. As in the Decagon model, we model polypharmacy side effects as a knowledge graph. However, we perform the link prediction task using an approach based on tensor decomposition. Our experimental evaluation shows that our approach outperforms the Decagon model with 12% and 16% margins in terms of the area under the ROC and precision-recall curves, respectively.
Affiliation(s)
- Vít Nováček
  - Data Science Institute, NUI Galway
  - Insight Centre for Data Analytics, NUI Galway
- Sameh K Mohamed
  - Data Science Institute, NUI Galway
  - Insight Centre for Data Analytics, NUI Galway
13
Wang X, Liu Y, Lu F, Li H, Gao P, Wei D. Dipeptide Frequency of Word Frequency and Graph Convolutional Networks for DTA Prediction. Front Bioeng Biotechnol 2020; 8:267. [PMID: 32318557 PMCID: PMC7147459 DOI: 10.3389/fbioe.2020.00267] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/30/2020] [Accepted: 03/13/2020] [Indexed: 11/13/2022] [Open Access]
Abstract
Deep learning is an effective method to capture drug-target binding affinity, but low accuracy is still an obstacle to be overcome. Thus, we propose a novel predictor for drug-target binding affinity based on dipeptide frequency of word frequency encoding and a hybrid graph convolutional network. Word frequency characteristics of natural language are used to improve the frequency characteristics of peptides to express target proteins. For each drug molecule, the five different features of drug atoms and the atomic bond relationships are expressed as graphs. The obtained protein features and graph structure are used as the input of a convolutional neural network and of a graph convolutional neural network, respectively. A prediction model is established to predict the drug affinity by calculating the hidden relationship. In the KIBA data set test experiment, the consistency coefficient of the model is 0.901, which is 0.01 higher than the existing model, and the MSE (mean square error) of the model is 0.126, which is 5% lower than the existing model. In the Davis data set test experiment, the consistency coefficient of the model is 0.895, which is 0.006 higher than the existing model, and the MSE of the model is 0.220, which is 4% lower than the existing model. These results show that our proposed method can not only predict the affinity better than existing models, but also outperform unitary deep learning approaches.
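The two metrics reported in this abstract can be sketched as follows, assuming that the "consistency coefficient" refers to the concordance index (CI) commonly reported on the KIBA and Davis benchmarks; that reading is an assumption, as are the function names and the toy affinity values.

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    A pair (i, j) is comparable when y_true[i] > y_true[j]; tied predictions
    count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(y_true)):
        for j in range(len(y_true)):
            if y_true[i] > y_true[j]:
                comparable += 1
                if y_pred[i] > y_pred[j]:
                    concordant += 1.0
                elif y_pred[i] == y_pred[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


# Hypothetical affinities for three drug-target pairs.
measured = [1.0, 2.0, 3.0]
predicted = [1.0, 3.0, 2.0]

toy_ci = concordance_index(measured, predicted)
toy_mse = mse(measured, predicted)
```

The CI rewards getting the ranking of affinities right, while the MSE penalizes the size of each error, which is why the two are usually reported together.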
Affiliation(s)
- Xianfang Wang
  - School of Computer Science and Technology, Henan Institute of Technology, Xinxiang, China
  - School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Yifeng Liu
  - School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Fan Lu
  - School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Hongfei Li
  - School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Peng Gao
  - School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Dongqing Wei
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
14
Thafar M, Raies AB, Albaradei S, Essack M, Bajic VB. Comparison Study of Computational Prediction Tools for Drug-Target Binding Affinities. Front Chem 2019; 7:782. [PMID: 31824921 PMCID: PMC6879652 DOI: 10.3389/fchem.2019.00782] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 08/09/2019] [Accepted: 10/30/2019] [Indexed: 12/30/2022] [Open Access]
Abstract
Drug development is generally arduous and costly, with low success rates. Thus, the identification of drug-target interactions (DTIs) has become a crucial step in the early stages of drug discovery. Consequently, computational approaches capable of identifying potential DTIs with a minimal error rate are increasingly being pursued. These computational approaches aim to narrow down the search space for novel DTIs and shed light on the context in which drugs function. Most methods developed to date use binary classification to predict whether an interaction between a drug and its target exists. However, it is more informative, but also more challenging, to predict the strength of the binding between a drug and its target. If that strength is not sufficient, such a DTI may not be useful. Therefore, methods developed to predict drug-target binding affinities (DTBA) are of great value. In this study, we provide a comprehensive overview of the existing methods that predict DTBA. We focus on the methods developed using artificial intelligence (AI), machine learning (ML), and deep learning (DL) approaches, as well as related benchmark datasets and databases. Furthermore, guidance and recommendations are provided that cover the gaps and directions of upcoming work in this research area. To the best of our knowledge, this is the first comprehensive comparison analysis of tools focused on DTBA with reference to AI/ML/DL.
Affiliation(s)
- Maha Thafar
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Arwa Bin Raies
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Somayah Albaradei
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Magbubah Essack
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Vladimir B. Bajic
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia