51
|
Matrix factorization with denoising autoencoders for prediction of drug–target interactions. Mol Divers 2022:10.1007/s11030-022-10492-8. [DOI: 10.1007/s11030-022-10492-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2022] [Accepted: 07/01/2022] [Indexed: 11/25/2022]
|
52
|
Cheng Z, Yan C, Wu FX, Wang J. Drug-Target Interaction Prediction Using Multi-Head Self-Attention and Graph Attention Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2208-2218. [PMID: 33956632 DOI: 10.1109/tcbb.2021.3077905] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 06/12/2023]
Abstract
Identifying drug-target interactions (DTIs) is an important step in the process of new drug discovery and drug repositioning. Accurate predictions for DTIs can improve the efficiency in the drug discovery and development. Although rapid advances in deep learning technologies have generated various computational methods, it is still appealing to further investigate how to design efficient networks for predicting DTIs. In this study, we propose an end-to-end deep learning method (called MHSADTI) to predict DTIs based on the graph attention network and multi-head self-attention mechanism. First, the characteristics of drugs and proteins are extracted by the graph attention network and multi-head self-attention mechanism, respectively. Then, the attention scores are used to consider which amino acid subsequence in a protein is more important for the drug to predict its interactions. Finally, we predict DTIs by a fully connected layer after obtaining the feature vectors of drugs and proteins. MHSADTI takes advantage of self-attention mechanism for obtaining long-dependent contextual relationship in amino acid sequences and predicting DTI interpretability. More effective molecular characteristics are also obtained by the attention mechanism in graph attention networks. Multiple cross validation experiments are adopted to assess the performance of our MHSADTI. The experiments on four datasets, human, C. elegans, DUD-E and DrugBank, show our method outperforms the state-of-the-art methods in terms of AUC, Precision, Recall, AUPR and F1-score. In addition, the case studies further demonstrate that our method can provide effective visualizations to interpret the prediction results from biological insights.
|
53
|
Wang S, Li J, Wang Y, Juan L. A Neighborhood-Based Global Network Model to Predict Drug-Target Interactions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2017-2025. [PMID: 33687846 DOI: 10.1109/tcbb.2021.3064614] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
The detection of drug-target interactions (DTIs) plays an important role in drug discovery and development, making DTI prediction urgent to be solved. Existing computational methods usually utilize drug similarity, target similarity and DTI information to make prediction, providing the convenience of fast time and low cost. However, they usually learn features for drugs and targets separately, lacking of a global consideration. In this study, we proposed a novel neighborhood-based global network model, named as NGN, to accurately predict DTIs from the global perspective. We designed a distance constraint for features of all entities (drugs and targets) in the latent space to ensure the close distance between adjacent entities, and defined a global probability matrix to compute the predicted DTI scores on our constructed neighborhood-based global network. Results showed that NGN obtained advantageous performance compared with other state-of-the-art methods, especially surpassing them by 4.2-9.1 percent on AUPR values in the biggest dataset. Furthermore, several novel high-ranked DTIs were successfully predicted with confirmations by public sources, demonstrating the effectiveness of our method.
|
54
|
Xu X, Xuan P, Zhang T, Chen B, Sheng N. Inferring Drug-Target Interactions Based on Random Walk and Convolutional Neural Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2294-2304. [PMID: 33729947 DOI: 10.1109/tcbb.2021.3066813] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Computational strategies for identifying new drug-target interactions (DTIs) can guide the process of drug discovery, reduce the cost and time of drug development, and thus promote drug development. Most recently proposed methods predict DTIs via integration of heterogeneous data related to drugs and proteins. However, previous methods have failed to deeply integrate these heterogeneous data and learn deep feature representations of multiple original similarities and interactions related to drugs and proteins. We therefore constructed a heterogeneous network by integrating a variety of connection relationships about drugs and proteins, including drugs, proteins, and drug side effects, as well as their similarities, interactions, and associations. A DTI prediction method based on random walk and convolutional neural network was proposed and referred to as DTIPred. DTIPred not only takes advantage of various original features related to drugs and proteins, but also integrates the topological information of heterogeneous networks. The prediction model is composed of two sides and learns the deep feature representation of a drug-protein pair. On the left side, random walk with restart is applied to learn the topological vectors of drug and protein nodes. The topological representation is further learned by the constructed deep learning frame based on convolutional neural network. The right side of the model focuses on integrating multiple original similarities and interactions of drugs and proteins to learn the original representation of the drug-protein pair. The results of cross-validation experiments demonstrate that DTIPred achieves better prediction performance than several state-of-the-art methods. During the validation process, DTIPred can retrieve more actual drug-protein interactions within the top part of the predicted results, which may be more helpful to biologists. 
In addition, case studies on five drugs further demonstrate the ability of DTIPred to discover potential drug-protein interactions.
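The random walk with restart step mentioned in this abstract can be illustrated with a minimal sketch. It is not the DTIPred implementation: the function name, the restart probability of 0.3, and the three-node toy graph are all assumptions chosen for illustration, and the code assumes every node has at least one edge.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker that
    restarts at `seed` with probability `restart` at every step.

    `adj` is a square list-of-lists adjacency matrix (hypothetical input).
    """
    n = len(adj)
    # Column-normalise so trans[i][j] is the probability of moving j -> i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
              for j in range(n)] for i in range(n)]
    start = [0.0] * n
    start[seed] = 1.0
    p = start[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * start[i] for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 (illustrative only).
toy_adj = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
toy_scores = random_walk_with_restart(toy_adj, seed=0)
```

The resulting vector can serve as a topological feature vector for the seed node: nodes closer to the seed in the network receive higher scores.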
|
55
|
Qiu Y, Zhang Y, Deng Y, Liu S, Zhang W. A Comprehensive Review of Computational Methods For Drug-Drug Interaction Detection. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1968-1985. [PMID: 34003753 DOI: 10.1109/tcbb.2021.3081268] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Indexed: 06/12/2023]
Abstract
The detection of drug-drug interactions (DDIs) is a crucial task for drug safety surveillance, which provides effective and safe co-prescriptions of multiple drugs. Since laboratory researches are often complicated, costly and time-consuming, it's urgent to develop computational approaches to detect drug-drug interactions. In this paper, we conduct a comprehensive review of state-of-the-art computational methods falling into three categories: literature-based extraction methods, machine learning-based prediction methods and pharmacovigilance-based data mining methods. Literature-based extraction methods detect DDIs from published literature using natural language processing techniques; machine learning-based prediction methods build prediction models based on the known DDIs in databases and predict novel ones; pharmacovigilance-based data mining methods usually apply statistical techniques on various electronic data to detect drug-drug interaction signals. We first present the taxonomy of drug-drug interaction detection methods and provide the outlines of three categories of methods. Afterwards, we respectively introduce research backgrounds and data sources of three categories, and illustrate their representative approaches as well as evaluation metrics. Finally, we discuss the current challenges of existing methods and highlight potential opportunities for future directions.
|
56
|
Detecting Drug–Target Interactions with Feature Similarity Fusion and Molecular Graphs. Biology 2022; 11:biology11070967. [PMID: 36101348 PMCID: PMC9312204 DOI: 10.3390/biology11070967] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/10/2022] [Revised: 06/12/2022] [Accepted: 06/24/2022] [Indexed: 12/03/2022]
Abstract
Simple Summary: Accurate identification of potential targets for drugs to interact with can accelerate drug development. The identification of drug–target interactions can provide insights into hidden drug efficacy. This paper presents a prediction model based on feature similarity fusion that can identify crucial features of drugs and targets to help predict drug–target interactions.

Abstract: The key to drug discovery is the identification of a target and a corresponding drug compound. Effective identification of drug–target interactions facilitates the development of drug discovery. In this paper, drug similarity and target similarity are considered, and graphical representations are used to extract internal structural information and intermolecular interaction information about drugs and targets. First, drug similarity and target similarity are fused using the similarity network fusion (SNF) method. Then, the graph isomorphic network (GIN) is used to extract the features with information about the internal structure of drug molecules. For target proteins, feature extraction is carried out using TextCNN to efficiently capture the features of target protein sequences. Three different divisions (CVD, CVP, CVT) are used on the standard dataset, and experiments are carried out separately to validate the performance of the model for drug–target interaction prediction. The experimental results show that our method achieves better results on AUC and AUPR. The docking results also show the superiority of the proposed model in predicting drug–target interactions.
|
57
|
Zong N, Li N, Wen A, Ngo V, Yu Y, Huang M, Chowdhury S, Jiang C, Fu S, Weinshilboum R, Jiang G, Hunter L, Liu H. BETA: a comprehensive benchmark for computational drug-target prediction. Brief Bioinform 2022; 23:6596989. [PMID: 35649342 PMCID: PMC9294420 DOI: 10.1093/bib/bbac199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Revised: 04/10/2022] [Accepted: 04/29/2022] [Indexed: 11/14/2022]
Abstract
Internal validation is the most popular evaluation strategy used for drug-target predictive models. The simple random shuffling in the cross-validation, however, is not always ideal to handle large, diverse and copious datasets as it could potentially introduce bias. Hence, these predictive models cannot be comprehensively evaluated to provide insight into their general performance on a variety of use-cases (e.g. permutations of different levels of connectiveness and categories in drug and target space, as well as validations based on different data sources). In this work, we introduce a benchmark, BETA, that aims to address this gap by (i) providing an extensive multipartite network consisting of 0.97 million biomedical concepts and 8.5 million associations, in addition to 62 million drug-drug and protein-protein similarities and (ii) presenting evaluation strategies that reflect seven cases (i.e. general, screening with different connectivity, target and drug screening based on categories, searching for specific drugs and targets and drug repurposing for specific diseases), a total of seven Tests (consisting of 344 Tasks in total) across multiple sampling and validation strategies. Six state-of-the-art methods covering two broad input data types (chemical structure- and gene sequence-based and network-based) were tested across all the developed Tasks. The best-worst performing cases have been analyzed to demonstrate the ability of the proposed benchmark to identify limitations of the tested methods for running over the benchmark tasks. The results highlight BETA as a benchmark in the selection of computational strategies for drug repurposing and target discovery.
Affiliation(s)
- Nansu Zong: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
- Ning Li: Center for Structure Biology, Center for Cancer Research, National Cancer Institute, Frederick, MD
- Andrew Wen: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
- Victoria Ngo: Betty Irene Moore School of Nursing, University of California Davis Health, Sacramento, CA; Stanford Health Policy, Stanford School of Medicine and Freeman Spogli Institute for International Studies, Palo Alto, CA
- Yue Yu: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
- Ming Huang: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
- Shaika Chowdhury: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
- Chao Jiang: Department of Computer Science and Software Engineering, Auburn University, Auburn, AL
- Sunyang Fu: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
- Richard Weinshilboum: Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN
- Guoqian Jiang: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
- Lawrence Hunter: Department of Pharmacology, University of Colorado Denver, Aurora, CO
- Hongfang Liu: Department of Artificial Intelligence and Informatics Research, Mayo Clinic, Rochester, MN
|
58
|
Xuan P, Zhang X, Zhang Y, Hu K, Nakaguchi T, Zhang T. Multi-type neighbors enhanced global topology and pairwise attribute learning for drug-protein interaction prediction. Brief Bioinform 2022; 23:6581435. [PMID: 35514190 DOI: 10.1093/bib/bbac120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/03/2022] [Revised: 03/07/2022] [Accepted: 03/15/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION Accurate identification of proteins interacted with drugs helps reduce the time and cost of drug development. Most of previous methods focused on integrating multisource data about drugs and proteins for predicting drug-target interactions (DTIs). There are both similarity connection and interaction connection between two drugs, and these connections reflect their relationships from different perspectives. Similarly, two proteins have various connections from multiple perspectives. However, most of previous methods failed to deeply integrate these connections. In addition, multiple drug-protein heterogeneous networks can be constructed based on multiple kinds of connections. The diverse topological structures of these networks are still not exploited completely. RESULTS We propose a novel model to extract and integrate multi-type neighbor topology information, diverse similarities and interactions related to drugs and proteins. Firstly, multiple drug-protein heterogeneous networks are constructed according to multiple kinds of connections among drugs and those among proteins. The multi-type neighbor node sequences of a drug node (or a protein node) are formed by random walks on each network and they reflect the hidden neighbor topological structure of the node. Secondly, a module based on graph neural network (GNN) is proposed to learn the multi-type neighbor topologies of each node. We propose attention mechanisms at neighbor node level and at neighbor type level to learn more informative neighbor nodes and neighbor types. A network-level attention is also designed to enhance the context dependency among multiple neighbor topologies of a pair of drug and protein nodes. Finally, the attribute embedding of the drug-protein pair is formulated by a proposed embedding strategy, and the embedding covers the similarities and interactions about the pair. 
A module based on three-dimensional convolutional neural networks (CNN) is constructed to deeply integrate pairwise attributes. Extensive experiments have been performed and the results indicate GCDTI outperforms several state-of-the-art prediction methods. The recall rate estimation over the top-ranked candidates and case studies on 5 drugs further demonstrate GCDTI's ability in discovering potential drug-protein interactions.
Affiliation(s)
- Ping Xuan: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; School of Computer Science, Shaanxi Normal University, Xi'an 710062, China
- Xiaowen Zhang: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Yu Zhang: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Kaimiao Hu: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Toshiya Nakaguchi: Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang: School of Mathematical Science, Heilongjiang University, Harbin 150080, China
|
59
|
Vo TH, Nguyen NTK, Kha QH, Le NQK. On the road to explainable AI in drug-drug interactions prediction: A systematic review. Comput Struct Biotechnol J 2022; 20:2112-2123. [PMID: 35832629 PMCID: PMC9092071 DOI: 10.1016/j.csbj.2022.04.021] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 03/12/2022] [Revised: 04/15/2022] [Accepted: 04/15/2022] [Indexed: 12/26/2022]
Abstract
Over the past decade, polypharmacy instances have been common in multi-diseases treatment. However, unwanted drug-drug interactions (DDIs) that might cause unexpected adverse drug events (ADEs) in multiple regimens therapy remain a significant issue. Since artificial intelligence (AI) is ubiquitous today, many AI prediction models have been developed to predict DDIs to support clinicians in pharmacotherapy-related decisions. However, even though DDI prediction models have great potential for assisting physicians in polypharmacy decisions, there are still concerns regarding the reliability of AI models due to their black-box nature. Building AI models with explainable mechanisms can augment their transparency to address the above issue. Explainable AI (XAI) promotes safety and clarity by showing how decisions are made in AI models, especially in critical tasks like DDI predictions. In this review, a comprehensive overview of AI-based DDI prediction, including the publicly available source for AI-DDIs studies, the methods used in data manipulation and feature preprocessing, the XAI mechanisms to promote trust of AI, especially for critical tasks as DDIs prediction, the modeling methods, is provided. Limitations and the future directions of XAI in DDIs are also discussed.
Affiliation(s)
- Thanh Hoa Vo: Master Program in Clinical Genomics and Proteomics, College of Pharmacy, Taipei Medical University, Taipei 110, Taiwan
- Ngan Thi Kim Nguyen: School of Nutrition and Health Sciences, College of Nutrition, Taipei Medical University, Taipei 11031, Taiwan
- Quang Hien Kha: International Master/Ph.D. Program in Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan
- Nguyen Quoc Khanh Le: Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 106, Taiwan; Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei 106, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei 110, Taiwan
|
60
|
Amiri Souri E, Laddach R, Karagiannis SN, Papageorgiou LG, Tsoka S. Novel drug-target interactions via link prediction and network embedding. BMC Bioinformatics 2022; 23:121. [PMID: 35379165 PMCID: PMC8978405 DOI: 10.1186/s12859-022-04650-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2021] [Accepted: 03/17/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND As many interactions between the chemical and genomic space remain undiscovered, computational methods able to identify potential drug-target interactions (DTIs) are employed to accelerate drug discovery and reduce the required cost. Predicting new DTIs can leverage drug repurposing by identifying new targets for approved drugs. However, developing an accurate computational framework that can efficiently incorporate chemical and genomic spaces remains extremely demanding. A key issue is that most DTI predictions suffer from the lack of experimentally validated negative interactions or limited availability of target 3D structures. RESULTS We report DT2Vec, a pipeline for DTI prediction based on graph embedding and gradient boosted tree classification. It maps drug-drug and protein-protein similarity networks to low-dimensional features and the DTI prediction is formulated as binary classification based on a strategy of concatenating the drug and target embedding vectors as input features. DT2Vec was compared with three top-performing graph similarity-based algorithms on a standard benchmark dataset and achieved competitive results. In order to explore credible novel DTIs, the model was applied to data from the ChEMBL repository that contain experimentally validated positive and negative interactions which yield a strong predictive model. Then, the developed model was applied to all possible unknown DTIs to predict new interactions. The applicability of DT2Vec as an effective method for drug repurposing is discussed through case studies and evaluation of some novel DTI predictions is undertaken using molecular docking. CONCLUSIONS The proposed method was able to integrate and map chemical and genomic space into low-dimensional dense vectors and showed promising results in predicting novel DTIs.
Affiliation(s)
- E Amiri Souri: Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London, WC2B 4BG, UK
- R Laddach: Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London, WC2B 4BG, UK; St. John's Institute of Dermatology, School of Basic and Medical Biosciences, King's College London, Guy's Hospital, London, SE1 9RT, UK
- S N Karagiannis: St. John's Institute of Dermatology, School of Basic and Medical Biosciences, King's College London, Guy's Hospital, London, SE1 9RT, UK; Breast Cancer Now Research Unit, School of Cancer and Pharmaceutical Sciences, King's College London, Guy's Cancer Centre, London, SE1 9RT, UK
- L G Papageorgiou: Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Torrington Place, London, WC1E 7JE, UK
- S Tsoka: Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, London, WC2B 4BG, UK
|
61
|
Affinity2Vec: drug-target binding affinity prediction through representation learning, graph mining, and machine learning. Sci Rep 2022; 12:4751. [PMID: 35306525 PMCID: PMC8934358 DOI: 10.1038/s41598-022-08787-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 06/22/2021] [Accepted: 03/08/2022] [Indexed: 11/21/2022]
Abstract
Drug-target interaction (DTI) prediction plays a crucial role in drug repositioning and virtual drug screening. Most DTI prediction methods cast the problem as a binary classification task to predict if interactions exist or as a regression task to predict continuous values that indicate a drug's ability to bind to a specific target. The regression-based methods provide insight beyond the binary relationship. However, most of these methods require the three-dimensional (3D) structural information of targets which are still not generally available to the targets. Despite this bottleneck, only a few methods address the drug-target binding affinity (DTBA) problem from a non-structure-based approach to avoid the 3D structure limitations. Here we propose Affinity2Vec, as a novel regression-based method that formulates the entire task as a graph-based problem. To develop this method, we constructed a weighted heterogeneous graph that integrates data from several sources, including drug-drug similarity, target-target similarity, and drug-target binding affinities. Affinity2Vec further combines several computational techniques from feature representation learning, graph mining, and machine learning to generate or extract features, build the model, and predict the binding affinity between the drug and the target with no 3D structural data. We conducted extensive experiments to evaluate and demonstrate the robustness and efficiency of the proposed method on benchmark datasets used in state-of-the-art non-structured-based drug-target binding affinity studies. Affinity2Vec showed superior and competitive results compared to the state-of-the-art methods based on several evaluation metrics, including mean squared error, rm2, concordance index, and area under the precision-recall curve.
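One of the regression metrics named in this abstract, the concordance index, can be sketched as follows. This is a generic pairwise formulation with prediction ties counted as half-correct, which is an assumption here and not necessarily the exact variant used by the Affinity2Vec authors. The function name and the sample values are illustrative.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the
    true order. Pairs with equal true values are skipped; pairs with
    equal predictions count as 0.5 (an assumed tie convention)."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not comparable
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical binding affinities and predictions.
ci_perfect = concordance_index([5.0, 6.2, 7.1], [0.1, 0.2, 0.3])
ci_reversed = concordance_index([5.0, 6.2, 7.1], [0.3, 0.2, 0.1])
```

A value of 1.0 means every pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 means every pair is inverted.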
|
62
|
Li J, Wang J, Lv H, Zhang Z, Wang Z. IMCHGAN: Inductive Matrix Completion With Heterogeneous Graph Attention Networks for Drug-Target Interactions Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:655-665. [PMID: 34115592 DOI: 10.1109/tcbb.2021.3088614] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 06/12/2023]
Abstract
Identification of targets among known drugs plays an important role in drug repurposing and discovery. Computational approaches for prediction of drug-target interactions (DTIs) are highly desired in comparison to traditional biological experiments because they are fast and inexpensive. Moreover, recent advances of systems biology approaches have generated large-scale heterogeneous, biological information networks data, which offer opportunities for machine learning-based identification of DTIs. We present a novel Inductive Matrix Completion with Heterogeneous Graph Attention Network approach (IMCHGAN) for predicting DTIs. IMCHGAN first adopts a two-level neural attention mechanism approach to learn drug and target latent feature representations from the DTI heterogeneous network respectively. Then, the learned latent features are fed into the Inductive Matrix Completion (IMC) prediction score model which computes the best projection from drug space onto target space and outputs the DTI score via the inner product of projected drug and target feature representations. IMCHGAN is an end-to-end neural network learning framework where the parameters of both the prediction score model and the feature representation learning model are simultaneously optimized via backpropagation under the supervision of the observed known drug-target interactions data. We compare IMCHGAN with other state-of-the-art baselines on two real DTI experimental datasets. The results show that our method is superior to existing methods in terms of AUC and AUPR. Moreover, IMCHGAN also shows it has strong predictive power for novel (unknown) DTIs. All datasets and code can be obtained from https://github.com/ljatynu/IMCHGAN/.
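The scoring step described in this abstract (project the drug representation into target space, then take an inner product with the target representation) can be sketched in a few lines. This is only an illustration of the general inductive matrix completion score, not code from the IMCHGAN repository: the function name, the dimensions, and the numeric values are assumptions, and in the actual method the projection and the features are learned.

```python
def imc_score(drug_vec, target_vec, projection):
    """Score a drug-target pair as <P @ drug_vec, target_vec>.

    `projection` is a (target_dim x drug_dim) matrix given as a list of
    rows; here it is fixed by hand purely for illustration.
    """
    projected = [sum(w * x for w, x in zip(row, drug_vec))
                 for row in projection]
    return sum(p * t for p, t in zip(projected, target_vec))


# Hypothetical 2-d drug features, 3-d target features.
example_drug = [1.0, 2.0]
example_target = [1.0, 1.0, 1.0]
example_projection = [[1.0, 0.0],
                      [0.0, 1.0],
                      [1.0, 1.0]]
example_score = imc_score(example_drug, example_target, example_projection)
```

Because the score depends only on feature vectors and the projection, it can be evaluated for drugs or targets that were not in the training interaction matrix, which is what makes the completion "inductive".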
|
63
|
Drug-target interaction prediction via an ensemble of weighted nearest neighbors with interaction recovery. Appl Intell 2022. [DOI: 10.1007/s10489-021-02495-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
|
64
|
Staszak M, Staszak K, Wieszczycka K, Bajek A, Roszkowski K, Tylkowski B. Machine learning in drug design: Use of artificial intelligence to explore the chemical structure–biological activity relationship. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1568] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Affiliation(s)
- Maciej Staszak: Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
- Katarzyna Staszak: Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
- Karolina Wieszczycka: Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
- Anna Bajek: Department of Tissue Engineering, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
- Krzysztof Roszkowski: Department of Oncology, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
- Bartosz Tylkowski: Department of Chemical Engineering, University Rovira i Virgili, Tarragona, Spain; Eurecat, Centre Tecnològic de Catalunya, Chemical Technologies Unit, Tarragona, Spain
|
65
|
Hu K, Cui H, Zhang T, Sun C, Xuan P. ALDPI: adaptively learning importance of multi-scale topologies and multi-modality similarities for drug-protein interaction prediction. Brief Bioinform 2022; 23:6519792. [PMID: 35108362 DOI: 10.1093/bib/bbab606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Revised: 12/20/2021] [Accepted: 12/28/2021] [Indexed: 11/12/2022]
Abstract
MOTIVATION Effective computational methods to predict drug-protein interactions (DPIs) are vital for drug discovery in reducing the time and cost of drug development. Recent DPI prediction methods mainly exploit graph data composed of multiple kinds of connections among drugs and proteins. Each node in the graph usually has topological structures with multiple scales formed by its first-order neighbors and multi-order neighbors. However, most of the previous methods do not consider the topological structures of multi-order neighbors. In addition, deep integration of the multi-modality similarities of drugs and proteins is also a challenging task. RESULTS We propose a model called ALDPI to adaptively learn the multi-scale topologies and multi-modality similarities with various significance levels. We first construct a drug-protein heterogeneous graph, which is composed of the interactions and the similarities with multiple modalities among drugs and proteins. An adaptive graph learning module is then designed to learn important kinds of connections in heterogeneous graph and generate new topology graphs. A module based on graph convolutional autoencoders is established to learn multiple representations, which imply the node attributes and multiple-scale topologies composed of one-order and multi-order neighbors, respectively. We also design an attention mechanism at neighbor topology level to distinguish the importance of these representations. Finally, since each similarity modality has its specific features, we construct a multi-layer convolutional neural network-based module to learn and fuse multi-modality features to obtain the attribute representation of each drug-protein node pair. Comprehensive experimental results show ALDPI's superior performance over six state-of-the-art methods. 
The results of recall rates of top-ranked candidates and case studies on five drugs further demonstrate the ability of ALDPI to discover potential drug-related protein candidates. CONTACT zhang@hlju.edu.cn.
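The attention step described in the abstract above (weighting several representations of the same node by importance) can be pictured with a minimal sketch. The dot-product scoring, the `query` vector, and the function names are assumptions made for illustration only; the abstract does not specify how ALDPI computes its attention scores.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(representations, query):
    """Fuse several representations of one node into a single vector.

    Each representation is scored by a dot product with `query`
    (a hypothetical learned vector), the scores are normalised with
    softmax, and the fused vector is the weighted sum.
    """
    scores = [sum(r * q for r, q in zip(rep, query)) for rep in representations]
    weights = softmax(scores)
    fused = [
        sum(w * rep[d] for w, rep in zip(weights, representations))
        for d in range(len(query))
    ]
    return weights, fused

# Toy input: three representations of one node (e.g. one per neighbor order).
toy_representations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 1.0]
toy_weights, toy_fused = attention_fuse(toy_representations, toy_query)
```

In this toy input the third representation has the largest score, so it receives the largest weight; the first two receive equal weights.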
Affiliation(s)
- Kaimiao Hu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Chang Sun
  - College of Computer Science, Nankai University, Tianjin 300071, China
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
|
66
|
A novel graph mining approach to predict and evaluate food-drug interactions. Sci Rep 2022; 12:1061. [PMID: 35058561 PMCID: PMC8776972 DOI: 10.1038/s41598-022-05132-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/09/2021] [Accepted: 01/05/2022] [Indexed: 12/26/2022]
Abstract
Food-drug interactions (FDIs) arise when nutritional dietary consumption regulates biochemical mechanisms involved in drug metabolism. This study proposes FDMine, a novel systematic framework that models the FDI problem as a homogenous graph. Our dataset consists of 788 unique approved small molecule drugs with metabolism-related drug-drug interactions and 320 unique food items, composed of 563 unique compounds. The potential number of interactions is 87,192 and 92,143 for disjoint and joint versions of the graph. We defined several similarity subnetworks comprising food-drug similarity, drug-drug similarity, and food-food similarity networks. A unique part of the graph involves encoding the food composition as a set of nodes and calculating a content contribution score. To predict new FDIs, we considered several link prediction algorithms and various performance metrics, including the precision@top (top 1%, 2%, and 5%) of the newly predicted links. The shortest path-based method has achieved a precision of 84%, 60% and 40% for the top 1%, 2% and 5% of FDIs identified, respectively. We validated the top FDIs predicted using FDMine to demonstrate its applicability, and we relate therapeutic anti-inflammatory effects of food items informed by FDIs. FDMine is publicly available to support clinicians and researchers.
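The precision@top metric mentioned in the abstract above can be illustrated with a short sketch: rank the predicted links by score, keep the top p percent, and count how many of them are validated. The function name, the ceiling rule for the cut-off, and the toy food and drug identifiers are assumptions for illustration; they are not taken from FDMine.

```python
def precision_at_top(scored_links, validated_links, percent):
    """Precision among the top `percent` % of links, ranked by score.

    scored_links: list of (link, score) pairs.
    validated_links: set of links considered true interactions.
    percent: integer in 1..100; the cut-off is rounded up to at least 1 link.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    if not scored_links:
        raise ValueError("scored_links must not be empty")
    ranked = sorted(scored_links, key=lambda pair: pair[1], reverse=True)
    k = max(1, -(-len(ranked) * percent // 100))  # ceiling division
    hits = sum(1 for link, _score in ranked[:k] if link in validated_links)
    return hits / k

# Hypothetical toy predictions: ten food-drug links with scores.
toy_scores = [
    (("food_1", "drug_A"), 0.97),
    (("food_2", "drug_A"), 0.91),
    (("food_1", "drug_B"), 0.80),
    (("food_3", "drug_C"), 0.74),
    (("food_2", "drug_B"), 0.60),
    (("food_4", "drug_A"), 0.55),
    (("food_3", "drug_A"), 0.41),
    (("food_4", "drug_C"), 0.33),
    (("food_1", "drug_C"), 0.20),
    (("food_2", "drug_C"), 0.05),
]
toy_validated = {("food_1", "drug_A"), ("food_1", "drug_B"), ("food_4", "drug_A")}

top20 = precision_at_top(toy_scores, toy_validated, 20)  # 1 hit in the top 2
top50 = precision_at_top(toy_scores, toy_validated, 50)  # 2 hits in the top 5
```

Precision typically falls as the cut-off widens, which matches the pattern reported in the abstract (84%, 60% and 40% at the top 1%, 2% and 5%).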
|
67
|
Yang Z, Zhong W, Zhao L, Yu-Chian Chen C. MGraphDTA: deep multiscale graph neural network for explainable drug-target binding affinity prediction. Chem Sci 2022; 13:816-833. [PMID: 35173947 PMCID: PMC8768884 DOI: 10.1039/d1sc05180f] [Citation(s) in RCA: 128] [Impact Index Per Article: 42.7] [Received: 09/18/2021] [Accepted: 12/17/2021] [Indexed: 12/22/2022]
Abstract
Predicting drug-target affinity (DTA) is beneficial for accelerating drug discovery. Graph neural networks (GNNs) have been widely used in DTA prediction. However, existing shallow GNNs are insufficient to capture the global structure of compounds. Besides, the interpretability of graph-based DTA models relies heavily on the graph attention mechanism, which cannot reveal the global relationship between each atom of a molecule. In this study, we proposed a deep multiscale graph neural network based on chemical intuition for DTA prediction (MGraphDTA). We introduced a dense connection into the GNN and built a super-deep GNN with 27 graph convolutional layers to capture the local and global structure of the compound simultaneously. We also developed a novel visual explanation method, gradient-weighted affinity activation mapping (Grad-AAM), to analyze a deep learning model from the chemical perspective. We evaluated our approach using seven benchmark datasets and compared the proposed method to the state-of-the-art deep learning (DL) models. MGraphDTA significantly outperforms other DL-based approaches on various datasets. Moreover, we show that Grad-AAM creates explanations that are consistent with pharmacologists' understanding, which may help us gain chemical insights directly from data beyond human perception. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of DTA prediction modeling.
Affiliation(s)
- Ziduo Yang
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Weihe Zhong
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Lu Zhao
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
  - Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University Guangzhou 510655 China
- Calvin Yu-Chian Chen
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
  - Department of Medical Research, China Medical University Hospital Taichung 40447 Taiwan
  - Department of Bioinformatics and Medical Engineering, Asia University Taichung 41354 Taiwan
|
68
|
Zhang P, Wei Z, Che C, Jin B. DeepMGT-DTI: Transformer network incorporating multilayer graph information for Drug-Target interaction prediction. Comput Biol Med 2022; 142:105214. [PMID: 35030496 DOI: 10.1016/j.compbiomed.2022.105214] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 09/01/2021] [Revised: 12/26/2021] [Accepted: 01/02/2022] [Indexed: 12/29/2022]
Abstract
Drug-target interaction (DTI) prediction reduces the cost and time of drug development, and plays a vital role in drug discovery. However, most research does not fully explore the molecular structures of drug compounds in DTI prediction. To this end, we propose a deep learning model to capture the molecular structure information of drug compounds for DTI prediction. This model utilizes a transformer network incorporating multilayer graph information, which captures the features of a drug's molecular structure so that the interactions between atoms of drug compounds can be explored more deeply. At the same time, a convolutional neural network is employed to capture the local residue information in the target sequence and effectively extract the feature information of the target. The experiments on the DrugBank dataset showed that the proposed model outperformed previous models based on the structure of target sequences. The results indicate that the improved transformer network fuses the feature information between layers in the graph convolutional neural network and extracts the interaction data for the molecular structure. The drug repositioning experiments on COVID-19 and Alzheimer's disease demonstrated the proposed model's ability to find therapeutic drugs in drug discovery. The code of our model is available at https://github.com/zhangpl109/DeepMGT-DTI.
Affiliation(s)
- Peiliang Zhang
  - Key Laboratory of Advanced Design and Intelligent Computing (Dalian University), Ministry of Education, Dalian, 116622, China
- Ziqi Wei
  - School of Software, Tsinghua University, Beijing, 100084, China
- Chao Che
  - Key Laboratory of Advanced Design and Intelligent Computing (Dalian University), Ministry of Education, Dalian, 116622, China
- Bo Jin
  - School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian, 116024, China
|
69
|
Wang S, Li J, Wang Y. M2PP: a novel computational model for predicting drug-targeted pathogenic proteins. BMC Bioinformatics 2022; 23:7. [PMID: 34983358 PMCID: PMC8728953 DOI: 10.1186/s12859-021-04522-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Accepted: 12/07/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Detecting pathogenic proteins is a fundamental way to understand disease mechanisms and resist the invasion of diseases, which makes pathogenic protein prediction an urgent problem. Genome-wide protein prediction is not necessarily conducive to curing diseases rapidly, because developing new drugs specifically for a predicted pathogenic protein always requires major expenditures of time and cost. To facilitate disease treatment, computational methods that predict pathogenic proteins targeted by existing drugs should be developed. RESULTS In this study, we proposed a novel computational model, named M2PP, to predict drug-targeted pathogenic proteins. Three types of features were derived from our constructed heterogeneous network (comprising target proteins, diseases and drugs), based on neighborhood similarity information, drug-inferred information and path information. A random forest regression model was then trained to score unconfirmed target-disease pairs. Five-fold cross-validation was used to evaluate the model's prediction performance, and M2PP achieved favorable results compared with other state-of-the-art methods. In addition, M2PP accurately predicted highly ranked pathogenic proteins for common diseases, with public biomedical literature as supporting evidence, indicating its predictive ability. CONCLUSIONS M2PP is an effective and accurate model for predicting drug-targeted pathogenic proteins, which could benefit future biological research.
Affiliation(s)
- Shiming Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
- Jie Li
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
- Yadong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
|
70
|
Song T, Wang G, Ding M, Rodriguez-Paton A, Wang X, Wang S. Network-Based Approaches for Drug Repositioning. Mol Inform 2021; 41:e2100200. [PMID: 34970871 DOI: 10.1002/minf.202100200] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/09/2021] [Accepted: 12/05/2021] [Indexed: 12/25/2022]
Abstract
With deep learning creeping up into the ranks of big data, new models based on deep learning and massive data have made great leaps forward rapidly in the field of drug repositioning. However, there is no relevant review to summarize the transformations and development process of models and their data in the field of drug repositioning. Among all the computational methods, network-based methods play an extraordinary role. In view of these circumstances, understanding and comparing existing network-based computational methods applied in drug repositioning will help us recognize the cutting-edge technologies and offer valuable information for relevant researchers. Therefore, in this review, we present an interpretation of the series of important network-based methods applied in drug repositioning, together with their comparisons and development process.
Affiliation(s)
- Tao Song
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
  - Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Campus de Montegancedo, Boadilla del Monte, 28660, Madrid, Spain
- Gan Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Mao Ding
  - Department of Neurology Medicine, The Second Hospital, Cheeloo College of Medicine, Shandong University, Ji Nan Shi, Jinan, 250033, China
- Alfonso Rodriguez-Paton
  - Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Campus de Montegancedo, Boadilla del Monte, 28660, Madrid, Spain
- Xun Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
  - China High Performance Computer Research Center, Institute of Computer Technology, Chinese Academy of Science, Beijing, 100190, Beijing, China
- Shudong Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
|
71
|
Sorkhi AG, Abbasi Z, Mobarakeh MI, Pirgazi J. Drug-target interaction prediction using unifying of graph regularized nuclear norm with bilinear factorization. BMC Bioinformatics 2021; 22:555. [PMID: 34789169 PMCID: PMC8597250 DOI: 10.1186/s12859-021-04464-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/26/2020] [Accepted: 10/29/2021] [Indexed: 12/27/2022]
Abstract
BACKGROUND Wet-lab experiments for identification of interactions between drugs and target proteins are time-consuming, costly and labor-intensive. The use of computational prediction of drug-target interactions (DTIs), which is one of the significant points in drug discovery, has been considered by many researchers in recent years. It also reduces the search space of interactions by proposing potential interaction candidates. RESULTS In this paper, a new approach based on unifying matrix factorization and nuclear norm minimization is proposed to find a low-rank interaction. In this combined method, to solve the low-rank matrix approximation, the terms in the DTI problem are used in such a way that the nuclear norm regularized problem is optimized by a bilinear factorization based on Rank-Restricted Soft Singular Value Decomposition (RRSSVD). In the proposed method, adjacencies between drugs and targets are encoded by graphs. Drug-target interaction, drug-drug similarity, target-target, and combination of similarities have also been used as input. CONCLUSIONS The proposed method is evaluated on four benchmark datasets known as Enzymes (E), Ion channels (ICs), G protein-coupled receptors (GPCRs) and nuclear receptors (NRs) based on AUC, AUPR, and time measure. The results show an improvement in the performance of the proposed method compared to the state-of-the-art techniques.
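As background for the link between nuclear norm regularization and bilinear factorization mentioned in the abstract above, two standard identities are worth recalling. They are general facts about the nuclear norm, not the paper's exact objective: the abstract does not state the full formulation, and the graph regularization terms are omitted here. The symbols X, U, V, Y, P, Q and λ are notation chosen for this note.

```latex
% Variational form: the nuclear norm of X equals the smallest
% combined Frobenius penalty over all bilinear factorizations of X.
\[
  \|X\|_{*} \;=\; \min_{U,\,V \,:\, X = U V^{\top}}
  \tfrac{1}{2}\left( \|U\|_F^{2} + \|V\|_F^{2} \right)
\]

% Soft-thresholded SVD: with Y = P \,\mathrm{diag}(\sigma_i)\, Q^{\top},
% shrinking each singular value by \lambda solves the
% nuclear-norm-regularized approximation problem.
\[
  \mathcal{S}_{\lambda}(Y)
  \;=\; P \,\mathrm{diag}\big( (\sigma_i - \lambda)_{+} \big)\, Q^{\top}
  \;=\; \arg\min_{Z} \; \tfrac{1}{2}\|Y - Z\|_F^{2} + \lambda \|Z\|_{*}
\]
```

The first identity is what allows a nuclear norm penalty on the interaction matrix to be replaced by Frobenius penalties on two low-rank factors; the second is the shrinkage step that soft-SVD-style solvers apply to the singular values.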
Affiliation(s)
- Ali Ghanbari Sorkhi
  - Faculty of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, P.O. Box, 48518-78195 Behshahr, Iran
- Zahra Abbasi
  - School of Medicine, Faculty of Medical Biotechnology, Shahroud University of Medical Sciences, Shahroud, Iran
- Jamshid Pirgazi
  - Faculty of Electrical and Computer Engineering, University of Science and Technology of Mazandaran, P.O. Box, 48518-78195 Behshahr, Iran
|
72
|
Gaudelet T, Day B, Jamasb AR, Soman J, Regep C, Liu G, Hayter JBR, Vickers R, Roberts C, Tang J, Roblin D, Blundell TL, Bronstein MM, Taylor-King JP. Utilizing graph machine learning within drug discovery and development. Brief Bioinform 2021; 22:bbab159. [PMID: 34013350 PMCID: PMC8574649 DOI: 10.1093/bib/bbab159] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 12/09/2020] [Revised: 04/01/2021] [Accepted: 04/05/2021] [Indexed: 12/15/2022]
Abstract
Graph machine learning (GML) is receiving growing interest within the pharmaceutical and biotechnology industries for its ability to model biomolecular structures, the functional relationships between them, and integrate multi-omic datasets - amongst other data types. Herein, we present a multidisciplinary academic-industrial review of the topic within the context of drug discovery and development. After introducing key terms and modelling approaches, we move chronologically through the drug development pipeline to identify and summarize work incorporating: target identification, design of small molecules and biologics, and drug repurposing. Whilst the field is still emerging, key milestones including repurposed drugs entering in vivo studies, suggest GML will become a modelling framework of choice within biomedical machine learning.
Affiliation(s)
- Ben Day
  - Relation Therapeutics, London, UK
  - The Computer Laboratory, University of Cambridge, UK
- Arian R Jamasb
  - Relation Therapeutics, London, UK
  - The Computer Laboratory, University of Cambridge, UK
  - Department of Biochemistry, University of Cambridge, UK
- Jian Tang
  - Mila, the Quebec AI Institute, Canada
  - HEC Montreal, Canada
- David Roblin
  - Relation Therapeutics, London, UK
  - Juvenescence, London, UK
  - The Francis Crick Institute, London, UK
- Michael M Bronstein
  - Relation Therapeutics, London, UK
  - Department of Computing, Imperial College London, UK
  - Twitter, UK
|
73
|
Jung YS, Kim Y, Cho YR. Comparative analysis of network-based approaches and machine learning algorithms for predicting drug-target interactions. Methods 2021; 198:19-31. [PMID: 34737033 DOI: 10.1016/j.ymeth.2021.10.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/31/2021] [Revised: 10/21/2021] [Accepted: 10/22/2021] [Indexed: 01/06/2023]
Abstract
Computational prediction of drug-target interactions (DTIs) is of particular importance in the process of drug repositioning because of its efficiency in selecting potential candidates for DTIs. A variety of computational methods for predicting DTIs have been proposed over the past decade. Our interest is which methods or techniques are the most advantageous for increasing prediction accuracy. This article provides a comprehensive overview of network-based, machine learning, and integrated DTI prediction methods. The network-based methods handle a DTI network along with drug and target similarities in a matrix form and apply graph-theoretic algorithms to identify new DTIs. Machine learning methods use known DTIs and the features of drugs and target proteins as training data to build a predictive model. Integrated methods combine these two techniques. We assessed the prediction performance of the selected state-of-the-art methods using two different benchmark datasets. Our experimental results demonstrate that the integrated methods outperform the others in general. Some previous methods showed low accuracy on predicting interactions of unknown drugs which do not exist in the training dataset. Combining similarity matrices from multiple features by data fusion was not beneficial in increasing prediction accuracy. Finally, we analyzed future directions for further improvements in DTI predictions.
Affiliation(s)
- Yi-Sue Jung
  - Division of Software, Yonsei University - Mirae Campus, Republic of Korea
- Yoonbee Kim
  - Division of Software, Yonsei University - Mirae Campus, Republic of Korea
- Young-Rae Cho
  - Division of Software, Yonsei University - Mirae Campus, Republic of Korea
  - Division of Digital Healthcare, Yonsei University - Mirae Campus, Republic of Korea
|
74
|
Xuan P, Chen B, Zhang T, Yang Y. Prediction of Drug-Target Interactions Based on Network Representation Learning and Ensemble Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2671-2681. [PMID: 32340959 DOI: 10.1109/tcbb.2020.2989765] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 06/11/2023]
Abstract
Identifying interactions between drugs and target proteins is a critical step in the drug development process, as it helps identify new targets for drugs and accelerate drug development. The number of known drug-protein interactions (positive samples) is much lower than that of the unknown ones (negative samples), which creates a class imbalance. Most previous methods utilised only part of the negative samples to train the prediction model, so most of the information in the negative samples was neglected. Therefore, a new method is needed to predict candidate drug-related proteins and fully utilise negative samples to improve prediction performance. We present a method based on non-negative matrix factorisation and gradient boosting decision trees (GBDT), named NGDTP, to identify candidate drug-protein interactions. NGDTP integrates multiple kinds of protein similarities, drug-protein interactions, and multiple kinds of drug similarities at different levels, including target proteins of drugs, drug-related diseases, and side effects of drugs. We propose a network representation learning method based on matrix factorisation to learn low-dimensional vector representations of drug and protein nodes. On the basis of these low-dimensional node representations, a GBDT-based prediction model is constructed that obtains the association score for a drug-protein pair by building multiple decision trees. NGDTP is an ensemble learning model that fully utilises all the negative samples to effectively alleviate the problem of class imbalance. NGDTP achieves superior prediction performance compared with several state-of-the-art methods. The experimental results indicate that NGDTP also retrieves more actual drug-protein interactions in the top part of the prediction results, which is of particular interest to biologists. In addition, case studies on 10 drugs further confirmed the ability of NGDTP to identify potential candidate proteins for drugs.
|
75
|
Xuan P, Fan M, Cui H, Zhang T, Nakaguchi T. GVDTI: graph convolutional and variational autoencoders with attribute-level attention for drug-protein interaction prediction. Brief Bioinform 2021; 23:6412398. [PMID: 34718408 DOI: 10.1093/bib/bbab453] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/01/2021] [Revised: 09/14/2021] [Accepted: 10/02/2021] [Indexed: 11/12/2022]
Abstract
MOTIVATION Identifying proteins that interact with drugs plays an important role in the initial period of developing drugs, which helps to reduce the development cost and time. Recent methods for predicting drug-protein interactions mainly focus on exploiting various data about drugs and proteins. These methods failed to completely learn and integrate the attribute information of a pair of drug and protein nodes and their attribute distribution. RESULTS We present a new prediction method, GVDTI, to encode multiple pairwise representations, including attention-enhanced topological representation, attribute representation and attribute distribution. First, a framework based on graph convolutional autoencoder is constructed to learn attention-enhanced topological embedding that integrates the topology structure of a drug-protein network for each drug and protein nodes. The topological embeddings of each drug and each protein are then combined and fused by multi-layer convolution neural networks to obtain the pairwise topological representation, which reveals the hidden topological relationships between drug and protein nodes. The proposed attribute-wise attention mechanism learns and adjusts the importance of individual attribute in each topological embedding of drug and protein nodes. Secondly, a tri-layer heterogeneous network composed of drug, protein and disease nodes is created to associate the similarities, interactions and associations across the heterogeneous nodes. The attribute distribution of the drug-protein node pair is encoded by a variational autoencoder. The pairwise attribute representation is learned via a multi-layer convolutional neural network to deeply integrate the attributes of drug and protein nodes. Finally, the three pairwise representations are fused by convolutional and fully connected neural networks for drug-protein interaction prediction. 
The experimental results show that GVDTI outperformed seven other state-of-the-art methods. The improved recall rates indicate that GVDTI retrieved more actual drug-protein interactions among the top-ranked candidates than conventional methods. Case studies on five drugs further confirm GVDTI's ability to discover potential candidate drug-related proteins. CONTACT zhang@hlju.edu.cn Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Mengsi Fan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
|
76
|
Xuan P, Hu K, Cui H, Zhang T, Nakaguchi T. Learning multi-scale heterogeneous representations and global topology for drug-target interaction prediction. IEEE J Biomed Health Inform 2021; 26:1891-1902. [PMID: 34673498 DOI: 10.1109/jbhi.2021.3121798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Identification of drug-target interactions (DTIs) plays a critical role in drug discovery and repositioning. Deep integration of inter-connections and intra-similarities between heterogeneous multi-source data related to drugs and targets, however, is a challenging issue. We propose a DTI prediction model by learning from drug and protein related multi-scale attributes and global topology formed by heterogeneous connections. A drug-protein-disease heterogeneous network (RPD-Net) is firstly constructed to associate diverse similarities, interactions and associations across nodes. Secondly, we propose a multi-scale pairwise deep representation learning module consisting of a new embedding strategy to integrate diverse inter-relations and intra-relations, and dilation convolutions for multi-scale deep representation extraction. A global topology learning module is proposed which is composed of strategy based on non-negative matrix factorization (NMF) to extract topology from RPD-Net, and a new relational-level attention mechanism for discriminative topology embedding. Experimental results using public dataset demonstrate improved performance over state-of-the-art methods and contributions of our major innovations. Evaluation results by top k recall rates and case studies on five drugs further show the effectiveness in retrieving potential target candidates for drugs.
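The top-k recall rate used in the evaluation above can be sketched as follows: for one drug, take the k highest-ranked candidate proteins and measure what fraction of the drug's known targets they contain. The function name and the toy protein identifiers are hypothetical; the abstract does not give the authors' exact evaluation code.

```python
def recall_at_k(ranked_candidates, true_targets, k):
    """Fraction of known targets that appear among the top-k candidates.

    ranked_candidates: candidate proteins, best first.
    true_targets: collection of proteins known to interact with the drug.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    targets = set(true_targets)
    if not targets:
        raise ValueError("true_targets must not be empty")
    retrieved = set(ranked_candidates[:k])
    return len(retrieved & targets) / len(targets)

# Hypothetical toy ranking for a single drug.
toy_ranking = ["P1", "P7", "P3", "P9", "P2", "P5"]
toy_targets = {"P3", "P5", "P8"}

recall_top3 = recall_at_k(toy_ranking, toy_targets, 3)  # only P3 is retrieved
recall_top6 = recall_at_k(toy_ranking, toy_targets, 6)  # P3 and P5 are retrieved
```

A per-drug value like this is usually averaged over all test drugs; P8 never appears in the ranking, so recall for this toy drug cannot reach 1.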
|
77
|
Wang XR, Cao TT, Jia CM, Tian XM, Wang Y. Quantitative prediction model for affinity of drug-target interactions based on molecular vibrations and overall system of ligand-receptor. BMC Bioinformatics 2021; 22:497. [PMID: 34649499 PMCID: PMC8515642 DOI: 10.1186/s12859-021-04389-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/15/2021] [Accepted: 09/20/2021] [Indexed: 12/27/2022]
Abstract
Background The study of drug–target interaction (DTI) affinity plays an important role in safety assessment and pharmacology. Currently, quantitative structure–activity relationship (QSAR) modeling and molecular docking (MD) are the most common methods in DTI affinity research. However, such models are often built for one specific target or a few targets, and most QSAR and MD methods are based on either the structure of the drug molecules or the structure of the receptors, which gives low accuracy and a narrow scope of application. Constructing quantitative prediction models with high accuracy and wide applicability remains a challenge. To this end, this paper screened molecular descriptors based on molecular vibrations and treated the molecule-target pair as a whole system to construct highly accurate, widely applicable prediction models based on the dissociation constant (Kd) and the concentration for 50% of maximal effect (EC50), providing a reference for quantifying DTI affinity. Results After comprehensive comparison, the results showed that RF models are the optimal models for analyzing and predicting DTI affinity, with coefficients of determination (R²) all greater than 0.94. Compared with the quantitative models reported in the literature, the RF models developed in this paper have higher accuracy and wider applicability. In addition, E-state molecular descriptors associated with molecular vibrations and the normalized Moreau-Broto autocorrelation (G3), Moran autocorrelation (G4) and transition-distribution (G7) protein descriptors are of higher importance in the quantification of DTIs. Conclusion By screening molecular descriptors based on molecular vibrations and treating the molecule-target pair as a whole system, we obtained optimal RF-based models that are more accurate and more widely applicable, which indicates that selecting molecular descriptors associated with molecular vibrations and using the molecule-target pair as a whole system are reliable ways to improve model performance. This can provide a reference for quantifying DTI affinity. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04389-w.
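The coefficient of determination reported in the abstract above measures how much of the variance in the measured affinities a regression model explains. A minimal sketch of the usual definition, R² = 1 − SS_res / SS_tot, follows; the function name and the toy affinity values are made up for illustration and are not data from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) < 2:
        raise ValueError("at least two observations are required")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot

# Hypothetical measured vs. predicted affinities (arbitrary units).
toy_measured = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.1, 1.9, 3.2, 3.8]
toy_r2 = r_squared(toy_measured, toy_predicted)
```

Here the residual sum of squares is 0.10 and the total sum of squares is 5.0, so R² is 0.98. A value near 1 on held-out data indicates a close fit, while the same value on training data alone says little about generalization.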
Affiliation(s)
- Xian-Rui Wang
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Ting-Ting Cao
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Cong Min Jia
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Xue-Mei Tian
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Yun Wang
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
|
78
|
Thafar MA, Olayan RS, Albaradei S, Bajic VB, Gojobori T, Essack M, Gao X. DTi2Vec: Drug-target interaction prediction using network embedding and ensemble learning. J Cheminform 2021; 13:71. [PMID: 34551818 PMCID: PMC8459562 DOI: 10.1186/s13321-021-00552-w] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 12/24/2020] [Accepted: 09/05/2021] [Indexed: 11/21/2022]
Abstract
Drug-target interaction (DTI) prediction is a crucial step in drug discovery and repositioning as it reduces experimental validation costs if done right. Thus, developing in-silico methods to predict potential DTI has become a competitive research niche, with one of its main focuses being improving the prediction accuracy. Using machine learning (ML) models for this task, specifically network-based approaches, is effective and has shown great advantages over the other computational methods. However, ML model development involves upstream hand-crafted feature extraction and other processes that impact prediction accuracy. Thus, network-based representation learning techniques that provide automated feature extraction combined with traditional ML classifiers dealing with downstream link prediction tasks may be better-suited paradigms. Here, we present such a method, DTi2Vec, which identifies DTIs using network representation learning and ensemble learning techniques. DTi2Vec constructs the heterogeneous network, and then it automatically generates features for each drug and target using the nodes embedding technique. DTi2Vec demonstrated its ability in drug-target link prediction compared to several state-of-the-art network-based methods, using four benchmark datasets and large-scale data compiled from DrugBank. DTi2Vec showed a statistically significant increase in the prediction performances in terms of AUPR. We verified the "novel" predicted DTIs using several databases and scientific literature. DTi2Vec is a simple yet effective method that provides high DTI prediction performance while being scalable and efficient in computation, translating into a powerful drug repositioning tool.
Affiliation(s)
- Maha A Thafar
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- College of Computers and Information Technology, Computer Science Department, Taif University, Taif, Kingdom of Saudi Arabia
| | - Rawan S Olayan
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Somayah Albaradei
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
| | - Vladimir B Bajic
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Takashi Gojobori
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Magbubah Essack
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
| |
|
79
|
Yue Y, He S. DTI-HeNE: a novel method for drug-target interaction prediction based on heterogeneous network embedding. BMC Bioinformatics 2021; 22:418. [PMID: 34479477 PMCID: PMC8414716 DOI: 10.1186/s12859-021-04327-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/22/2021] [Accepted: 08/02/2021] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Prediction of the drug-target interaction (DTI) is a critical step in the drug repurposing process, which can effectively reduce the subsequent workload for experimental verification of potential drugs' properties. In recent studies, many machine-learning-based methods have been proposed to discover unknown interactions between drugs and protein targets. A recent trend is to use graph-based machine learning, e.g., graph embedding to extract features from drug-target networks and then predict new drug-target interactions. However, most of the graph embedding methods are not specifically designed for DTI predictions; thus, it is difficult for these methods to fully utilize the heterogeneous information of drugs and targets (e.g., the respective vertex features of drugs and targets and path-based interactive features between drugs and targets). RESULTS We propose a DTI prediction method DTI-HeNE (DTI based on Heterogeneous Network Embedding), which is specifically designed to cope with the bipartite DTI relations for generating high-quality embeddings of drug-target pairs. This method splits a heterogeneous DTI network into a bipartite DTI network, multiple drug homogeneous networks and target homogeneous networks, and extracts features from these sub-networks separately to better utilize the characteristics of bipartite DTI relations as well as the auxiliary similarity information related to drugs and targets. The features extracted from each sub-network are integrated using pathway information between these sub-networks to acquire new features, i.e., embedding vectors of drug-target pairs. Finally, these features are fed into a random forest (RF) model to predict novel DTIs. CONCLUSIONS Our experimental results show that the proposed DTI network embedding method can learn higher-quality features of heterogeneous drug-target interaction networks for novel DTI discovery.
Affiliation(s)
- Yang Yue
- College of Information and Electrical Engineering, China Agricultural University, Beijing, 100083, China
| | - Shan He
- Centre for Computational Biology, School of Computer Science, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK.
| |
|
80
|
Chen C, Shi H, Jiang Z, Salhi A, Chen R, Cui X, Yu B. DNN-DTIs: Improved drug-target interactions prediction using XGBoost feature selection and deep neural network. Comput Biol Med 2021; 136:104676. [PMID: 34375902 DOI: 10.1016/j.compbiomed.2021.104676] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 03/13/2021] [Revised: 07/18/2021] [Accepted: 07/19/2021] [Indexed: 02/03/2023]
Abstract
Analysis and prediction of drug-target interactions (DTIs) play an important role in understanding drug mechanisms, as well as drug repositioning and design. Machine learning (ML)-based methods for DTIs prediction can mitigate the shortcomings of time-consuming and labor-intensive experimental approaches, while providing new ideas and insights for drug design. We propose a novel pipeline for predicting drug-target interactions, called DNN-DTIs. First, the target information is characterized by a number of features, namely, pseudo-amino acid composition, pseudo position-specific scoring matrix, conjoint triad composition, transition and distribution, Moreau-Broto autocorrelation, and structural features. The drug compounds are subsequently encoded using substructure fingerprints. Next, eXtreme gradient boosting (XGBoost) is used to determine the subset of non-redundant features of importance. The optimal balanced set of sample vectors is obtained by applying the synthetic minority oversampling technique (SMOTE). Finally, a DTIs predictor, DNN-DTIs, is developed based on a deep neural network (DNN) via a layer-by-layer learning scheme. Experimental results indicate that DNN-DTIs achieves better performance than other state-of-the-art predictors with ACC values of 98.78%, 98.60%, 97.98%, 98.24% and 98.00% on Enzyme, Ion Channels (IC), GPCR, Nuclear Receptors (NR) and Kuang's datasets. Therefore, the accurate prediction performance of DNN-DTIs makes it a favored choice for contributing to the study of DTIs, especially drug repositioning.
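The class-balancing step mentioned in the abstract above can be illustrated with a minimal SMOTE-style sketch. This is not the authors' code: the function name `smote_like_oversample`, the toy points, and the parameters `k` and `seed` are assumptions made for illustration. The idea shown is the core of SMOTE: each synthetic minority sample is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a base minority sample by index so duplicates are handled safely.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class feature vectors (corners of the unit square).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=5)
```

In practice a maintained implementation would normally be preferred over a hand-written one; the sketch only makes the interpolation step concrete.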
Affiliation(s)
- Cheng Chen
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China; School of Computer Science and Technology, Shandong University, Qingdao, 266237, China
| | - Han Shi
- Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032, China
| | - Zhiwen Jiang
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Adil Salhi
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Ruixin Chen
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Xuefeng Cui
- School of Computer Science and Technology, Shandong University, Qingdao, 266237, China
| | - Bin Yu
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China; Key Laboratory of Computational Science and Application of Hainan Province, Haikou, 571158, China.
| |
|
81
|
Pliakos K, Vens C, Tsoumakas G. Predicting Drug-Target Interactions With Multi-Label Classification and Label Partitioning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:1596-1607. [PMID: 31689203 DOI: 10.1109/tcbb.2019.2951378] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 06/10/2023]
Abstract
Identifying drug-target interactions is crucial for drug discovery. Despite modern technologies used in drug screening, experimental identification of drug-target interactions is an extremely demanding task. Predicting drug-target interactions in silico can thereby facilitate drug discovery as well as drug repositioning. Various machine learning models have been developed over the years to predict such interactions. Multi-output learning models in particular have drawn the attention of the scientific community due to their high predictive performance and computational efficiency. These models are based on the assumption that all the labels are correlated with each other. However, this assumption is too optimistic. Here, we address drug-target interaction prediction as a multi-label classification task that is combined with label partitioning. We show that building multi-output learning models over groups (clusters) of labels often leads to superior results. The performed experiments confirm the efficiency of the proposed framework.
|
82
|
Sun C, Cao Y, Wei JM, Liu J. Autoencoder-based Drug-Target Interaction Prediction by Preserving the Consistency of Chemical Properties and Functions of Drugs. Bioinformatics 2021; 37:3618-3625. [PMID: 34019069 DOI: 10.1093/bioinformatics/btab384] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/26/2020] [Revised: 05/06/2021] [Accepted: 05/18/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Exploring the potential drug-target interactions (DTIs) is a key step in drug discovery and repurposing. In recent years, predicting the probable DTIs through computational methods has gradually become a research hot spot. However, most of the previous studies failed to judiciously take into account the consistency between the chemical properties of drug and its functions. The changes of these relationships may lead to a severely negative effect on the prediction of DTIs. RESULTS We propose an autoencoder-based method, AEFS, under spatial consistency constraints to predict DTIs. A heterogeneous network is established to integrate the information of drugs, proteins and diseases. The original drug features are projected to an embedding (protein) space by a multi-layer encoder, and further projected into label (disease) space by a decoder. In this process, the clinical information of drugs is introduced to assist the DTI prediction. By maintaining the distribution of drug correlation in the original feature, embedding and label space, AEFS keeps the consistency between chemical properties and functions of drugs. Experimental comparisons indicate that AEFS is more robust for imbalanced data and of significantly superior performance in DTI prediction. Case studies further confirm its ability to mine the latent drug-target interactions. AVAILABILITY The code of AEFS is available at https://github.com/JackieSun818/AEFS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chang Sun
- College of Computer Science, Nankai University, Tianjin, 300071, China; Institute of Big Data, Nankai University, Tianjin, 300071, China
| | - Yangkun Cao
- School of Artificial Intelligence, Jilin University, Changchun, 130012, China
| | - Jin-Mao Wei
- College of Computer Science, Nankai University, Tianjin, 300071, China; Institute of Big Data, Nankai University, Tianjin, 300071, China
| | - Jian Liu
- College of Computer Science, Nankai University, Tianjin, 300071, China; Institute of Big Data, Nankai University, Tianjin, 300071, China
| |
|
83
|
Deng D, Chen X, Zhang R, Lei Z, Wang X, Zhou F. XGraphBoost: Extracting Graph Neural Network-Based Features for a Better Prediction of Molecular Properties. J Chem Inf Model 2021; 61:2697-2705. [PMID: 34009965 DOI: 10.1021/acs.jcim.0c01489] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Indexed: 01/02/2023]
Abstract
Determining the properties of chemical molecules is essential for screening candidates similar to a specific drug. These candidate molecules are further evaluated for their target binding affinities, side effects, target missing probabilities, etc. Conventional machine learning algorithms demonstrated satisfying prediction accuracies of molecular properties. A molecule cannot be directly loaded into a machine learning model, and a set of engineered features needs to be designed and calculated from a molecule. Such hand-crafted features rely heavily on the experiences of the investigating researchers. The concept of graph neural networks (GNNs) was recently introduced to describe the chemical molecules. The features may be automatically and objectively extracted from the molecules through various types of GNNs, e.g., GCN (graph convolution network), GGNN (gated graph neural network), DMPNN (directed message passing neural network), etc. However, the training of a stable GNN model requires a huge number of training samples and a large amount of computing power, compared with the conventional machine learning strategies. This study proposed the integrated framework XGraphBoost to extract the features using a GNN and build an accurate prediction model of molecular properties using the classifier XGBoost. The proposed framework XGraphBoost fully inherits the merits of the GNN-based automatic molecular feature extraction and XGBoost-based accurate prediction performance. Both classification and regression problems were evaluated using the framework XGraphBoost. The experimental results strongly suggest that XGraphBoost may facilitate the efficient and accurate predictions of various molecular properties. The source code is freely available to academic users at https://github.com/chenxiaowei-vincent/XGraphBoost.git.
Affiliation(s)
- Daiguo Deng
- Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China
| | - Xiaowei Chen
- Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China
| | - Ruochi Zhang
- Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China; College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, P.R. China
| | - Zengrong Lei
- Fermion Technology Co., Ltd., Guangzhou, Guangdong 510000, P.R. China
| | - Xiaojian Wang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P.R. China
| | - Fengfeng Zhou
- College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, P.R. China
| |
|
84
|
Cao K, Xiao Y, Hou M. Correlation-driven framework based on graph convolutional network for clinical disease classification. J STAT COMPUT SIM 2021. [DOI: 10.1080/00949655.2021.1921777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Kai Cao
- School of Mathematics and Statistics, Central South University, Changsha, People’s Republic of China
| | - Ying Xiao
- The Second Xiangya Hospital, Central South University, Changsha, People’s Republic of China
| | - Muzhou Hou
- School of Mathematics and Statistics, Central South University, Changsha, People’s Republic of China
| |
|
85
|
Yang Z, Zhong W, Zhao L, Chen CYC. ML-DTI: Mutual Learning Mechanism for Interpretable Drug-Target Interaction Prediction. J Phys Chem Lett 2021; 12:4247-4261. [PMID: 33904745 DOI: 10.1021/acs.jpclett.1c00867] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Indexed: 05/11/2023]
Abstract
Deep learning (DL) provides opportunities for the identification of drug-target interactions (DTIs). The challenges of applying DL lie primarily with the lack of interpretability. Also, most of the existing DL-based methods formulate the drug and target encoder as two independent modules without considering the relationship between them. In this study, we propose a mutual learning mechanism to bridge the gap between the two encoders. We formulated the DTI problem from a global perspective by inserting mutual learning layers between the two encoders. The mutual learning layer was achieved by multihead attention and position-aware attention. The neural attention mechanism also provides effective visualization, which makes it easier to analyze a model. We evaluated our approach using three benchmark kinase data sets under different experimental settings and compared the proposed method to three baseline models. We found that the four methods yielded similar results in the random split setting (training and test sets share common drugs and targets), while the proposed method increases the predictive performance significantly in the orphan-target and orphan-drug split setting (training and test sets share only targets or drugs). The experimental results demonstrated that the proposed method improved the generalization and interpretation capability of DTI modeling.
Affiliation(s)
- Ziduo Yang
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
| | - Weihe Zhong
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
| | - Lu Zhao
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
- Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510655, China
| | - Calvin Yu-Chian Chen
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
- Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
| |
|
86
|
Li S, Wang Y, Li C, Yang N, Yu H, Zhou W, Chen S, Yang S, Li Y. Study on Hepatotoxicity of Rhubarb Based on Metabolomics and Network Pharmacology. DRUG DESIGN DEVELOPMENT AND THERAPY 2021; 15:1883-1902. [PMID: 33976539 PMCID: PMC8106470 DOI: 10.2147/dddt.s301417] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/13/2021] [Accepted: 04/13/2021] [Indexed: 12/12/2022]
Abstract
Background Rhubarb, as a traditional Chinese medicine, is the preferred drug for the treatment of stagnation and constipation in clinical practice. It has been reported that rhubarb possesses hepatotoxicity, but its mechanism in vivo is still unclear. Methods In this study, the chemical components in rhubarb were identified based on UPLC-Q-TOF/MS combined with data postprocessing technology. The metabolic biomarkers obtained through metabolomics technology were related to rhubarb-induced hepatotoxicity. Furthermore, the potential targets of rhubarb-induced hepatotoxicity were obtained by network pharmacology involving the above components and metabolites. Meanwhile, GO gene enrichment analysis and KEGG pathway analysis were performed on the common targets. Results Twenty-eight components in rhubarb were identified based on UPLC-Q-TOF/MS, and 242 targets related to rhubarb ingredients were predicted. Nine metabolic biomarkers obtained through metabolomics technology were closely related to rhubarb-induced hepatotoxicity, and 282 targets of metabolites were predicted. Among them, the levels of 4 metabolites, namely dynorphin B (10–13), cervonoyl ethanolamide, lysoPE (18:2), and 3-hydroxyphenyl 2-hydroxybenzoate, significantly increased, while the levels of 5 metabolites, namely dopamine, biopterin, choline, coenzyme Q9 and P1, P4-bis (5ʹ-uridyl) tetraphosphate significantly decreased. In addition, 166 potential targets of rhubarb-induced hepatotoxicity were obtained by network pharmacology. The KEGG pathway analysis was performed on the common targets to obtain 46 associated signaling pathways. 
Conclusion These data suggested that rhubarb may cause liver toxicity due to its action on dopamine D1 receptor (DRD1), dopamine D2 receptor (DRD2), phosphodiesterase 4B (PDE4B), vanilloid receptor (TRPV1), transient receptor potential cation channel subfamily M member 8 (TRPM8), prostanoid EP2 receptor (PTGER2), acetylcholinesterase (ACHE), and muscarinic acetylcholine receptor M3 (CHRM3) through the cAMP signaling pathway, cholinergic synapses, and inflammatory mediators to regulate TRP channels. Metabolomics technology and network pharmacology were integrated to explore rhubarb hepatotoxicity to promote the reasonable clinical application of rhubarb.
Affiliation(s)
- Shanze Li
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Yuming Wang
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Chunyan Li
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Na Yang
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Hongxin Yu
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Wenjie Zhou
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Siyu Chen
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Shenshen Yang
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| | - Yubo Li
- Tianjin University of Traditional Chinese Medicine, Tianjin, People's Republic of China
| |
|
87
|
Sajadi SZ, Zare Chahooki MA, Gharaghani S, Abbasi K. AutoDTI++: deep unsupervised learning for DTI prediction by autoencoders. BMC Bioinformatics 2021; 22:204. [PMID: 33879050 PMCID: PMC8056558 DOI: 10.1186/s12859-021-04127-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/17/2021] [Accepted: 04/09/2021] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Drug-target interaction (DTI) plays a vital role in drug discovery. Identifying drug-target interactions through wet-lab experiments is costly, laborious, and time-consuming. Therefore, predicting drug-target interactions with computational methods is an essential task in the drug discovery process. Meanwhile, computational methods can reduce the search space by proposing potential drugs already validated in wet-lab experiments. Recently, deep learning-based methods for drug-target interaction prediction have received more attention. Traditionally, the performance of DTI prediction methods depends heavily on additional information, such as protein sequence and molecular structure of the drug, as well as deep supervised learning. RESULTS This paper proposes a method based on deep unsupervised learning for drug-target interaction prediction called AutoDTI++. The proposed method includes three steps. The first step is to pre-process the interaction matrix. Since the interaction matrix is sparse, we address its sparsity using drug fingerprints. Then, in the second step, the AutoDTI approach is introduced. In the third step, we post-process the output of the AutoDTI model. CONCLUSIONS Experimental results show that the proposed method improves prediction performance. To this end, the proposed method has been compared to other algorithms using the same reference datasets. Five repetitions of tenfold cross-validation on the gold-standard datasets (Nuclear Receptors, GPCRs, Ion channels, and Enzymes) show that the proposed method achieves good performance with high accuracy.
Affiliation(s)
| | | | - Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Karim Abbasi
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| |
|
88
|
Xuan P, Zhang Y, Cui H, Zhang T, Guo M, Nakaguchi T. Integrating multi-scale neighbouring topologies and cross-modal similarities for drug-protein interaction prediction. Brief Bioinform 2021; 22:6220173. [PMID: 33839743 DOI: 10.1093/bib/bbab119] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/09/2021] [Revised: 02/15/2021] [Accepted: 03/12/2021] [Indexed: 01/02/2023] Open
Abstract
MOTIVATION Identifying the proteins that interact with drugs can reduce the cost and time of drug development. Existing computerized methods focus on integrating drug-related and protein-related data from multiple sources to predict candidate drug-target interactions (DTIs). However, multi-scale neighboring node sequences and various kinds of drug and protein similarities are neither fully explored nor considered in decision making. RESULTS We propose a drug-target interaction prediction method, DTIP, to encode and integrate multi-scale neighbouring topologies, multiple kinds of similarities, associations, interactions related to drugs and proteins. We firstly construct a three-layer heterogeneous network to represent interactions and associations across drug, protein, and disease nodes. Then a learning framework based on fully-connected autoencoder is proposed to learn the nodes' low-dimensional feature representations within the heterogeneous network. Secondly, multi-scale neighbouring sequences of drug and protein nodes are formulated by random walks. A module based on bidirectional gated recurrent unit is designed to learn the neighbouring sequential information and integrate the low-dimensional features of nodes. Finally, we propose attention mechanisms at feature level, neighbouring topological level and similarity level to learn more informative features, topologies and similarities. The prediction results are obtained by integrating neighbouring topologies, similarities and feature attributes using a multiple layer CNN. Comprehensive experimental results over public dataset demonstrated the effectiveness of our innovative features and modules. Comparison with other state-of-the-art methods and case studies of five drugs further validated DTIP's ability in discovering the potential candidate drug-related proteins.
Affiliation(s)
- Ping Xuan
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
| | - Yu Zhang
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
| | - Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
| | - Tiangang Zhang
- School of Mathematical Science, Heilongjiang University, Harbin 150080, China
| | - Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
| | - Toshiya Nakaguchi
- Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
| |
|
89
|
Liu Z, Chen Q, Lan W, Pan H, Hao X, Pan S. GADTI: Graph Autoencoder Approach for DTI Prediction From Heterogeneous Network. Front Genet 2021; 12:650821. [PMID: 33912218 PMCID: PMC8072283 DOI: 10.3389/fgene.2021.650821] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/08/2021] [Accepted: 03/12/2021] [Indexed: 12/26/2022] Open
Abstract
Identifying drug–target interaction (DTI) is the basis for drug development. However, the method of using biochemical experiments to discover drug-target interactions has low coverage and high costs. Many computational methods have been developed to predict potential drug-target interactions based on known drug-target interactions, but the accuracy of these methods still needs to be improved. In this article, a graph autoencoder approach for DTI prediction (GADTI) was proposed to discover potential interactions between drugs and targets using a heterogeneous network, which integrates diverse drug-related and target-related datasets. Its encoder consists of two components: a graph convolutional network (GCN) and a random walk with restart (RWR). And the decoder is DistMult, a matrix factorization model, using embedding vectors from encoder to discover potential DTIs. The combination of GCN and RWR can provide nodes with more information through a larger neighborhood, and it can also avoid over-smoothing and computational complexity caused by multi-layer message passing. Based on the 10-fold cross-validation, we conduct three experiments in different scenarios. The results show that GADTI is superior to the baseline methods in both the area under the receiver operator characteristic curve and the area under the precision–recall curve. In addition, based on the latest Drugbank dataset (V5.1.8), the case study shows that 54.8% of new approved DTIs are predicted by GADTI.
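Two of the building blocks named in the abstract above, random walk with restart (RWR) and the DistMult scoring function, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the GADTI implementation: the function names, the toy path graph, the restart probability of 0.5, and the example vectors are all hypothetical, and the GCN part of the encoder is left out entirely.

```python
def random_walk_with_restart(adj, seed, restart=0.5, iters=100, tol=1e-9):
    """Iterate p <- (1 - restart) * T p + restart * e_seed until it stabilises."""
    n = len(adj)
    # Column-normalise the adjacency matrix so each column sums to 1.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
             for i in range(n)]
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    p = start[:]
    for _ in range(iters):
        nxt = [(1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * start[i] for i in range(n)]
        converged = sum(abs(a - b) for a, b in zip(nxt, p)) < tol
        p = nxt
        if converged:
            break
    return p


def distmult_score(drug_vec, relation_diag, target_vec):
    """DistMult score: a bilinear form whose relation matrix is diagonal."""
    return sum(d * r * t for d, r, t in zip(drug_vec, relation_diag, target_vec))


# Hypothetical 4-node path graph 0-1-2-3; the walk restarts at node 0.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
probs = random_walk_with_restart(adj, seed=0)

# Hypothetical 2-dimensional embeddings and relation weights.
score = distmult_score([1.0, 2.0], [0.5, 0.5], [2.0, 1.0])
```

The RWR vector concentrates probability near the seed node while still reaching distant nodes, which is one way to widen a node's neighbourhood without stacking many message-passing layers.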
Affiliation(s)
- Zhixian Liu
- School of Medical, Guangxi University, Nanning, China.,School of Electronics and Information Engineering, Beibu Gulf University, Qinzhou, China
| | - Qingfeng Chen
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
| | - Wei Lan
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
| | - Haiming Pan
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
| | - Xinkun Hao
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
| | - Shirui Pan
- Department of Data Science and AI, Monash University, Melbourne, VIC, Australia
| |
|
90
|
Zheng Y, Wu Z. A Machine Learning-Based Biological Drug-Target Interaction Prediction Method for a Tripartite Heterogeneous Network. ACS OMEGA 2021; 6:3037-3045. [PMID: 33553921 PMCID: PMC7860102 DOI: 10.1021/acsomega.0c05377] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/06/2020] [Accepted: 12/25/2020] [Indexed: 06/12/2023]
Abstract
Drug repositioning is the identification of interactions between drugs and target proteins in pharmaceutical sciences. Traditional large-scale validation through chemical experiments is time-consuming and expensive, while drug repositioning can drastically decrease the cost and duration of traditional drug development. With the rapid advancement of high-throughput technologies and the explosion of various biological and medical data, computational drug repositioning methods have been used to systematically identify potential drug-target interactions. Some of them are based on a particular class of machine learning algorithms called kernel methods. In this paper, we propose a new machine learning prediction method that combines multiple kernels in a tripartite heterogeneous drug-target-disease interaction space in order to integrate multiple sources of biological information simultaneously. This novel network algorithm extends the traditional drug-target interaction bipartite graph with a third, disease layer. Meanwhile, Gaussian kernel functions on heterogeneous networks and the regularized least squares method of the Kronecker product are used to predict new drug-target interactions. The values of AUPR (area under the precision-recall curve) and AUC (area under the receiver operating characteristic curve) of the proposed algorithm are significantly improved. In particular, the AUC values are improved to 0.99, 0.99, 0.97, and 0.96 on four benchmark data sets. These experimental results substantiate that the network topology can be used for predicting drug-target interactions.
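The two ingredients named in this abstract, Gaussian kernels built from interaction profiles and regularized least squares over a Kronecker product kernel, can be sketched for the plain bipartite drug–target case. This is not the authors' tripartite method; the function names, the bandwidth normalisation, and the regularisation value are assumptions. The sketch uses the standard eigendecomposition identity, so the Kronecker product kernel is never formed explicitly.

```python
import numpy as np

def gaussian_profile_kernel(profiles, gamma=1.0):
    """Gaussian kernel between the rows of a binary interaction-profile matrix."""
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    # Bandwidth scaled by the mean squared profile norm (an illustrative choice).
    bandwidth = gamma / max(sq_norms.mean(), 1e-12)
    return np.exp(-bandwidth * np.maximum(sq_dists, 0.0))

def kron_rls(k_drug, k_target, y, sigma=1.0):
    """Kronecker regularized least squares, solved via eigendecompositions.

    Equivalent to K (K + sigma I)^{-1} vec(Y) with K = kron(k_drug, k_target).
    """
    lam_d, v_d = np.linalg.eigh(k_drug)
    lam_t, v_t = np.linalg.eigh(k_target)
    lam = np.outer(lam_d, lam_t)       # eigenvalues of the Kronecker product kernel
    shrink = lam / (lam + sigma)       # spectral filter applied by the regulariser
    projected = v_d.T @ y @ v_t        # Y expressed in the joint eigenbasis
    return v_d @ (shrink * projected) @ v_t.T

# Toy interaction matrix (hypothetical): 4 drugs x 3 targets.
y = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 1, 0],
              [0, 0, 1]], dtype=float)
k_drug = gaussian_profile_kernel(y)
k_target = gaussian_profile_kernel(y.T)
pred = kron_rls(k_drug, k_target, y, sigma=0.5)
```

Entries of `pred` at positions where `y` is zero act as scores for candidate interactions. Extending this to a disease layer, as the abstract describes, would require additional kernels that this sketch does not model.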
Affiliation(s)
- Ying Zheng
- School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha 410000, China
| | - Zheng Wu
- School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha 410000, China
| |
|
91
|
Jarada TN, Rokne JG, Alhajj R. SNF-NN: computational method to predict drug-disease interactions using similarity network fusion and neural networks. BMC Bioinformatics 2021; 22:28. [PMID: 33482713 PMCID: PMC7821180 DOI: 10.1186/s12859-020-03950-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/12/2020] [Accepted: 12/22/2020] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Drug repositioning is an emerging approach in pharmaceutical research for identifying novel therapeutic potentials for approved drugs and discovering therapies for untreated diseases. Due to its time and cost efficiency, drug repositioning plays an instrumental role in optimizing the drug development process compared to the traditional de novo drug discovery process. Advances in genomics, together with the enormous growth of large-scale publicly available data and the availability of high-performance computing capabilities, have further motivated the development of computational drug repositioning approaches. More recently, the rise of machine learning techniques, together with the availability of powerful computers, has made computational drug repositioning an area of intense activity. RESULTS In this study, a novel framework SNF-NN based on deep learning is presented, where novel drug-disease interactions are predicted using drug-related similarity information, disease-related similarity information, and known drug-disease interactions. Heterogeneous similarity information related to drugs and diseases is fed to the proposed framework in order to predict novel drug-disease interactions. SNF-NN uses similarity selection, similarity network fusion, and a highly tuned novel neural network model to predict new drug-disease interactions. The robustness of SNF-NN is evaluated by comparing its performance with nine baseline machine learning methods. The proposed framework outperforms all baseline methods ([Formula: see text] = 0.867, and [Formula: see text]=0.876) using stratified 10-fold cross-validation. To further demonstrate the reliability and robustness of SNF-NN, two datasets are used to fairly validate the proposed framework's performance against seven recent state-of-the-art methods for drug-disease interaction prediction. 
SNF-NN achieves remarkable performance in stratified 10-fold cross-validation with [Formula: see text] ranging from 0.879 to 0.931 and [Formula: see text] from 0.856 to 0.903. Moreover, the efficiency of SNF-NN is verified by validating predicted unknown drug-disease interactions against clinical trials and published studies. CONCLUSION In conclusion, computational drug repositioning research can significantly benefit from integrating similarity measures in heterogeneous networks and deep learning models for predicting novel drug-disease interactions. The data and implementation of SNF-NN are available at http://pages.cpsc.ucalgary.ca/ tnjarada/snf-nn.php .
Affiliation(s)
- Tamer N Jarada
- Department of Computer Science, University of Calgary, Calgary, AB, Canada
| | - Jon G Rokne
- Department of Computer Science, University of Calgary, Calgary, AB, Canada
| | - Reda Alhajj
- Department of Computer Science, University of Calgary, Calgary, AB, Canada.
- Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey.
- Department of Health Informatics, University of Southern Denmark, Odense, Denmark.
| |
|
92
|
Knowledge-Graph-Based Drug Repositioning against COVID-19 by Graph Convolutional Network with Attention Mechanism. FUTURE INTERNET 2021. [DOI: 10.3390/fi13010013] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/06/2023] Open
Abstract
The current global crisis caused by COVID-19 almost halted normal life in most parts of the world. Due to the long development cycle for new drugs, drug repositioning becomes an effective method of screening drugs for COVID-19. To find suitable drugs for COVID-19, we add COVID-19-related information into our medical knowledge graph and utilize a knowledge-graph-based drug repositioning method to screen potential therapeutic drugs for COVID-19. The specific steps are as follows. First, information about COVID-19 is collected from the latest published literature, and gene targets of COVID-19 are added to the knowledge graph. Then, the COVID-19 information in the knowledge graph is extracted and a drug–disease interaction prediction model based on a Graph Convolutional Network with Attention (Att-GCN) is established. Att-GCN is used to extract features from the knowledge graph, and the prediction matrix is reconstructed through matrix operations. We evaluate the model by predicting drugs for both ordinary diseases and COVID-19. The model achieves an area under the curve (AUC) of 0.954 and an area under the precision-recall curve (AUPR) of 0.851 for ordinary diseases. In the drug repositioning experiment for COVID-19, five drugs predicted by the model have proved effective in clinical treatment. The experimental results confirm that the model can predict drug–disease interactions effectively for both ordinary diseases and COVID-19.
|
93
|
Jarada TN, Rokne JG, Alhajj R. SNF–CVAE: Computational method to predict drug–disease interactions using similarity network fusion and collective variational autoencoder. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106585] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 10/23/2022]
|
94
|
Gao D, Chen Q, Zeng Y, Jiang M, Zhang Y. Applications of Machine Learning in Drug Target Discovery. Curr Drug Metab 2020; 21:790-803. [PMID: 32723266 DOI: 10.2174/1567201817999200728142023] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/19/2020] [Revised: 03/12/2020] [Accepted: 05/13/2020] [Indexed: 12/15/2022]
Abstract
Drug target discovery is a critical step in drug development. It is the basis of modern drug development because it determines the target molecules related to specific diseases in advance. Predicting drug targets by computational methods saves a great deal of financial and material resources compared to in vitro experiments. Therefore, several computational methods for drug target discovery have been designed. Recently, machine learning (ML) methods in biomedicine have developed rapidly. In this paper, we present an overview of drug target discovery methods based on machine learning. Considering that some machine learning methods integrate network analysis to predict drug targets, network-based methods are also introduced in this article. Finally, the challenges and future outlook of drug target discovery are discussed.
Affiliation(s)
- Dongrui Gao
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Qingyuan Chen
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Yuanqi Zeng
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Meng Jiang
- School of Mechanical Automotive Engineering, Nanyang Institute of Technology, Nanyang 473000, China
| | - Yongqing Zhang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| |
|
95
|
Huang L, Luo H, Li S, Wu FX, Wang J. Drug-drug similarity measure and its applications. Brief Bioinform 2020; 22:5956929. [PMID: 33152756 DOI: 10.1093/bib/bbaa265] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/01/2020] [Revised: 09/13/2020] [Accepted: 09/14/2020] [Indexed: 02/01/2023] Open
Abstract
Drug similarities play an important role in modern biology and medicine, as they help scientists gain deep insights into drugs' therapeutic mechanisms and conduct wet-lab experiments that may significantly improve the efficiency of drug research and development. Nowadays, a number of drug-related databases have been constructed, with which many methods have been developed for computing similarities between drugs for studying associations between drugs, human diseases, proteins (drug targets) and more. In this review, firstly, we briefly introduce the publicly available drug-related databases. Secondly, based on different drug features, interaction relationships and multimodal data, we summarize similarity calculation methods in detail. Then, we discuss the applications of drug similarities in various biological and medical areas. Finally, we evaluate drug similarity calculation methods with common evaluation metrics to illustrate the important roles of drug similarity measures in different applications.
Affiliation(s)
- Lan Huang
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
| | - Huimin Luo
- School of Computer and Information Engineering at Henan University, Kaifeng, China
| | - Suning Li
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Fang-Xiang Wu
- College of Engineering and Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
| | - Jianxin Wang
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
| |
|
96
|
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022] Open
Abstract
Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. Machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, in a method named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples in comparison to the data set that has been widely used by most previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Xiaoqi Shan
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Tianhang Chen
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Mingming Jiang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| |
|
97
|
Drug-target interactions prediction using marginalized denoising model on heterogeneous networks. BMC Bioinformatics 2020; 21:330. [PMID: 32703151 PMCID: PMC7653902 DOI: 10.1186/s12859-020-03662-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/18/2020] [Accepted: 07/14/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Drugs achieve pharmacological functions by acting on target proteins. Identifying interactions between drugs and target proteins is an essential task in old drug repositioning and new drug discovery. To recommend new drug candidates and reposition existing drugs, computational approaches are commonly adopted. Compared with wet-lab experiments, computational approaches have a lower cost for drug discovery and provide effective guidance for the subsequent experimental verification. How to integrate different types of biological data and handle the sparsity of drug-target interaction data are still great challenges. RESULTS In this paper, we propose a novel drug-target interaction (DTI) prediction method incorporating a marginalized denoising model on heterogeneous networks with an association index kernel matrix and latent global association. The experimental results on benchmark datasets and newly compiled datasets indicate that, compared to other existing methods, our method achieves higher scores of AUC (area under the receiver operating characteristic curve) and larger values of AUPR (area under the precision-recall curve). CONCLUSIONS The performance improvement in our method depends on the association index kernel matrix and the latent global association. The association index kernel matrix calculates the sharing relationship between drugs and targets. The latent global associations address the false positive issue caused by network link sparsity. Our method can provide a useful approach to recommend new drug candidates and reposition existing drugs.
|
98
|
Thafar MA, Olayan RS, Ashoor H, Albaradei S, Bajic VB, Gao X, Gojobori T, Essack M. DTiGEMS+: drug-target interaction prediction using graph embedding, graph mining, and similarity-based techniques. J Cheminform 2020; 12:44. [PMID: 33431036 PMCID: PMC7325230 DOI: 10.1186/s13321-020-00447-2] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 12/10/2019] [Accepted: 06/16/2020] [Indexed: 12/14/2022] Open
Abstract
In silico prediction of drug–target interactions is a critical phase in the sustainable drug development process, especially when the research focus is to capitalize on the repositioning of existing drugs. However, developing such computational methods is not an easy task, but is much needed, as current methods that predict potential drug–target interactions suffer from high false-positive rates. Here we introduce DTiGEMS+, a computational method that predicts Drug–Target interactions using Graph Embedding, graph Mining, and Similarity-based techniques. DTiGEMS+ combines similarity-based as well as feature-based approaches, and models the identification of novel drug–target interactions as a link prediction problem in a heterogeneous network. DTiGEMS+ constructs the heterogeneous network by augmenting the known drug–target interactions graph with two complementary graphs, namely drug–drug similarity and target–target similarity. DTiGEMS+ combines different computational techniques to provide the final drug–target prediction; these techniques include graph embeddings, graph mining, and machine learning. DTiGEMS+ integrates multiple drug–drug similarities and target–target similarities into the final heterogeneous graph construction after applying a similarity selection procedure as well as a similarity fusion algorithm. Using four benchmark datasets, we show DTiGEMS+ substantially improves prediction performance compared to other state-of-the-art in silico methods developed to predict drug–target interactions by achieving the highest average AUPR across all datasets (0.92), which reduces the error rate by 33.3% relative to the second-best performing model in the state-of-the-art methods comparison.
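The closing claim combines an absolute score (average AUPR 0.92) with a relative error reduction (33.3%). The abstract does not define "error rate"; assuming it means 1 − AUPR, the short calculation below shows what these two figures would imply for the runner-up. The function name is hypothetical and the result is an inference under that assumption, not a number reported in the abstract.

```python
def implied_runner_up_score(best_score, relative_error_reduction):
    """Back out the runner-up score from the best score and a relative error reduction.

    Assumes error = 1 - score, which the abstract implies but does not state.
    """
    best_error = 1.0 - best_score
    # best_error = runner_up_error * (1 - reduction), solved for runner_up_error.
    runner_up_error = best_error / (1.0 - relative_error_reduction)
    return 1.0 - runner_up_error

runner_up_aupr = implied_runner_up_score(0.92, 0.333)
```

Under this reading, an error of 0.08 that is one third lower than the runner-up's corresponds to a runner-up error of about 0.12, that is, an average AUPR of about 0.88.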
Affiliation(s)
- Maha A Thafar
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,College of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
| | - Rawan S Olayan
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Haitham Ashoor
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Somayah Albaradei
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
| | - Vladimir B Bajic
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Takashi Gojobori
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Magbubah Essack
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
| |
|
99
|
Li X, Rousseau JF, Ding Y, Song M, Lu W. Understanding Drug Repurposing From the Perspective of Biomedical Entities and Their Evolution: Bibliographic Research Using Aspirin. JMIR Med Inform 2020; 8:e16739. [PMID: 32543442 PMCID: PMC7327595 DOI: 10.2196/16739] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/19/2019] [Revised: 01/08/2020] [Accepted: 03/31/2020] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Drug development is still a costly and time-consuming process with a low rate of success. Drug repurposing (DR) has attracted significant attention because of its significant advantages over traditional approaches in terms of development time, cost, and safety. Entitymetrics, defined as bibliometric indicators based on biomedical entities (eg, diseases, drugs, and genes) studied in the biomedical literature, make it possible for researchers to measure knowledge evolution and the transfer of drug research. OBJECTIVE The purpose of this study was to understand DR from the perspective of biomedical entities (diseases, drugs, and genes) and their evolution. METHODS In the work reported in this paper, we extended the bibliometric indicators of biomedical entities mentioned in PubMed to detect potential patterns of biomedical entities in various phases of drug research and investigate the factors driving DR. We used aspirin (acetylsalicylic acid) as the subject of the study since it can be repurposed for many applications. We propose 4 easy, transparent measures based on entitymetrics to investigate DR for aspirin: Popularity Index (P1), Promising Index (P2), Prestige Index (P3), and Collaboration Index (CI). RESULTS We found that the maxima of P1, P3, and CI are closely associated with the different repurposing phases of aspirin. These metrics enabled us to observe the way in which biomedical entities interacted with the drug during the various phases of DR and to analyze the potential driving factors for DR at the entity level. P1 and CI were indicative of the dynamic trends of a specific biomedical entity over a long time period, while P2 was more sensitive to immediate changes. P3 reflected the early signs of the practical value of biomedical entities and could be valuable for tracking the research frontiers of a drug. 
CONCLUSIONS In-depth studies of side effects and mechanisms, fierce market competition, and advanced life science technologies are driving factors for DR. This study showcases the way in which researchers can examine the evolution of DR using entitymetrics, an approach that can be valuable for enhancing decision making in the field of drug discovery and development.
Affiliation(s)
- Xin Li
- Information Retrieval and Knowledge Mining Laboratory, School of Information Management, Wuhan University, Wuhan, China.,School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, United States
| | - Justin F Rousseau
- Department of Population Health and Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX, United States
| | - Ying Ding
- School of Information, Dell Medical School, The University of Texas Austin, Austin, TX, United States
| | - Min Song
- Department of Library and Information Science, Yonsei University, Seoul, Republic of Korea
| | - Wei Lu
- Information Retrieval and Knowledge Mining Laboratory, School of Information Management, Wuhan University, Wuhan, China
| |
|
100
|
Zhao T, Hu Y, Valsdottir LR, Zang T, Peng J. Identifying drug-target interactions based on graph convolutional network and deep neural network. Brief Bioinform 2020; 22:2141-2150. [PMID: 32367110 DOI: 10.1093/bib/bbaa044] [Citation(s) in RCA: 147] [Impact Index Per Article: 29.4] [Received: 10/31/2019] [Revised: 03/05/2020] [Accepted: 03/06/2020] [Indexed: 12/21/2022] Open
Abstract
Identification of new drug-target interactions (DTIs) is an important but time-consuming and costly step in drug discovery. In recent years, to mitigate these drawbacks, researchers have sought to identify DTIs using computational approaches. However, most existing methods construct drug networks and target networks separately, and then predict novel DTIs based on known associations between the drugs and targets without accounting for associations between drug-protein pairs (DPPs). To incorporate the associations between DPPs into DTI modeling, we built a DPP network based on multiple drugs and proteins in which DPPs are the nodes and the associations between DPPs are the edges of the network. We then propose a novel learning-based framework, 'graph convolutional network (GCN)-DTI', for DTI identification. The model first uses a graph convolutional network to learn the features for each DPP. Second, using the feature representation as an input, it uses a deep neural network to predict the final label. The results of our analysis show that the proposed framework outperforms some state-of-the-art approaches by a large margin.
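The idea of treating each drug-protein pair as a node can be made concrete with a small NumPy sketch. It is not the authors' GCN-DTI code: the edge rule (two DPPs are linked when they share a drug or a protein), the function names, and the random features and weights are all assumptions for illustration. The sketch builds the pair graph and applies one symmetric-normalised graph convolution with a ReLU, whose output could then feed a downstream classifier.

```python
import numpy as np
from itertools import product

def build_dpp_graph(n_drugs, n_proteins):
    """Build a graph whose nodes are drug-protein pairs (DPPs).

    Assumed edge rule: two DPPs are connected if they share a drug or a protein.
    """
    pairs = list(product(range(n_drugs), range(n_proteins)))
    n = len(pairs)
    adj = np.zeros((n, n))
    for i, (d1, p1) in enumerate(pairs):
        for j, (d2, p2) in enumerate(pairs):
            if i != j and (d1 == d2 or p1 == p2):
                adj[i, j] = 1.0
    return pairs, adj

def gcn_layer(adj, features, weights):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degrees are >= 1 after self-loops
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weights, 0.0)

# Toy example (hypothetical): 2 drugs x 3 proteins gives 6 DPP nodes.
rng = np.random.default_rng(0)
pairs, adj = build_dpp_graph(2, 3)
features = rng.normal(size=(len(pairs), 5))   # placeholder per-DPP input features
weights = rng.normal(size=(5, 4))             # untrained layer weights
dpp_embeddings = gcn_layer(adj, features, weights)
```

Each row of `dpp_embeddings` mixes a pair's own features with those of its neighbouring pairs. A trained model would learn `weights` and pass the embeddings to a deep neural network for the final interaction label, as the abstract describes.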
Affiliation(s)
- Tianyi Zhao
- Department of Computer Science at Harbin Institute of Technology. He currently works as a bioinformatician in Beth Israel Deaconess Medical Center
| | - Yang Hu
- Department of Life Science at Harbin Institute of Technology. His expertise is bioinformatics
| | - Linda R Valsdottir
- MS in Biology and works as a scientific writer at the Smith Center for Outcomes Research in Cardiology at Beth Israel Deaconess Medical Center in Boston, MA. Her work is focused on helping researchers communicate their findings in an effort to translate novel analytical approaches and clinical expertise into improved outcomes for patients
| | - Tianyi Zang
- School of Computer Science and Technology at Harbin Institute of Technology (HIT), China. Before joining HIT in 2009, he was a research fellow at the Department of Computer Science at University of Oxford, UK. His current research is concerned with biomedical bigdata computing and algorithms, deep-learning algorithms for network data, intelligent recommendation algorithms, and modeling and analysis methods for complex systems
| | - Jiajie Peng
- School of Computer Science at Northwestern Polytechnical University. His expertise is computational biology and machine learning. Availability and implementation: https://github.com/zty2009/GCN-DNN/
| |
|