1
|
Lyu B, Gou W, Xu F, Chen L, Wang Z, Ren Z, Liu G, Li Y, Hou W. Target Discovery Driven by Chemical Biology and Computational Biology. CHEM REC 2025; 25:e202400182. [PMID: 39811950 DOI: 10.1002/tcr.202400182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2024] [Revised: 12/06/2024] [Indexed: 01/16/2025]
Abstract
Target identification is crucial for drug screening and development because it can reveal the mechanism of drug action and ensure the reliability and accuracy of the results. Chemical biology, an interdisciplinary field combining chemistry and biology, can assist in this process by studying the interactions between active molecular compounds and proteins and their physiological effects. It can also help predict potential drug targets or candidates, develop new biomarker assays and diagnostic reagents, and evaluate the selectivity and range of active compounds to reduce the risk of off-target effects. Chemical biology can achieve these goals using techniques such as changing protein thermal stability, enzyme sensitivity, and molecular structure and applying probes, isotope labeling and mass spectrometry. Concurrently, computational biology employs a diverse array of computational models to predict drug targets. This approach also offers innovative avenues for repurposing existing drugs. In this paper, we review the reported chemical biology and computational biology techniques for identifying different types of targets that can provide valuable insights for drug target discovery.
Collapse
Affiliation(s)
- Bohai Lyu
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Tianjin University of Traditional Chinese Medicine, Tianjin, 301617, China
| | - Wenfeng Gou
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
| | - Feifei Xu
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
| | - Leyuan Chen
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
| | - Zhiyun Wang
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
| | - Zhonghao Ren
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Department of Pharmacology, Shenyang Pharmaceutical University, 103 Wenhua Road, Shenhe District, Shenyang, 110016, China
| | - Gaiting Liu
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
- Tianjin University of Traditional Chinese Medicine, Tianjin, 301617, China
| | - Yiliang Li
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
| | - Wenbin Hou
- Institute of Radiation Medicine, Peking Union Medical College & Chinese Academy of Medical Sciences, Tianjin, 300192, China
| |
Collapse
|
2
|
Han GS, Gao Q, Peng LZ, Tang J. Hessian Regularized
L
2
,
1
-Nonnegative Matrix Factorization and Deep Learning for miRNA-Disease Associations Prediction. Interdiscip Sci 2024; 16:176-191. [PMID: 38099958 DOI: 10.1007/s12539-023-00594-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2023] [Revised: 11/05/2023] [Accepted: 11/07/2023] [Indexed: 02/22/2024]
Abstract
Since the identification of microRNAs (miRNAs), empirical research has demonstrated their crucial involvement in the functioning of organisms. Investigating miRNAs significantly bolsters efforts related to averting, diagnosing, and treating intricate human maladies. Yet, exploring every conceivable miRNA-disease association consumes significant resources and time within conventional wet experiments. On the computational front, forecasting potential miRNA-disease connections serves as a valuable source of preliminary insights for medical investigators. As a result, we have developed a novel matrix factorization model known as Hessian-regularizedL 2 , 1 nonnegative matrix factorization in combination with deep learning for predicting associations between miRNAs and diseases, denoted asH R L 2 , 1 -NMF-DF. In particular, we introduce a novel iterative fusion approach to integrate all similarities. This method effectively diminishes the sparsity of the initial miRNA-disease associations matrix. Additionally, we devise a mixed model framework that utilizes deep learning, matrix decomposition, and singular value decomposition to capture and depict the intricate nonlinear features of miRNA and disease. The prediction performance of the six matrix factorization methods is improved by comparison and analysis, similarity matrix fusion, data preprocessing, and parameter adjustment. The AUC and AUPR obtained by the new matrix factorization model under fivefold cross validation are comparative or better with other matrix factorization models. Finally, we select three diseases including lung tumor, bladder tumor and breast tumor for case analysis, and further extend the matrix factorization model based on deep learning. The results show that the hybrid algorithm combining matrix factorization with deep learning proposed in this paper can predict miRNAs related to different diseases with high accuracy.
Collapse
Affiliation(s)
- Guo-Sheng Han
- Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China.
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China.
| | - Qi Gao
- Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China
| | - Ling-Zhi Peng
- Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China
| | - Jing Tang
- Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, China
| |
Collapse
|
3
|
E Z, Qiao G, Wang G, Li Y. GSL-DTI: Graph structure learning network for Drug-Target interaction prediction. Methods 2024; 223:136-145. [PMID: 38360082 DOI: 10.1016/j.ymeth.2024.01.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2023] [Revised: 12/23/2023] [Accepted: 01/29/2024] [Indexed: 02/17/2024] Open
Abstract
MOTIVATION Drug-target interaction prediction is an important area of research to predict whether there is an interaction between a drug molecule and its target protein. It plays a critical role in drug discovery and development by facilitating the identification of potential drug candidates and expediting the overall process. Given the time-consuming, expensive, and high-risk nature of traditional drug discovery methods, the prediction of drug-target interactions has become an indispensable tool. Using machine learning and deep learning to tackle this class of problems has become a mainstream approach, and graph-based models have recently received much attention in this field. However, many current graph-based Drug-Target Interaction (DTI) prediction methods rely on manually defined rules to construct the Drug-Protein Pair (DPP) network during the DPP representation learning process. However, these methods fail to capture the true underlying relationships between drug molecules and target proteins. RESULTS We propose GSL-DTI, an automatic graph structure learning model used for predicting drug-target interactions (DTIs). Initially, we integrate large-scale heterogeneous networks using a graph convolution network based on meta-paths, effectively learning the representations of drugs and target proteins. Subsequently, we construct drug-protein pairs based on these representations. In contrast to previous studies that construct DPP networks based on manual rules, our method introduces an automatic graph structure learning approach. This approach utilizes a filter gate on the affinity scores of DPPs and relies on the classification loss of downstream tasks to guide the learning of the underlying DPP network structure. Based on the learned DPP network, we transform the prediction of drug-target interactions into a node classification problem. The comprehensive experiments conducted on three public datasets have shown the superiority of GSL-DTI in the tasks of DTI prediction. Additionally, GSL-DTI provides a fresh perspective for advancing research in graph structure learning for DTI prediction.
Collapse
Affiliation(s)
- Zixuan E
- College of Computer and Control Engineering, Northeast Forestry University,Harbin 150006, China
| | - Guanyu Qiao
- College of Computer and Control Engineering, Northeast Forestry University,Harbin 150006, China
| | - Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University,Harbin 150006, China.
| | - Yang Li
- College of Computer and Control Engineering, Northeast Forestry University,Harbin 150006, China.
| |
Collapse
|
4
|
Ma J, Li C, Zhang Y, Wang Z, Li S, Guo Y, Zhang L, Liu H, Gao X, Song J. MULGA, a unified multi-view graph autoencoder-based approach for identifying drug-protein interaction and drug repositioning. Bioinformatics 2023; 39:btad524. [PMID: 37610353 PMCID: PMC10518077 DOI: 10.1093/bioinformatics/btad524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2023] [Revised: 07/26/2023] [Accepted: 08/22/2023] [Indexed: 08/24/2023] Open
Abstract
MOTIVATION Identifying drug-protein interactions (DPIs) is a critical step in drug repositioning, which allows reuse of approved drugs that may be effective for treating a different disease and thereby alleviates the challenges of new drug development. Despite the fact that a great variety of computational approaches for DPI prediction have been proposed, key challenges, such as extendable and unbiased similarity calculation, heterogeneous information utilization, and reliable negative sample selection, remain to be addressed. RESULTS To address these issues, we propose a novel, unified multi-view graph autoencoder framework, termed MULGA, for both DPI and drug repositioning predictions. MULGA is featured by: (i) a multi-view learning technique to effectively learn authentic drug affinity and target affinity matrices; (ii) a graph autoencoder to infer missing DPI interactions; and (iii) a new "guilty-by-association"-based negative sampling approach for selecting highly reliable non-DPIs. Benchmark experiments demonstrate that MULGA outperforms state-of-the-art methods in DPI prediction and the ablation studies verify the effectiveness of each proposed component. Importantly, we highlight the top drugs shortlisted by MULGA that target the spike glycoprotein of severe acute respiratory syndrome coronavirus 2 (SAR-CoV-2), offering additional insights into and potentially useful treatment option for COVID-19. Together with the availability of datasets and source codes, we envision that MULGA can be explored as a useful tool for DPI prediction and drug repositioning. AVAILABILITY AND IMPLEMENTATION MULGA is publicly available for academic purposes at https://github.com/jianiM/MULGA/.
Collapse
Affiliation(s)
- Jiani Ma
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
| | - Chen Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
| | - Yiwen Zhang
- Climate, Air Quality Research Unit, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC 3004, Australia
| | - Zhikang Wang
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
| | - Shanshan Li
- Climate, Air Quality Research Unit, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC 3004, Australia
| | - Yuming Guo
- Climate, Air Quality Research Unit, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC 3004, Australia
| | - Lin Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
| | - Hui Liu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
| | - Xin Gao
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
| | - Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Wenzhou Medical University-Monash Biomedicine Discovery Institute (BDI) Alliance in Clinical and Experimental Biomedicine, Wenzhou 325035, China
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| |
Collapse
|
5
|
Zhang Y, Hu Y, Han N, Yang A, Liu X, Cai H. A survey of drug-target interaction and affinity prediction methods via graph neural networks. Comput Biol Med 2023; 163:107136. [PMID: 37329615 DOI: 10.1016/j.compbiomed.2023.107136] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2023] [Revised: 05/29/2023] [Accepted: 06/04/2023] [Indexed: 06/19/2023]
Abstract
The tasks of drug-target interaction (DTI) and drug-target affinity (DTA) prediction play important roles in the field of drug discovery. However, biological experiment-based methods are time-consuming and expensive. Recently, computational-based approaches have accelerated the process of drug-target relationship prediction. Drug and target features are represented in structure-based, sequence-based, and graph-based ways. Although some achievements have been made regarding structure-based representations and sequence-based representations, the acquired feature information is not sufficiently rich. Molecular graph-based representations are some of the more popular approaches, and they have also generated a great deal of interest. In this article, we provide an overview of the DTI prediction and DTA prediction tasks based on graph neural networks (GNNs). We briefly discuss the molecular graphs of drugs, the primary sequences of target proteins, and the graph reSLBpresentations of target proteins. Meanwhile, we conducted experiments on various fundamental datasets to substantiate the plausibility of DTI and DTA utilizing graph neural networks.
Collapse
Affiliation(s)
- Yue Zhang
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, 510665, China.
| | - Yuqing Hu
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, 510665, China
| | - Na Han
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, 510665, China
| | - Aqing Yang
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, 510665, China
| | - Xiaoyong Liu
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, 510665, China
| | - Hongmin Cai
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
| |
Collapse
|
6
|
Hu L, Fu C, Ren Z, Cai Y, Yang J, Xu S, Xu W, Tang D. SSELM-neg: spherical search-based extreme learning machine for drug-target interaction prediction. BMC Bioinformatics 2023; 24:38. [PMID: 36737694 PMCID: PMC9896467 DOI: 10.1186/s12859-023-05153-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2022] [Accepted: 01/18/2023] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND The experimental verification of a drug discovery process is expensive and time-consuming. Therefore, efficiently and effectively identifying drug-target interactions (DTIs) has been the focus of research. At present, many machine learning algorithms are used for predicting DTIs. The key idea is to train the classifier using an existing DTI to predict a new or unknown DTI. However, there are various challenges, such as class imbalance and the parameter optimization of many classifiers, that need to be solved before an optimal DTI model is developed. METHODS In this study, we propose a framework called SSELM-neg for DTI prediction, in which we use a screening approach to choose high-quality negative samples and a spherical search approach to optimize the parameters of the extreme learning machine. RESULTS The results demonstrated that the proposed technique outperformed other state-of-the-art methods in 10-fold cross-validation experiments in terms of the area under the receiver operating characteristic curve (0.986, 0.993, 0.988, and 0.969) and AUPR (0.982, 0.991, 0.982, and 0.946) for the enzyme dataset, G-protein coupled receptor dataset, ion channel dataset, and nuclear receptor dataset, respectively. CONCLUSION The screening approach produced high-quality negative samples with the same number of positive samples, which solved the class imbalance problem. We optimized an extreme learning machine using a spherical search approach to identify DTIs. Therefore, our models performed better than other state-of-the-art methods.
Collapse
Affiliation(s)
- Lingzhi Hu
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
| | - Chengzhou Fu
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China ,Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
| | - Zhonglu Ren
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
| | - Yongming Cai
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China ,Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
| | - Jin Yang
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China ,Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
| | - Siwen Xu
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
| | - Wenhua Xu
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China
| | - Deyu Tang
- grid.411847.f0000 0004 1804 4300School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, People’s Republic of China ,grid.79703.3a0000 0004 1764 3838School of Computer Science and Engineering, South China University of Technology, Guangzhou, People’s Republic of China ,Guangdong Province Precise Medicine Big Data of Traditional Chinese Medicine Engineering Technology Research Center, Guangzhou, People’s Republic of China
| |
Collapse
|
7
|
Azuma I, Mizuno T, Kusuhara H. NRBdMF: A Recommendation Algorithm for Predicting Drug Effects Considering Directionality. J Chem Inf Model 2023; 63:474-483. [PMID: 36635231 DOI: 10.1021/acs.jcim.2c01210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/14/2023]
Abstract
Predicting the novel effects of drugs based on information about approved drugs can be regarded as a recommendation system. Matrix factorization is one of the most used recommendation systems, and various algorithms have been devised for it. A literature survey and summary of existing algorithms for predicting drug effects demonstrated that most such methods, including neighborhood regularized logistic matrix factorization, which was the best performer in benchmark tests, used a binary matrix that considers only the presence or absence of interactions. However, drug effects are known to have two opposite aspects, such as side effects and therapeutic effects. In the present study, we proposed using neighborhood regularized bidirectional matrix factorization (NRBdMF) to predict drug effects by incorporating bidirectionality, which is a characteristic property of drug effects. We used this proposed method for predicting side effects using a matrix that considered the bidirectionality of drug effects, in which known side effects were assigned a positive (+1) label and known treatment effects were assigned a negative (-1) label. The NRBdMF model, which utilizes drug bidirectional information, achieved enrichment of side effects at the top and indications at the bottom of the prediction list. This first attempt to consider the bidirectional nature of drug effects using NRBdMF showed that it reduced false positives and produced a highly interpretable output.
Collapse
Affiliation(s)
- Iori Azuma
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo113-0033, Japan
| | - Tadahaya Mizuno
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo113-0033, Japan
| | - Hiroyuki Kusuhara
- Graduate School of Pharmaceutical Sciences, The University of Tokyo, Bunkyo-ku, Tokyo113-0033, Japan
| |
Collapse
|
8
|
Ouyang D, Liang Y, Wang J, Liu X, Xie S, Miao R, Ai N, Li L, Dang Q. Predicting multiple types of miRNA-disease associations using adaptive weighted nonnegative tensor factorization with self-paced learning and hypergraph regularization. Brief Bioinform 2022; 23:6720405. [PMID: 36168938 DOI: 10.1093/bib/bbac390] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 12/14/2022] Open
Abstract
More and more evidence indicates that the dysregulations of microRNAs (miRNAs) lead to diseases through various kinds of underlying mechanisms. Identifying the multiple types of disease-related miRNAs plays an important role in studying the molecular mechanism of miRNAs in diseases. Moreover, compared with traditional biological experiments, computational models are time-saving and cost-minimized. However, most tensor-based computational models still face three main challenges: (i) easy to fall into bad local minima; (ii) preservation of high-order relations; (iii) false-negative samples. To this end, we propose a novel tensor completion framework integrating self-paced learning, hypergraph regularization and adaptive weight tensor into nonnegative tensor factorization, called SPLDHyperAWNTF, for the discovery of potential multiple types of miRNA-disease associations. We first combine self-paced learning with nonnegative tensor factorization to effectively alleviate the model from falling into bad local minima. Then, hypergraphs for miRNAs and diseases are constructed, and hypergraph regularization is used to preserve the high-order complex relations of these hypergraphs. Finally, we innovatively introduce adaptive weight tensor, which can effectively alleviate the impact of false-negative samples on the prediction performance. The average results of 5-fold and 10-fold cross-validation on four datasets show that SPLDHyperAWNTF can achieve better prediction performance than baseline models in terms of Top-1 precision, Top-1 recall and Top-1 F1. Furthermore, we implement case studies to further evaluate the accuracy of SPLDHyperAWNTF. As a result, 98 (MDAv2.0) and 98 (MDAv2.0-2) of top-100 are confirmed by HMDDv3.2 dataset. Moreover, the results of enrichment analysis illustrate that unconfirmed potential associations have biological significance.
Collapse
Affiliation(s)
- Dong Ouyang
- Peng Cheng Laboratory, Shenzhen 518055, China.,School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
| | - Yong Liang
- Peng Cheng Laboratory, Shenzhen 518055, China
| | - Jianjun Wang
- School of Mathematics and Statistics, Southwest University, Chongqing 400715, China
| | - Xiaoying Liu
- Computer Engineering Technical College, Guangdong Polytechnic of Science and Technology, Zhuhai 519090, China
| | - Shengli Xie
- Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou 510000, China
| | - Rui Miao
- Basic Teaching Department, ZhuHai Campus of ZunYi Medical University, Zhuhai 519090, China
| | - Ning Ai
- School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
| | - Le Li
- School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
| | - Qi Dang
- School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China
| |
Collapse
|
9
|
Xia LY, Tang L, Huang H, Luo J. Identification of Potential Driver Genes and Pathways Based on Transcriptomics Data in Alzheimer's Disease. Front Aging Neurosci 2022; 14:752858. [PMID: 35401145 PMCID: PMC8985410 DOI: 10.3389/fnagi.2022.752858] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2021] [Accepted: 02/21/2022] [Indexed: 01/16/2023] Open
Abstract
Alzheimer's disease (AD) is one of the most common neurodegenerative diseases. To identify AD-related genes from transcriptomics and help to develop new drugs to treat AD. In this study, firstly, we obtained differentially expressed genes (DEG)-enriched coexpression networks between AD and normal samples in multiple transcriptomics datasets by weighted gene co-expression network analysis (WGCNA). Then, a convergent genomic approach (CFG) integrating multiple AD-related evidence was used to prioritize potential genes from DEG-enriched modules. Subsequently, we identified candidate genes in the potential genes list. Lastly, we combined deepDTnet and SAveRUNNER to predict interaction among candidate genes, drug and AD. Experiments on five datasets show that the CFG score of GJA1 is the highest among all potential driver genes of AD. Moreover, we found GJA1 interacts with AD from target-drugs-diseases network prediction. Therefore, candidate gene GJA1 is the most likely to be target of AD. In summary, identification of AD-related genes contributes to the understanding of AD pathophysiology and the development of new drugs.
Collapse
|
10
|
Drug-target interaction prediction via an ensemble of weighted nearest neighbors with interaction recovery. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02495-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
11
|
Wang Q, Zhou Y. FedSPL: federated self-paced learning for privacy-preserving disease diagnosis. Brief Bioinform 2021; 23:6454650. [PMID: 34874995 DOI: 10.1093/bib/bbab498] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2021] [Revised: 10/28/2021] [Accepted: 10/30/2021] [Indexed: 12/18/2022] Open
Abstract
The growing expansion of data availability in medical fields could help improve the performance of machine learning methods. However, with healthcare data, using multi-institutional datasets is challenging due to privacy and security concerns. Therefore, privacy-preserving machine learning methods are required. Thus, we use a federated learning model to train a shared global model, which is a central server that does not contain private data, and all clients maintain the sensitive data in their own institutions. The scattered training data are connected to improve model performance, while preserving data privacy. However, in the federated training procedure, data errors or noise can reduce learning performance. Therefore, we introduce the self-paced learning, which can effectively select high-confidence samples and drop high noisy samples to improve the performances of the training model and reduce the risk of data privacy leakage. We propose the federated self-paced learning (FedSPL), which combines the advantage of federated learning and self-paced learning. The proposed FedSPL model was evaluated on gene expression data distributed across different institutions where the privacy concerns must be considered. The results demonstrate that the proposed FedSPL model is secure, i.e. it does not expose the original record to other parties, and the computational overhead during training is acceptable. Compared with learning methods based on the local data of all parties, the proposed model can significantly improve the predicted F1-score by approximately 4.3%. We believe that the proposed method has the potential to benefit clinicians in gene selections and disease prognosis.
Collapse
Affiliation(s)
- Qingyong Wang
- Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, China
| | - Yun Zhou
- Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, China
| |
Collapse
|
12
|
Zhang Y, Jiang Z, Chen C, Wei Q, Gu H, Yu B. DeepStack-DTIs: Predicting Drug-Target Interactions Using LightGBM Feature Selection and Deep-Stacked Ensemble Classifier. Interdiscip Sci 2021; 14:311-330. [PMID: 34731411 DOI: 10.1007/s12539-021-00488-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2021] [Revised: 10/19/2021] [Accepted: 10/21/2021] [Indexed: 12/12/2022]
Abstract
Accurate prediction of drug-target interactions (DTIs), which is often used in the fields of drug discovery and drug repositioning, is regarded a key challenge in the study of drug science. In this paper, a new method called DeepStack-DTIs is proposed to predict DTIs. First, for the target protein, pseudo-position specific score matrix, pseudo amino acid composition and SPIDER3 are used to extract the different feature information of the target protein. Meanwhile, the path-based fingerprint features of each drug are extracted. Then, the synthetic minority oversampling technique (SMOTE) and light gradient boosting machine (LightGBM) are used for data balancing and feature selection, respectively. Finally, the processed features are input to the deep-stacked ensemble classifier composed of gated recurrent unit (GRU), deep neural network (DNN), support vector machine (SVM), eXtreme gradient boosting (XGBoost) and logistic regression (LR) to predict DTIs. Under the five-fold cross-validation and compared with existing methods, the proposed method achieves higher prediction accuracy on the gold standard dataset. To evaluate the predictive power of DeepStack-DTIs, we validate the method on another dataset and predict the drug-target interaction network. The results indicate that DeepStack-DTIs has excellent predictive ability than the other methods, and provides novel insights for the prediction of DTIs. A novel method DeepStack-DTIs for drug-target interactions prediction. PsePSSM, PseAAC, SPIDER3 and FP2 are fused to convert protein sequence and drug molecule information into digital information, respectively. The SMOTE algorithm is used to balance the dataset and LightGBM feature selection algorithm is employed to remove redundant and irrelevant features to select the optimal feature subset. This optimal feature subset is inputted into the deep-stacked ensemble classifier to predict drug-target interactions. The experimental results show DeepStack-DTIs method can significantly improve the prediction accuracy of drug-target interactions.
Collapse
Affiliation(s)
- Yan Zhang
- College of Mechanical and Electrical Engineering, Qingdao University of Science and Technology, Qingdao, 266061, China.,College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Zhiwen Jiang
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Cheng Chen
- School of Computer Science and Technology, Shandong University, Qingdao, 266237, China
| | - Qinqin Wei
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Haiming Gu
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China.,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Bin Yu
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China. .,Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China. .,Key Laboratory of Computational Science and Application of Hainan Province, Haikou, 571158, China.
| |
Collapse
|
13
|
Chen C, Shi H, Jiang Z, Salhi A, Chen R, Cui X, Yu B. DNN-DTIs: Improved drug-target interactions prediction using XGBoost feature selection and deep neural network. Comput Biol Med 2021; 136:104676. [PMID: 34375902 DOI: 10.1016/j.compbiomed.2021.104676] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2021] [Revised: 07/18/2021] [Accepted: 07/19/2021] [Indexed: 02/03/2023]
Abstract
Analysis and prediction of drug-target interactions (DTIs) play an important role in understanding drug mechanisms, as well as drug repositioning and design. Machine learning (ML)-based methods for DTIs prediction can mitigate the shortcomings of time-consuming and labor-intensive experimental approaches, while providing new ideas and insights for drug design. We propose a novel pipeline for predicting drug-target interactions, called DNN-DTIs. First, the target information is characterized by a number of features, namely, pseudo-amino acid composition, pseudo position-specific scoring matrix, conjoint triad composition, transition and distribution, Moreau-Broto autocorrelation, and structural features. The drug compounds are subsequently encoded using substructure fingerprints. Next, eXtreme gradient boosting (XGBoost) is used to determine the subset of non-redundant features of importance. The optimal balanced set of sample vectors is obtained by applying the synthetic minority oversampling technique (SMOTE). Finally, a DTIs predictor, DNN-DTIs, is developed based on a deep neural network (DNN) via a layer-by-layer learning scheme. Experimental results indicate that DNN-DTIs achieves better performance than other state-of-the-art predictors with ACC values of 98.78%, 98.60%, 97.98%, 98.24% and 98.00% on Enzyme, Ion Channels (IC), GPCR, Nuclear Receptors (NR) and Kuang's datasets. Therefore, the accurate prediction performance of DNN-DTIs makes it a favored choice for contributing to the study of DTIs, especially drug repositioning.
Collapse
Affiliation(s)
- Cheng Chen
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China; School of Computer Science and Technology, Shandong University, Qingdao, 266237, China
| | - Han Shi
- Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032, China
| | - Zhiwen Jiang
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Adil Salhi
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Ruixin Chen
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China
| | - Xuefeng Cui
- School of Computer Science and Technology, Shandong University, Qingdao, 266237, China
| | - Bin Yu
- College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao, 266061, China; Key Laboratory of Computational Science and Application of Hainan Province, Haikou, 571158, China.
| |
Collapse
|
14
|
Pei F, Shi Q, Zhang H, Bahar I. Predicting Protein-Protein Interactions Using Symmetric Logistic Matrix Factorization. J Chem Inf Model 2021; 61:1670-1682. [PMID: 33831302 DOI: 10.1021/acs.jcim.1c00173] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
Abstract
Accurate assessment of protein-protein interactions (PPIs) is critical to deciphering disease mechanisms and developing novel drugs, and with rapidly growing PPI data, the need for more efficient predictive methods is emerging. We propose here a symmetric logistic matrix factorization (symLMF)-based approach to predict PPIs, especially useful for large PPI networks. Benchmarked against two widely used datasets (Saccharomyces cerevisiae and Homo sapiens benchmarks) and their extended versions, the symLMF-based method proves to outperform most of the state-of-the-art data-driven methods applied to human PPIs, and it shows a performance comparable to those of deep learning methods despite its conceptual and technical simplicity and efficiency. Tests performed on humans, yeast, and tissue (brain and liver)- and disease (neurodegenerative and metabolic disorders)-specific datasets further demonstrate the high capability to capture the hidden interactions. Notably, many "de novo predictions" made by symLMF are verified to exist in PPI databases other than those used for training/testing the method, indicating that the method could be of broad utility as a simple, yet efficient and accurate, tool applicable to PPI datasets.
Collapse
Affiliation(s)
| | - Qingya Shi
- School of Medicine, Tsinghua University, Beijing 100084, China
| | | | | |
Collapse
|
15
|
Karaman Mayack B, Sippl W. Current In Silico Drug Repurposing Strategies. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11523-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022] Open
|
16
|
Thafar MA, Olayan RS, Ashoor H, Albaradei S, Bajic VB, Gao X, Gojobori T, Essack M. DTiGEMS+: drug-target interaction prediction using graph embedding, graph mining, and similarity-based techniques. J Cheminform 2020; 12:44. [PMID: 33431036 PMCID: PMC7325230 DOI: 10.1186/s13321-020-00447-2] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2019] [Accepted: 06/16/2020] [Indexed: 12/14/2022] Open
Abstract
In silico prediction of drug–target interactions is a critical phase in the sustainable drug development process, especially when the research focus is to capitalize on the repositioning of existing drugs. However, developing such computational methods is not an easy task, but is much needed, as current methods that predict potential drug–target interactions suffer from high false-positive rates. Here we introduce DTiGEMS+, a computational method that predicts Drug–Target interactions using Graph Embedding, graph Mining, and Similarity-based techniques. DTiGEMS+ combines similarity-based as well as feature-based approaches, and models the identification of novel drug–target interactions as a link prediction problem in a heterogeneous network. DTiGEMS+ constructs the heterogeneous network by augmenting the known drug–target interactions graph with two other complementary graphs namely: drug–drug similarity, target–target similarity. DTiGEMS+ combines different computational techniques to provide the final drug target prediction, these techniques include graph embeddings, graph mining, and machine learning. DTiGEMS+ integrates multiple drug–drug similarities and target–target similarities into the final heterogeneous graph construction after applying a similarity selection procedure as well as a similarity fusion algorithm. Using four benchmark datasets, we show DTiGEMS+ substantially improves prediction performance compared to other state-of-the-art in silico methods developed to predict of drug-target interactions by achieving the highest average AUPR across all datasets (0.92), which reduces the error rate by 33.3% relative to the second-best performing model in the state-of-the-art methods comparison.
Collapse
Affiliation(s)
- Maha A Thafar
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,Collage of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
| | - Rawan S Olayan
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Haitham Ashoor
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Somayah Albaradei
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
| | - Vladimir B Bajic
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Takashi Gojobori
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.,Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
| | - Magbubah Essack
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
| |
Collapse
|
17
|
Incorporating chemical sub-structures and protein evolutionary information for inferring drug-target interactions. Sci Rep 2020; 10:6641. [PMID: 32313024 PMCID: PMC7171114 DOI: 10.1038/s41598-020-62891-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2020] [Accepted: 03/12/2020] [Indexed: 01/29/2023] Open
Abstract
Accumulating evidence has shown that drug-target interactions (DTIs) play a crucial role in the process of genomic drug discovery. Although biological experimental technology has made great progress, the identification of DTIs is still very time-consuming and expensive nowadays. Hence it is urgent to develop in silico model as a supplement to the biological experiments to predict the potential DTIs. In this work, a new model is designed to predict DTIs by incorporating chemical sub-structures and protein evolutionary information. Specifically, we first use Position-Specific Scoring Matrix (PSSM) to convert the protein sequence into the numerical descriptor containing biological evolutionary information, then use Discrete Cosine Transform (DCT) algorithm to extract the hidden features and integrate them with the chemical sub-structures descriptor, and finally utilize Rotation Forest (RF) classifier to accurately predict whether there is interaction between the drug and the target protein. In the 5-fold cross-validation (CV) experiment, the average accuracy of the proposed model on the benchmark datasets of Enzymes, Ion Channels, GPCRs and Nuclear Receptors reached 0.9140, 0.8919, 0.8724 and 0.8111, respectively. In order to fully evaluate the performance of the proposed model, we compare it with different feature extraction model, classifier model, and other state-of-the-art models. Furthermore, we also implemented case studies. As a result, 8 of the top 10 drug-target pairs with the highest prediction score were confirmed by related databases. These excellent results indicate that the proposed model has outstanding ability in predicting DTIs and can provide reliable candidates for biological experiments.
Collapse
|