1
Basnet BB, Zhou ZY, Wei B, Wang H. Advances in AI-based strategies and tools to facilitate natural product and drug development. Crit Rev Biotechnol 2025:1-32. [PMID: 40159111 DOI: 10.1080/07388551.2025.2478094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2024] [Revised: 02/11/2025] [Accepted: 02/16/2025] [Indexed: 04/02/2025]
Abstract
Natural products and their derivatives have been important for treating diseases in humans, animals, and plants. However, discovering new structures from natural sources is still challenging. In recent years, artificial intelligence (AI) has greatly aided the discovery and development of natural products and drugs. AI helps to connect genetic data to chemical structures and vice versa, repurpose known natural products, predict metabolic pathways, and design and optimize metabolite biosynthesis. More recently, the emergence and improvement of neural networks such as deep learning, together with ensemble automated web-based bioinformatics platforms, have sped up the discovery process. Meanwhile, AI also improves the identification and structure elucidation of unknown compounds from raw data such as mass spectrometry and nuclear magnetic resonance. This article reviews these AI-driven methods and tools, highlighting their practical applications and offering guidance for efficient natural product discovery and drug development.
Affiliation(s)
- Buddha Bahadur Basnet
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
  - Central Department of Biotechnology, Tribhuvan University, Kathmandu, Nepal
- Zhen-Yi Zhou
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
- Bin Wei
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
- Hong Wang
  - College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, China
  - Key Laboratory of Marine Fishery Resources Exploitment, Utilization of Zhejiang Province, Zhejiang University of Technology, Hangzhou, China
2
Jiang W, Ye W, Tan X, Bao YJ. Network-based multi-omics integrative analysis methods in drug discovery: a systematic review. BioData Min 2025; 18:27. [PMID: 40155979 PMCID: PMC11954193 DOI: 10.1186/s13040-025-00442-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2024] [Accepted: 03/17/2025] [Indexed: 04/01/2025] Open
Abstract
The integration of multi-omics data from diverse high-throughput technologies has revolutionized drug discovery. While various network-based methods have been developed to integrate multi-omics data, systematic evaluation and comparison of these methods remain challenging. This review aims to analyze network-based approaches for multi-omics integration and evaluate their applications in drug discovery. We conducted a comprehensive review of the literature (2015-2024) on network-based multi-omics integration methods in drug discovery, and categorized the methods into four primary types: network propagation/diffusion, similarity-based approaches, graph neural networks, and network inference models. We also discussed the applications of the methods in three scenarios of drug discovery (drug target identification, drug response prediction, and drug repurposing), and finally evaluated the performance of the methods by highlighting their advantages and limitations in specific applications. While network-based multi-omics integration has shown promise in drug discovery, challenges remain in computational scalability, data integration, and biological interpretation. Future developments should focus on incorporating temporal and spatial dynamics, improving model interpretability, and establishing standardized evaluation frameworks.
Affiliation(s)
- Wei Jiang
  - School of Life Sciences, Hubei University, Wuhan, China
- Weicai Ye
  - School of Computer Science and Engineering, Guangdong Province Key Laboratory of Computational Science, National Engineering Laboratory for Big Data Analysis and Application, Sun Yat-sen University, Guangzhou, China
- Xiaoming Tan
  - School of Life Sciences, Hubei University, Wuhan, China
- Yun-Juan Bao
  - School of Life Sciences, Hubei University, Wuhan, China
  - No.368 Youyi Avenue, Wuhan, 430062, China
3
Sun L, Yin Z, Lu L. ISLRWR: A network diffusion algorithm for drug-target interactions prediction. PLoS One 2025; 20:e0302281. [PMID: 39883675 PMCID: PMC11781719 DOI: 10.1371/journal.pone.0302281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 04/01/2024] [Indexed: 02/01/2025] Open
Abstract
Machine learning techniques and computer-aided methods are now widely used in the pre-discovery tasks of drug discovery, effectively improving the efficiency of drug development and reducing the workload and cost. In this study, we used multi-source heterogeneous network information to build a network model, learned the network topology through multiple network diffusion algorithms, and obtained compressed low-dimensional feature vectors for predicting drug-target interactions (DTIs). We applied the Metropolis-Hastings random walk (MHRW) algorithm to improve the performance of the random walk with restart (RWR) algorithm, forming the basis by which the self-loop probability of the current node is removed. Additionally, the propagation efficiency of the MHRW was improved using the improved Metropolis-Hastings random walk (IMRWR) algorithm, facilitating deep sampling of the network. Finally, we proposed a correction of the transfer probability of the entire network after increasing the self-loop rate of isolated nodes to form the ISLRWR algorithm. Notably, the ISLRWR algorithm improved the area under the receiver operating characteristic curve (AUROC) by 7.53% and 5.72%, and the area under the precision-recall curve (AUPRC) by 5.95% and 4.19%, compared to the RWR and MHRW algorithms, respectively, in DTI prediction. Moreover, after excluding the interference of homologous proteins (popular drugs or targets may lead to inflated prediction results), the ISLRWR algorithm still showed a significant performance improvement.
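Editorial illustration: the abstract above builds on random walk with restart (RWR). The sketch below shows only the plain RWR baseline on a toy graph, not the authors' MHRW, IMRWR, or ISLRWR variants. The function names, the restart probability of 0.3, and the choice to give isolated nodes a self-loop so every row of the transition matrix sums to 1 are assumptions made for this example.

```python
def transition_matrix(adj):
    """Row-normalize an adjacency matrix into transition probabilities.

    Assumption for this sketch: a node with no edges keeps its probability
    mass through a self-loop, so every row still sums to 1.
    """
    n = len(adj)
    matrix = []
    for i, row in enumerate(adj):
        total = sum(row)
        if total == 0:
            matrix.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            matrix.append([w / total for w in row])
    return matrix


def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-12, max_iter=10000):
    """Iterate p <- restart * e_seed + (1 - restart) * P^T p until convergence."""
    n = len(adj)
    P = transition_matrix(adj)
    e = [1.0 if i == seed else 0.0 for i in range(n)]
    p = e[:]
    for _ in range(max_iter):
        nxt = [
            restart * e[j] + (1 - restart) * sum(p[i] * P[i][j] for i in range(n))
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy graph: a path 0 - 1 - 2 plus an isolated node 3 (hypothetical data).
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
scores = random_walk_with_restart(toy_adj, seed=0)
```

In a DTI setting, the converged score vector for each seed node is typically used as a feature describing that node's network neighbourhood; here the scores simply decay with distance from the seed, and the isolated node receives no probability mass.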
Affiliation(s)
- Lu Sun
  - School of Mathematics, Physics and Statistics, Institute for Frontier Medical Technology, Center of Intelligent Computing and Applied Statistics, Shanghai University of Engineering Science, Shanghai, China
- Zhixiang Yin
  - School of Mathematics, Physics and Statistics, Institute for Frontier Medical Technology, Center of Intelligent Computing and Applied Statistics, Shanghai University of Engineering Science, Shanghai, China
- Lin Lu
  - Shanghai Xinhao Information Technology Co., Ltd., Shanghai, China
4
Wang J, He R, Wang X, Li H, Lu Y. MCF-DTI: Multi-Scale Convolutional Local-Global Feature Fusion for Drug-Target Interaction Prediction. Molecules 2025; 30:274. [PMID: 39860144 PMCID: PMC11767603 DOI: 10.3390/molecules30020274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2024] [Revised: 12/21/2024] [Accepted: 01/10/2025] [Indexed: 01/27/2025] Open
Abstract
Predicting drug-target interactions (DTIs) is a crucial step in the development of new drugs and drug repurposing. In this paper, we propose a novel drug-target prediction model called MCF-DTI. The model utilizes the SMILES representation of drugs and the sequence features of targets, employing a multi-scale convolutional neural network (MSCNN) with parallel shared-weight modules to extract features from the drug side. For the target side, it combines MSCNN with Transformer modules to capture both local and global features effectively. The extracted features are then weighted and fused, enabling comprehensive feature representation to enhance the predictive power of the model. Experimental results on the Davis dataset demonstrate that MCF-DTI achieves an AUC of 0.9746 and an AUPR of 0.9542, outperforming other state-of-the-art models. Our case study demonstrates that our model effectively validated several known drug-target relationships in lung cancer and predicted the therapeutic potential of certain preclinical compounds in treating lung cancer. These findings contribute valuable insights for subsequent drug repurposing efforts and novel drug development.
Affiliation(s)
- Jihong Wang
  - School of Computer, Guangdong University of Education, Guangzhou 510310, China
- Ruijia He
  - School of Computer, Guangdong University of Education, Guangzhou 510310, China
- Xiaodan Wang
  - School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Zhongshan 528458, China
- Hongjian Li
  - School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Zhongshan 528458, China
- Yulei Lu
  - School of Computer, Guangdong University of Education, Guangzhou 510310, China
5
Bhatia T, Sharma S. Drug Repurposing: Insights into Current Advances and Future Applications. Curr Med Chem 2025; 32:468-510. [PMID: 37946344 DOI: 10.2174/0109298673266470231023110841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 09/04/2023] [Accepted: 09/11/2023] [Indexed: 11/12/2023]
Abstract
Drug development is a complex and expensive process that involves extensive research and testing before a new drug can be approved for use. This has led to a limited availability of potential therapeutics for many diseases. Despite significant advances in biomedical science, the process of drug development remains a bottleneck, as all hypotheses must be tested through experiments and observations, which can be time-consuming and costly. To address this challenge, drug repurposing has emerged as an innovative strategy for finding new uses for existing medications that go beyond their original intended use. This approach has the potential to speed up the drug development process and reduce costs, making it an attractive option for pharmaceutical companies and researchers alike. It involves the identification of existing drugs or compounds that have the potential to be used for the treatment of a different disease or condition. This can be done through a variety of approaches, including screening existing drugs against new disease targets, investigating the biological mechanisms of existing drugs, and analyzing data from clinical trials and electronic health records. Additionally, repurposing drugs can lead to the identification of new therapeutic targets and mechanisms of action, which can enhance our understanding of disease biology and lead to the development of more effective treatments. Overall, drug repurposing is an exciting and promising area of research that has the potential to revolutionize the drug development process and improve the lives of millions of people around the world. The present review provides insights into the types of interaction, approaches, available databases, applications, and limitations of drug repurposing.
Affiliation(s)
- Trisha Bhatia
  - School of Pharmacy, National Forensic Sciences University, Gandhinagar, Gujarat, 382007, India
- Shweta Sharma
  - School of Pharmacy, National Forensic Sciences University, Gandhinagar, Gujarat, 382007, India
6
Ahmed F, Samantasinghar A, Ali W, Choi KH. Network-based drug repurposing identifies small molecule drugs as immune checkpoint inhibitors for endometrial cancer. Mol Divers 2024; 28:3879-3895. [PMID: 38227161 DOI: 10.1007/s11030-023-10784-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 05/31/2023] [Accepted: 11/25/2023] [Indexed: 01/17/2024]
Abstract
Endometrial cancer (EC) is the 6th most common cancer in women around the world. In the United States (US) alone, 66,200 new cases and 13,030 deaths are expected to occur in 2023, which calls for the rapid development of potential therapies against EC. Here, a network-based drug-repurposing strategy is developed, which led to the identification of 16 FDA-approved drugs potentially repurposable for EC as immune checkpoint inhibitors (ICIs). A network of EC-associated immune checkpoint protein (ICP)-induced protein interactions (P-ICP) was constructed. Network analysis of P-ICP shortlisted the top key target genes closely interacting with ICPs, followed by network proximity analysis in the drug-target interaction (DTI) network and pathway cross-examination, which identified 115 distinct pathways of approved drugs as potential immune checkpoint inhibitors. The presented approach predicted 16 drugs to target EC-associated ICP-induced pathways; three have already been used for EC and six possess immunomodulatory properties, providing evidence of the validity of the strategy. Classification of the predicted pathways indicated that 15 drugs can be divided into two distinct pathway groups, containing 17 immune pathways and 98 metabolic pathways. In addition, drug-drug correlation analysis provided insight into finding useful drug combinations. This fair and verified analysis creates new opportunities for the quick repurposing of FDA-approved medications in clinical trials.
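Editorial illustration: the abstract above mentions network proximity analysis but does not say which proximity measure was used. The sketch below assumes one commonly used option, the "closest" measure: the average, over a drug's targets, of the shortest-path distance to the nearest disease-associated gene. The graph, node labels, and function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Unweighted shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closest_proximity(graph, drug_targets, disease_genes):
    """Mean distance from each drug target to its nearest disease gene.

    Targets that cannot reach any disease gene are skipped; if none can,
    the proximity is reported as infinity.
    """
    nearest = []
    for target in drug_targets:
        dist = bfs_distances(graph, target)
        reachable = [dist[g] for g in disease_genes if g in dist]
        if reachable:
            nearest.append(min(reachable))
    if not nearest:
        return float("inf")
    return sum(nearest) / len(nearest)


# Toy undirected interaction network: a path A - B - C - D - E.
toy_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
proximity = closest_proximity(toy_network, ["A", "C"], ["D", "E"])
```

In this toy case, target A is three steps from the nearest disease gene and target C is one step away, so the proximity is 2.0. Published workflows usually compare such a value against randomized gene sets to obtain a z-score, which is omitted here.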
Affiliation(s)
- Faheem Ahmed
  - Department of Mechatronics Engineering, Jeju National University, Jeju, Republic of Korea
- Anupama Samantasinghar
  - Department of Mechatronics Engineering, Jeju National University, Jeju, Republic of Korea
- Wajid Ali
  - Department of Mechatronics Engineering, Jeju National University, Jeju, Republic of Korea
- Kyung Hyun Choi
  - Department of Mechatronics Engineering, Jeju National University, Jeju, Republic of Korea
7
Ma L, Yan Y, Dai S, Shao D, Yi S, Wang J, Li J, Yan J. Research on prediction of human oral bioavailability of drugs based on improved deep forest. J Mol Graph Model 2024; 133:108851. [PMID: 39232489 DOI: 10.1016/j.jmgm.2024.108851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 08/22/2024] [Accepted: 08/26/2024] [Indexed: 09/06/2024]
Abstract
Human oral bioavailability is a crucial factor in drug discovery. In recent years, researchers have constructed a variety of prediction models. However, given the limited size of human oral bioavailability data sets, making accurate predictions with small sample sizes has become a critical issue in the field. The deep forest model, with its adaptively determined number of cascade levels, can perform exceptionally well even on small-scale data. However, the original deep forest suffers from an unbalanced multi-grained scanning process and premature stopping of cascade forest training. In this paper, we propose a human oral bioavailability prediction method based on an improved deep forest, called balanced multi-grained scanning mapping cascade forest (bgmc-forest). First, the Mordred descriptor method is used for feature extraction; enhanced features are then obtained by the improved balanced multi-grained scanning, which solves the problem of missing features at both ends. Finally, the prediction results are obtained by feature-mapping cascade forests, which are based on principal component analysis and cascade forests and ensure the effectiveness of the cascade forest. The superiority of the model is demonstrated through comparative experiments, while the effectiveness of the improved modules is verified through ablation experiments. Finally, the decision-making process of the model is explained with the SHapley Additive exPlanations (SHAP) interpretation algorithm.
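Editorial illustration: the "missing features at both ends" problem mentioned above can be seen by counting how often each feature position falls inside a sliding window during standard multi-grained scanning. The sketch below shows only that imbalance; it does not reproduce the balanced scanning of bgmc-forest, and the function names and toy sizes are assumptions for this example.

```python
def sliding_windows(features, window, stride=1):
    """Return every contiguous window of the feature vector."""
    return [
        features[start:start + window]
        for start in range(0, len(features) - window + 1, stride)
    ]


def coverage_counts(n_features, window, stride=1):
    """Count how many windows contain each feature position."""
    counts = [0] * n_features
    for start in range(0, n_features - window + 1, stride):
        for position in range(start, start + window):
            counts[position] += 1
    return counts


# Hypothetical descriptor vector with six features, scanned with window 3.
descriptor = [0.1, 0.4, 0.2, 0.9, 0.5, 0.3]
windows = sliding_windows(descriptor, window=3)
coverage = coverage_counts(len(descriptor), window=3)
```

With six features and a window of three, the two middle features each appear in three windows while the first and last appear in only one, so features near the ends contribute less to the scanned representation.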
Affiliation(s)
- Lei Ma
  - Kunming University of Science and Technology, Kunming, CN 650500, China
- Yukun Yan
  - Kunming University of Science and Technology, Kunming, CN 650500, China
- Shaoxing Dai
  - Kunming University of Science and Technology, Kunming, CN 650500, China
- Dangguo Shao
  - Kunming University of Science and Technology, Kunming, CN 650500, China
- Sanli Yi
  - Kunming University of Science and Technology, Kunming, CN 650500, China
- Jiawei Wang
  - Kunming University of Science and Technology, Kunming, CN 650500, China
- Jingtao Li
  - Kunming University of Science and Technology, Kunming, CN 650500, China
- Jiangkai Yan
  - Kunming University of Science and Technology, Kunming, CN 650500, China
8
Zhang M, Hong Y, Shen L, Xu S, Xu Y, Zhang X, Liu J, Liu X. A heterogeneous graph neural network with automatic discovery of effective metapaths for drug–target interaction prediction. Future Generation Computer Systems 2024; 160:283-294. [DOI: 10.1016/j.future.2024.05.054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
9
Ye W, Li C, Zhang W, Li J, Liu L, Cheng D, Feng Z. Predicting drug-target interactions by measuring confidence with consistent causal neighborhood interventions. Methods 2024; 231:15-25. [PMID: 39218170 DOI: 10.1016/j.ymeth.2024.08.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 08/12/2024] [Accepted: 08/27/2024] [Indexed: 09/04/2024] Open
Abstract
Predicting drug-target interactions (DTI) is a crucial stage in drug discovery and development. Understanding the interaction between drugs and targets is essential for pinpointing the specific relationship between drug molecules and targets, akin to solving a link prediction problem using information technology. While knowledge graph (KG) and knowledge graph embedding (KGE) methods have advanced rapidly and demonstrated impressive performance in drug discovery, they often lack authenticity and accuracy in identifying DTI. This leads to increased misjudgment rates and reduced efficiency in drug development. To address these challenges, we focus on refining the accuracy of DTI prediction models through KGE, with a specific emphasis on causal intervention confidence measures (CI). These measures aim to assess triplet scores, enhancing the precision of the predictions. Comparative experiments conducted on three datasets with 9 KGE models reveal that our proposed confidence measure approach via causal intervention significantly improves the accuracy of DTI link prediction compared to traditional approaches. Furthermore, our experimental analysis delves deeper into the embedding of intervention values, offering valuable insights for guiding the design of subsequent drug development experiments. As a result, our predicted outcomes serve as valuable guidance in the pursuit of more efficient drug development processes.
Affiliation(s)
- Wenting Ye
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Chen Li
  - Graduate School of Informatic, Nagoya University, Chikusa, Nagoya, 464-8602, Japan
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agric, Wuhan 430070, China; Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agric, Wuhan 430070, China
- Jiuyong Li
  - UniSA STEM, University of South Australia, Adelaide, 5095, Australia
- Lin Liu
  - UniSA STEM, University of South Australia, Adelaide, 5095, Australia
- Debo Cheng
  - UniSA STEM, University of South Australia, Adelaide, 5095, Australia
- Zaiwen Feng
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agric, Wuhan 430070, China; Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agric, Wuhan 430070, China
10
Mswahili ME, Jeong YS. Transformer-based models for chemical SMILES representation: A comprehensive literature review. Heliyon 2024; 10:e39038. [PMID: 39640612 PMCID: PMC11620068 DOI: 10.1016/j.heliyon.2024.e39038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Revised: 09/26/2024] [Accepted: 10/05/2024] [Indexed: 12/07/2024] Open
Abstract
Pre-trained chemical language models (CLMs) have attracted increasing attention within cheminformatics and bioinformatics, inspired by the remarkable success of language models in natural language processing (NLP) tasks such as speech recognition, text analysis, and translation. Furthermore, the vast amount of unlabeled data associated with chemical compounds or molecules has emerged as a crucial research focus, prompting the need for CLMs with reasoning capabilities over such data. Molecular graphs and molecular descriptors are the predominant approaches to representing molecules for property prediction in machine learning (ML). However, Transformer-based LMs have recently emerged as de facto powerful tools in deep learning (DL), showcasing outstanding performance across various NLP downstream tasks, particularly in text analysis. Pre-trained Transformer-based LMs such as BERT and GPT (and their variants) have been extensively explored in the chemical informatics domain. Learning tasks in cheminformatics that require handling chemical SMILES data, which encode intricate relations among elements or atoms, have become increasingly prevalent. Whether the objective is predicting molecular reactions or molecular properties, there is a growing demand for LMs capable of learning molecular contextual information from SMILES sequences given as text input. This review provides an overview of the current state of the art in Transformer-based chemical LMs for de novo design, and analyzes current limitations, challenges, and advantages. Finally, a perspective on future opportunities in this evolving field is provided.
Affiliation(s)
- Medard Edmund Mswahili
  - Chungbuk National University, Department of Computer Engineering, Cheongju, 28644, South Korea
- Young-Seob Jeong
  - Chungbuk National University, Department of Computer Engineering, Cheongju, 28644, South Korea
11
Hong Q, Lin L, Li Z, Li Q, Yao J, Wu Q, Liu K, Tian J. A Distance Transformation Deep Forest Framework With Hybrid-Feature Fusion for CXR Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14633-14644. [PMID: 37285251 DOI: 10.1109/tnnls.2023.3280646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Detecting pneumonia, especially coronavirus disease 2019 (COVID-19), from chest X-ray (CXR) images is one of the most effective ways for disease diagnosis and patient triage. The application of deep neural networks (DNNs) for CXR image classification is limited due to the small sample size of the well-curated data. To tackle this problem, this article proposes a distance transformation-based deep forest framework with hybrid-feature fusion (DTDF-HFF) for accurate CXR image classification. In our proposed method, hybrid features of CXR images are extracted in two ways: hand-crafted feature extraction and multigrained scanning. Different types of features are fed into different classifiers in the same layer of the deep forest (DF), and the prediction vector obtained at each layer is transformed to form a distance vector based on a self-adaptive scheme. The distance vectors obtained by different classifiers are fused and concatenated with the original features, then input into the corresponding classifier at the next layer. The cascade grows until DTDF-HFF can no longer gain benefits from the new layer. We compare the proposed method with other methods on the public CXR datasets, and the experimental results show that the proposed method can achieve state-of-the-art (SOTA) performance. The code will be made publicly available at https://github.com/hongqq/DTDF-HFF.
12
Qian Y, Li X, Wu J, Zhang Q. MMCL-CPI: A multi-modal compound-protein interaction prediction model incorporating contrastive learning pre-training. Comput Biol Chem 2024; 112:108137. [PMID: 39079285 DOI: 10.1016/j.compbiolchem.2024.108137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 05/31/2024] [Accepted: 06/20/2024] [Indexed: 09/13/2024]
Abstract
MOTIVATION: Compound-protein interaction (CPI) prediction plays a crucial role in drug discovery and drug repositioning. Early researchers relied on time-consuming and labor-intensive wet laboratory experiments. However, the advent of deep learning has significantly accelerated this progress. Most existing deep learning methods utilize deep neural networks to extract compound features from sequences and graphs, either separately or in combination. Our team's previous research has demonstrated that compound images contain valuable information that can be leveraged for the CPI task. However, there is a scarcity of multimodal methods that effectively combine sequence and image representations of compounds in CPI. Currently, the use of text-image pairs for contrastive language-image pre-training is a popular approach in the multimodal field. Further research is needed to explore how the integration of sequence and image representations can enhance the accuracy of the CPI task. RESULTS: This paper presents a novel method called MMCL-CPI, which has two key highlights. 1) We propose extracting compound features from two modalities: one-dimensional SMILES and two-dimensional images. This approach enables us to capture both sequence and spatial features, enhancing the prediction accuracy for CPI. Based on this, we design a novel multimodal model. 2) We introduce a multimodal pre-training strategy that leverages contrastive learning on a large-scale unlabeled dataset to establish the correspondence between a SMILES string and the compound's image. This pre-training approach significantly improves compound feature representations for the downstream CPI task. Our method has shown competitive results on multiple datasets.
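Editorial illustration: the abstract above describes pre-training that aligns each SMILES string with its compound image, in the spirit of contrastive language-image pre-training. The abstract does not give the exact objective, so the sketch below assumes a symmetric InfoNCE-style loss over a batch of paired embeddings. The function names, the temperature of 0.07, and the toy two-dimensional embeddings are hypothetical, and no encoder networks are modelled.

```python
import math


def normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosines."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(smiles_emb, image_emb, temperature=0.07):
    """Average of SMILES-to-image and image-to-SMILES contrastive losses."""
    s = [normalize(v) for v in smiles_emb]
    g = [normalize(v) for v in image_emb]
    logits = [
        [sum(a * b for a, b in zip(si, gj)) / temperature for gj in g]
        for si in s
    ]
    transposed = [list(column) for column in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


# Toy batch of two compounds: matched pairs point in the same direction.
smiles_batch = [[1.0, 0.0], [0.0, 1.0]]
matched_images = [[2.0, 0.0], [0.0, 3.0]]
mismatched_images = [[0.0, 3.0], [2.0, 0.0]]
loss_matched = symmetric_contrastive_loss(smiles_batch, matched_images)
loss_mismatched = symmetric_contrastive_loss(smiles_batch, mismatched_images)
```

The loss is near zero when each SMILES embedding is closest to its own image embedding and large when the pairs are swapped, which is the signal that pulls the two modalities into a shared representation during pre-training.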
Affiliation(s)
- Ying Qian
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
- Xinyi Li
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
- Jian Wu
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
- Qian Zhang
  - School of Computer Science and Technology, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, East China Normal University, Shanghai, China
13
Sun D, Macedonia C, Chen Z, Chandrasekaran S, Najarian K, Zhou S, Cernak T, Ellingrod VL, Jagadish HV, Marini B, Pai M, Violi A, Rech JC, Wang S, Li Y, Athey B, Omenn GS. Can Machine Learning Overcome the 95% Failure Rate and Reality that Only 30% of Approved Cancer Drugs Meaningfully Extend Patient Survival? J Med Chem 2024; 67:16035-16055. [PMID: 39253942 DOI: 10.1021/acs.jmedchem.4c01684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Despite implementing hundreds of strategies, cancer drug development suffers from a 95% failure rate over 30 years, with only 30% of approved cancer drugs extending patient survival beyond 2.5 months. Adding more criteria without eliminating nonessential ones is impractical and may fall into the "survivorship bias" trap. Machine learning (ML) models may enhance efficiency by saving time and cost. Yet, they may not improve success rate without identifying the root causes of failure. We propose a "STAR-guided ML system" (structure-tissue/cell selectivity-activity relationship) to enhance success rate and efficiency by addressing three overlooked interdependent factors: potency/specificity to the on/off-targets determining efficacy in tumors at clinical doses, on/off-target-driven tissue/cell selectivity influencing adverse effects in the normal organs at clinical doses, and optimal clinical doses balancing efficacy/safety as determined by potency/specificity and tissue/cell selectivity. STAR-guided ML models can directly predict clinical dose/efficacy/safety from five features to design/select the best drugs, enhancing success and efficiency of cancer drug development.
Affiliation(s)
- Zhigang Chen
  - LabBotics.ai, Palo Alto, California 94303, United States
- Simon Zhou
  - Aurinia Pharmaceuticals Inc., Rockville, Maryland 20850, United States
- Yan Li
  - Translational Medicine and Clinical Pharmacology, Bristol Myers Squibb, Summit, New Jersey 07901, United States
14
Manen-Freixa L, Antolin AA. Polypharmacology prediction: the long road toward comprehensively anticipating small-molecule selectivity to de-risk drug discovery. Expert Opin Drug Discov 2024; 19:1043-1069. [PMID: 39004919 DOI: 10.1080/17460441.2024.2376643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024]
Abstract
INTRODUCTION: Small molecules often bind to multiple targets, a behavior termed polypharmacology. Anticipating polypharmacology is essential for drug discovery since unknown off-targets can modulate safety and efficacy, profoundly affecting drug discovery success. Unfortunately, experimental methods to assess selectivity present significant limitations, and drugs still fail in the clinic due to unanticipated off-targets. Computational methods are a cost-effective, complementary approach to predict polypharmacology. AREAS COVERED: This review aims to provide a comprehensive overview of the state of polypharmacology prediction and discuss its strengths and limitations, covering both classical cheminformatics methods and bioinformatic approaches. The authors review available data sources, paying close attention to their different coverage. The authors then discuss major algorithms grouped by the types of data that they exploit, using selected examples. EXPERT OPINION: Polypharmacology prediction has made impressive progress over the last decades and contributed to identifying many off-targets. However, data incompleteness currently prevents most approaches from comprehensively predicting selectivity. Moreover, limited agreement on model assessment makes it difficult to identify the best algorithms, which at present show modest performance in prospective real-world applications. Despite these limitations, the exponential increase of multidisciplinary Big Data and AI holds much potential to improve polypharmacology prediction and de-risk drug discovery.
Affiliation(s)
- Leticia Manen-Freixa
  - Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Albert A Antolin
  - Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
  - Center for Cancer Drug Discovery, The Division of Cancer Therapeutics, The Institute of Cancer Research, London, UK
|
15
|
Wu H, Liu J, Zhang R, Lu Y, Cui G, Cui Z, Ding Y. A review of deep learning methods for ligand based drug virtual screening. FUNDAMENTAL RESEARCH 2024; 4:715-737. [PMID: 39156568 PMCID: PMC11330120 DOI: 10.1016/j.fmre.2024.02.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/10/2024] [Accepted: 02/18/2024] [Indexed: 08/20/2024] Open
Abstract
Drug discovery is costly and time consuming, and modern drug discovery endeavors are progressively reliant on computational methodologies, aiming to mitigate temporal and financial expenditures associated with the process. In particular, the time required for vaccine and drug discovery is prolonged during emergency situations such as the coronavirus 2019 pandemic. Recently, the performance of deep learning methods in drug virtual screening has been particularly prominent. It has become a concern for researchers how to summarize the existing deep learning in drug virtual screening, select different models for different drug screening problems, exploit the advantages of deep learning models, and further improve the capability of deep learning in drug virtual screening. This review first introduces the basic concepts of drug virtual screening, common datasets, and data representation methods. Then, large numbers of common deep learning methods for drug virtual screening are compared and analyzed. In addition, a dataset of different sizes is constructed independently to evaluate the performance of each deep learning model for the difficult problem of large-scale ligand virtual screening. Finally, the existing challenges and future directions in the field of virtual screening are presented.
Affiliation(s)
- Hongjie Wu
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Junkai Liu
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Runhua Zhang
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Yaoyao Lu
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Guozeng Cui
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Zhiming Cui
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Yijie Ding
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
|
16
|
Li W, Ma W, Yang M, Tang X. Drug repurposing based on the DTD-GNN graph neural network: revealing the relationships among drugs, targets and diseases. BMC Genomics 2024; 25:584. [PMID: 38862928 PMCID: PMC11165810 DOI: 10.1186/s12864-024-10499-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 06/05/2024] [Indexed: 06/13/2024] Open
Abstract
MOTIVATION The rational modelling of the relationship among drugs, targets and diseases is crucial for drug retargeting. While significant progress has been made in studying binary relationships, further research is needed to deepen our understanding of ternary relationships. The application of graph neural networks in drug retargeting is increasing, but further research is needed to determine the appropriate modelling method for ternary relationships and how to capture their complex multi-feature structure. RESULTS The aim of this study was to construct relationships among drug, targets and diseases. To represent the complex relationships among these entities, we used a heterogeneous graph structure. Additionally, we propose a DTD-GNN model that combines graph convolutional networks and graph attention networks to learn feature representations and association information, facilitating a more thorough exploration of the relationships. The experimental results demonstrate that the DTD-GNN model outperforms other graph neural network models in terms of AUC, Precision, and F1-score. The study has important implications for gaining a comprehensive understanding of the relationships between drugs and diseases, as well as for further research and application in exploring the mechanisms of drug-disease interactions. The study reveals these relationships, providing possibilities for innovative therapeutic strategies in medicine.
Affiliation(s)
- Wenjun Li
  - Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha, Hunan, China
- Wanjun Ma
  - Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha, Hunan, China
- Mengyun Yang
  - School of Intelligent Manufacturing, Hunan First Normal University, Changsha, 410205, Hunan, China
- Xiwei Tang
  - School of Intelligent Manufacturing, Hunan First Normal University, Changsha, 410205, Hunan, China
|
17
|
Zhang Y, Li J, Lin S, Zhao J, Xiong Y, Wei DQ. An end-to-end method for predicting compound-protein interactions based on simplified homogeneous graph convolutional network and pre-trained language model. J Cheminform 2024; 16:67. [PMID: 38849874 PMCID: PMC11162000 DOI: 10.1186/s13321-024-00862-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 05/19/2024] [Indexed: 06/09/2024] Open
Abstract
Identification of interactions between chemical compounds and proteins is crucial for various applications, including drug discovery, target identification, network pharmacology, and elucidation of protein functions. Deep neural network-based approaches are becoming increasingly popular in efficiently identifying compound-protein interactions with high-throughput capabilities, narrowing down the scope of candidates for traditional labor-intensive, time-consuming and expensive experimental techniques. In this study, we proposed an end-to-end approach termed SPVec-SGCN-CPI, which utilized simplified graph convolutional network (SGCN) model with low-dimensional and continuous features generated from our previously developed model SPVec and graph topology information to predict compound-protein interactions. The SGCN technique, dividing the local neighborhood aggregation and nonlinearity layer-wise propagation steps, effectively aggregates K-order neighbor information while avoiding neighbor explosion and expediting training. The performance of the SPVec-SGCN-CPI method was assessed across three datasets and compared against four machine learning- and deep learning-based methods, as well as six state-of-the-art methods. Experimental results revealed that SPVec-SGCN-CPI outperformed all these competing methods, particularly excelling in unbalanced data scenarios. By propagating node features and topological information to the feature space, SPVec-SGCN-CPI effectively incorporates interactions between compounds and proteins, enabling the fusion of heterogeneity. Furthermore, our method scored all unlabeled data in ChEMBL, confirming the top five ranked compound-protein interactions through molecular docking and existing evidence. These findings suggest that our model can reliably uncover compound-protein interactions within unlabeled compound-protein pairs, carrying substantial implications for drug re-profiling and discovery. 
In summary, SPVec-SGCN demonstrates its efficacy in accurately predicting compound-protein interactions, showcasing potential to enhance target identification and streamline drug discovery processes. Scientific contributions: The methodology presented in this work not only enables the comparatively accurate prediction of compound-protein interactions but also, for the first time, takes sample imbalance (which is very common in the real world) and computational efficiency into consideration simultaneously, accelerating the target identification and drug discovery process.
Affiliation(s)
- Yufang Zhang
  - School of Mathematical Sciences and SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China
  - Zhongjing Research and Industrialization, Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, 473006, Henan, China
- Jiayi Li
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
- Shenggeng Lin
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
- Jianwei Zhao
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
  - Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Dong-Qing Wei
  - Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China
  - Zhongjing Research and Industrialization, Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, 473006, Henan, China
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
|
18
|
Wang W, Yu M, Sun B, Li J, Liu D, Zhang H, Wang X, Zhou Y. SMGCN: Multiple Similarity and Multiple Kernel Fusion Based Graph Convolutional Neural Network for Drug-Target Interactions Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:143-154. [PMID: 38051618 DOI: 10.1109/tcbb.2023.3339645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Accurately identifying potential drug-target interactions (DTIs) is a critical step in accelerating drug discovery. Despite many studies that have been conducted over the past decades, detecting DTIs remains a highly challenging and complicated process. Therefore, we propose a novel method called SMGCN, which combines multiple similarity and multiple kernel fusion based on Graph Convolutional Network (GCN) to predict DTIs. In order to capture the features of the network structure and fully explore direct or indirect relationships between nodes, we propose the method of multiple similarity, which combines similarity fusion matrices with Random Walk with Restart (RWR) and cosine similarity. Then, we use GCN to extract multi-layer low-dimensional embedding features. Unlike traditional GCN methods, we incorporate Multiple Kernel Learning (MKL). Finally, we use the Dual Laplace Regularized Least Squares method to predict novel DTIs through combinatorial kernels in drug and target spaces. We conduct experiments on a golden standard dataset, and demonstrate the effectiveness of our proposed model in predicting DTIs through showing significant improvements in Area Under the Curve (AUC) and Area Under the Precision-Recall Curve (AUPR). In addition, our model can also discover some new DTIs, which can be verified by the KEGG BRITE Database and relevant literature.
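As a rough illustration of two building blocks named in this abstract, the sketch below runs a Random Walk with Restart (RWR) from every node of a small network and then compares the resulting visiting-probability profiles with cosine similarity. It is a minimal sketch under stated assumptions, not the SMGCN pipeline: the function names (`column_normalize`, `rwr`, `cosine`, `rwr_similarity`), the restart probability of 0.5, and the choice to apply cosine similarity to RWR profiles are all illustrative and do not come from the paper.

```python
import math


def column_normalize(adj):
    """Scale each column of a square matrix to sum to 1 (all-zero columns stay zero)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def rwr(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walk that restarts at node `seed`."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        # One step: follow an edge with prob. (1 - restart), jump back to the seed otherwise.
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rwr_similarity(adj):
    """Pairwise cosine similarity between the RWR profiles of all nodes."""
    profiles = [rwr(adj, s) for s in range(len(adj))]
    return [[cosine(a, b) for b in profiles] for a in profiles]


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2 (hypothetical data).
    toy = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(rwr(toy, seed=0))
    print(rwr_similarity(toy))
```

Because RWR profiles reflect indirect as well as direct neighbors, two nodes with no shared edge can still receive a nonzero similarity, which is the kind of indirect relationship the abstract refers to.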
|
19
|
Gao Z, Ding P, Xu R. IUPHAR review - Data-driven computational drug repurposing approaches for opioid use disorder. Pharmacol Res 2024; 199:106960. [PMID: 37832859 DOI: 10.1016/j.phrs.2023.106960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/08/2023] [Accepted: 10/10/2023] [Indexed: 10/15/2023]
Abstract
Opioid Use Disorder (OUD) is a chronic and relapsing condition characterized by the misuse of opioid drugs, causing significant morbidity and mortality in the United States. Existing medications for OUD are limited, and there is an immediate need to discover treatments with enhanced safety and efficacy. Drug repurposing aims to find new indications for existing medications, offering a time-saving and cost-efficient alternative strategy to traditional drug discovery. Computational approaches have been developed to further facilitate the drug repurposing process. In this paper, we reviewed state-of-the-art data-driven computational drug repurposing approaches for OUD and discussed their advantages and potential challenges. We also highlighted promising repurposed candidate drugs for OUD that were identified by computational drug repurposing techniques and reviewed studies supporting their potential mechanisms of action in treating OUD.
Affiliation(s)
- Zhenxiang Gao
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Pingjian Ding
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Rong Xu
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
|
20
|
Djeddi WE, Hermi K, Ben Yahia S, Diallo G. Advancing drug-target interaction prediction: a comprehensive graph-based approach integrating knowledge graph embedding and ProtBert pretraining. BMC Bioinformatics 2023; 24:488. [PMID: 38114937 PMCID: PMC10731821 DOI: 10.1186/s12859-023-05593-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 11/30/2023] [Indexed: 12/21/2023] Open
Abstract
BACKGROUND The pharmaceutical field faces a significant challenge in validating drug target interactions (DTIs) due to the time and cost involved, leading to only a fraction being experimentally verified. To expedite drug discovery, accurate computational methods are essential for predicting potential interactions. Recently, machine learning techniques, particularly graph-based methods, have gained prominence. These methods utilize networks of drugs and targets, employing knowledge graph embedding (KGE) to represent structured information from knowledge graphs in a continuous vector space. This phenomenon highlights the growing inclination to utilize graph topologies as a means to improve the precision of predicting DTIs, hence addressing the pressing requirement for effective computational methodologies in the field of drug discovery. RESULTS The present study presents a novel approach called DTIOG for the prediction of DTIs. The methodology employed in this study involves the utilization of a KGE strategy, together with the incorporation of contextual information obtained from protein sequences. More specifically, the study makes use of Protein Bidirectional Encoder Representations from Transformers (ProtBERT) for this purpose. DTIOG utilizes a two-step process to compute embedding vectors using KGE techniques. Additionally, it employs ProtBERT to determine target-target similarity. Different similarity measures, such as Cosine similarity or Euclidean distance, are utilized in the prediction procedure. In addition to the contextual embedding, the proposed unique approach incorporates local representations obtained from the Simplified Molecular Input Line Entry Specification (SMILES) of drugs and the amino acid sequences of protein targets. CONCLUSIONS The effectiveness of the proposed approach was assessed through extensive experimentation on datasets pertaining to Enzymes, Ion Channels, and G-protein-coupled Receptors. 
The remarkable efficacy of DTIOG was showcased through the utilization of diverse similarity measures in order to calculate the similarities between drugs and targets. The combination of these factors, along with the incorporation of various classifiers, enabled the model to outperform existing algorithms in its ability to predict DTIs. The consistent observation of this advantage across all datasets underlines the robustness and accuracy of DTIOG in the domain of DTIs. Additionally, our case study suggests that the DTIOG can serve as a valuable tool for discovering new DTIs.
Affiliation(s)
- Warith Eddine Djeddi
  - LR11ES14, Faculty of Sciences of Tunis, University of Tunis El Manar, Campus Universitaire, 2092, Tunis, Tunisia
  - High Institute of Informatics in Kef, University of Jendouba, Saleh Ayech, 8189, Jendouba, Tunisia
- Khalil Hermi
  - High Institute of Informatics in Kef, University of Jendouba, Saleh Ayech, 8189, Jendouba, Tunisia
- Sadok Ben Yahia
  - Department of Software Science, Tallinn University of Technology, Ehitajate tee-5, 12618, Tallinn, Estonia
  - The Maersk Mc-Kinney Moller Institute, Southern Syddansk Universitet, Alsion 2, 6400, Sønderborg, Denmark
- Gayo Diallo
  - Bordeaux Population Health Inserm 1219, University of Bordeaux, rue Léo Saignat, 33000, Bordeaux, France
|
21
|
Chen J, Zhang L, Cheng K, Jin B, Lu X, Che C. Predicting Drug-Target Interaction Via Self-Supervised Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2781-2789. [PMID: 35230952 DOI: 10.1109/tcbb.2022.3153963] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Recent advances in graph representation learning provide new opportunities for computational drug-target interaction (DTI) prediction. However, it still suffers from deficiencies of dependence on manual labels and vulnerability to attacks. Inspired by the success of self-supervised learning (SSL) algorithms, which can leverage input data itself as supervision, we propose SupDTI, an SSL-enhanced drug-target interaction prediction framework based on a heterogeneous network (i.e., drug-protein, drug-drug, and protein-protein interaction network; drug-disease, drug-side-effect, and protein-disease association network; drug-structure and protein-sequence similarity network). Specifically, SupDTI is an end-to-end learning framework consisting of five components. First, localized and globalized graph convolutions are designed to capture the nodes' information from both local and global perspectives, respectively. Then, we develop a variational autoencoder to constrain the nodes' representation to have desired statistical characteristics. Finally, a unified self-supervised learning strategy is leveraged to enhance the nodes' representation, namely, a contrastive learning module is employed to enable the nodes' representation to fit the graph-level representation, followed by a generative learning module which further maximizes the node-level agreement across the global and local views by learning the probabilistic connectivity distribution of the original heterogeneous network. Experimental results show that our model can achieve better prediction performance than state-of-the-art methods.
|
22
|
Wang Y, Gao YL, Wang J, Li F, Liu JX. MSGCA: Drug-Disease Associations Prediction Based on Multi-Similarities Graph Convolutional Autoencoder. IEEE J Biomed Health Inform 2023; 27:3686-3694. [PMID: 37163398 DOI: 10.1109/jbhi.2023.3272154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Identifying drug-disease associations (DDAs) is critical to the development of drugs. Traditional methods to determine DDAs are expensive and inefficient. Therefore, it is imperative to develop more accurate and effective methods for DDAs prediction. Most current DDAs prediction methods utilize original DDAs matrix directly. However, the original DDAs matrix is sparse, which greatly affects the prediction consequences. Hence, a prediction method based on multi-similarities graph convolutional autoencoder (MSGCA) is proposed for DDAs prediction. First, MSGCA integrates multiple drug similarities and disease similarities using centered kernel alignment-based multiple kernel learning (CKA-MKL) algorithm to form new drug similarity and disease similarity, respectively. Second, the new drug and disease similarities are improved by linear neighborhood, and the DDAs matrix is reconstructed by weighted K nearest neighbor profiles. Next, the reconstructed DDAs and the improved drug and disease similarities are integrated into a heterogeneous network. Finally, the graph convolutional autoencoder with attention mechanism is utilized to predict DDAs. Compared with extant methods, MSGCA shows superior results on three datasets. Furthermore, case studies further demonstrate the reliability of MSGCA.
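The reconstruction step mentioned in this abstract, densifying a sparse association matrix with weighted K nearest neighbor profiles, can be sketched as follows. This follows one common formulation of the idea (neighbors ranked by similarity, contributions damped by a decay factor, drug-side and disease-side estimates averaged, known entries never lowered); whether MSGCA uses exactly this variant is an assumption, and the names `knn_profile`, `wknn_reconstruct`, `k`, and `decay` are hypothetical.

```python
def knn_profile(assoc, sim, k, decay):
    """For each row entity, blend the association rows of its k most similar peers."""
    n_rows, n_cols = len(assoc), len(assoc[0])
    out = []
    for r in range(n_rows):
        # Rank the other entities by similarity to r and keep the top k.
        peers = sorted((j for j in range(n_rows) if j != r),
                       key=lambda j: sim[r][j], reverse=True)[:k]
        norm = sum(sim[r][j] for j in peers)
        row = [0.0] * n_cols
        if norm > 0:
            for rank, j in enumerate(peers):
                weight = (decay ** rank) * sim[r][j]  # closer peers count more
                for c in range(n_cols):
                    row[c] += weight * assoc[j][c]
            row = [v / norm for v in row]
        out.append(row)
    return out


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def wknn_reconstruct(assoc, drug_sim, disease_sim, k=2, decay=0.5):
    """Fill a sparse drug x disease matrix with averaged neighbor-based estimates."""
    by_drug = knn_profile(assoc, drug_sim, k, decay)
    by_disease = transpose(knn_profile(transpose(assoc), disease_sim, k, decay))
    # Known associations are kept; unknown entries take the averaged estimate.
    return [[max(assoc[i][j], 0.5 * (by_drug[i][j] + by_disease[i][j]))
             for j in range(len(assoc[0]))]
            for i in range(len(assoc))]


if __name__ == "__main__":
    # Hypothetical toy data: 3 drugs x 2 diseases.
    assoc = [[1, 0], [0, 0], [0, 1]]
    drug_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
    disease_sim = [[1.0, 0.5], [0.5, 1.0]]
    for row in wknn_reconstruct(assoc, drug_sim, disease_sim, k=1):
        print(row)
```

In the toy data the second drug has no known associations, yet it receives a nonzero score for the first disease because its most similar drug is associated with it; this is how the step counteracts the sparsity the abstract describes.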
|
23
|
Hu X, Yin Z, Zeng Z, Peng Y. Prediction of miRNA-Disease Associations by Cascade Forest Model Based on Stacked Autoencoder. Molecules 2023; 28:5013. [PMID: 37446675 DOI: 10.3390/molecules28135013] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/30/2023] [Revised: 06/23/2023] [Accepted: 06/24/2023] [Indexed: 07/15/2023] Open
Abstract
Numerous pieces of evidence have indicated that microRNA (miRNA) plays a crucial role in a series of significant biological processes and is closely related to complex disease. However, the traditional biological experimental methods used to verify disease-related miRNAs are inefficient and expensive. Thus, it is necessary to design some excellent approaches to improve efficiency. In this work, a novel method (CFSAEMDA) is proposed for the prediction of unknown miRNA-disease associations (MDAs). Specifically, we first capture the interactive features of miRNA and disease by integrating multi-source information. Then, the stacked autoencoder is applied for obtaining the underlying feature representation. Finally, the modified cascade forest model is employed to complete the final prediction. The experimental results present that the AUC value obtained by our method is 97.67%. The performance of CFSAEMDA is superior to several of the latest methods. In addition, case studies conducted on lung neoplasms, breast neoplasms and hepatocellular carcinoma further show that the CFSAEMDA method may be regarded as a utility approach to infer unknown disease-miRNA relationships.
Affiliation(s)
- Xiang Hu
  - Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Zhixiang Yin
  - Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Zhiliang Zeng
  - Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Yu Peng
  - Center of Intelligent Computing and Applied Statistics, School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
|
24
|
Li J, Wang Y, Li Z, Lin H, Wu B. LM-DTI: a tool of predicting drug-target interactions using the node2vec and network path score methods. Front Genet 2023; 14:1181592. [PMID: 37229202 PMCID: PMC10203599 DOI: 10.3389/fgene.2023.1181592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 04/13/2023] [Indexed: 05/27/2023] Open
Abstract
Introduction: Drug-target interaction (DTI) prediction is a key step in drug function discovery and repositioning. The emergence of large-scale heterogeneous biological networks provides an opportunity to identify drug-related target genes, which led to the development of several computational methods for DTI prediction. Methods: Considering the limitations of conventional computational methods, a novel tool named LM-DTI based on integrated information related to lncRNAs and miRNAs was proposed, which adopted the graph embedding (node2vec) and the network path score methods. First, LM-DTI innovatively constructed a heterogeneous information network containing eight networks composed of four types of nodes (drug, target, lncRNA, and miRNA). Next, the node2vec method was used to obtain feature vectors of drug as well as target nodes, and the path score vector of each drug-target pair was calculated using the DASPfind method. Finally, the feature vectors and path score vectors were merged and input into the XGBoost classifier to predict potential drug-target interactions. Results and Discussion: The 10-fold cross validations evaluate the classification accuracies of the LM-DTI. The prediction performance of LM-DTI in AUPR reached 0.96, which showed a significant improvement compared with those of conventional tools. The validity of LM-DTI has also been verified by manually searching literature and various databases. LM-DTI is scalable and computing efficient; thus representing a powerful drug relocation tool that can be accessed for free at http://www.lirmed.com:5038/lm_dti.
Affiliation(s)
- Jianwei Li
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
  - School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, China
- Yinfei Wang
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Zhiguang Li
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Hongxin Lin
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Baoqin Wu
  - School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
|
25
|
Zhang L, Ouyang C, Hu F, Liu Y, Gao Z. Relational Topology-based Heterogeneous Network Embedding for Predicting Drug-Target Interactions. DATA INTELLIGENCE 2023; 5:475-493. [DOI: 10.1162/dint_a_00149] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/03/2025] Open
Abstract
Predicting interactions between drugs and target proteins has become an essential task in the drug discovery process. Although validation via wet-lab experiments has become available, experimental methods for drug-target interaction (DTI) identification remain either time consuming or heavily dependent on domain expertise. Therefore, various computational models have been proposed to predict possible interactions between drugs and target proteins. However, most prediction methods do not consider the topological structure of the relationships. In this paper, we propose a relational topology-based heterogeneous network embedding method to predict drug-target interactions, abbreviated as RTHNE_DTI. We first construct a heterogeneous information network based on the interactions between different types of nodes, to enhance the ability of association discovery by fully considering the topology of the network. Then drug and target protein nodes can be represented by the other types of nodes. According to the different topological structures of the relationships between the nodes, we divide the relationships in the heterogeneous network into two categories and model them separately. In extensive experiments on real-world drug datasets, RTHNE_DTI is highly efficient and outperforms other state-of-the-art methods. RTHNE_DTI can further be used to predict interactions for drug-target pairs whose interaction status is unknown.
Affiliation(s)
- Linlin Zhang
  - School of Computer, University of South China, Hengyang, Hunan, 421001, China
- Chunping Ouyang
  - School of Computer, University of South China, Hengyang, Hunan, 421001, China
  - Hunan Medical Big Data International Sci.&Tech, Innovation Cooperation Base
- Fuyu Hu
  - School of Computer, University of South China, Hengyang, Hunan, 421001, China
- Yongbin Liu
  - School of Computer, University of South China, Hengyang, Hunan, 421001, China
  - Hunan Medical Big Data International Sci.&Tech, Innovation Cooperation Base
- Zheng Gao
  - Department of Information and Library Science, Indiana University Bloomington Woodlawn Avenue, IN 47408, Bloomington, America
|
26
|
Abbasi Mesrabadi H, Faez K, Pirgazi J. Drug-target interaction prediction based on protein features, using wrapper feature selection. Sci Rep 2023; 13:3594. [PMID: 36869062 PMCID: PMC9984486 DOI: 10.1038/s41598-023-30026-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/19/2022] [Accepted: 02/14/2023] [Indexed: 03/05/2023] Open
Abstract
Drug-target interaction prediction is a vital stage in drug development, and many methods address it. Experimental methods that identify these relationships on the basis of clinical remedies are time-consuming, costly, laborious, and complex, introducing many challenges. Computational methods are a newer alternative. New computational methods that are more accurate can be preferable to experimental methods in terms of total cost and time. In this paper, a new computational model to predict drug-target interaction (DTI), consisting of three phases (feature extraction, feature selection, and classification), is proposed. In the feature extraction phase, different features such as EAAC and PSSM are extracted from protein sequences, and fingerprint features are extracted from drugs. These extracted features are then combined. In the next step, because of the large amount of extracted data, a wrapper feature selection method named IWSSR is applied. The selected features are then given to a rotation forest classifier for more efficient prediction. The innovation of our work is that we extract different features and then select among them using IWSSR. The accuracy of the rotation forest classifier under tenfold cross-validation on the golden standard datasets (enzyme, ion channels, G-protein-coupled receptors, nuclear receptors) is 98.12, 98.07, 96.82, and 95.64, respectively. The experimental results indicate that the proposed model achieves an acceptable rate in DTI prediction and is comparable with the methods proposed in other papers.
Affiliation(s)
- Hengame Abbasi Mesrabadi
  - Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
- Karim Faez
  - Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
- Jamshid Pirgazi
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
|
27
|
Hou Z, Yang Y, Ma Z, Wong KC, Li X. Learning the protein language of proteome-wide protein-protein binding sites via explainable ensemble deep learning. Commun Biol 2023; 6:73. [PMID: 36653447 PMCID: PMC9849350 DOI: 10.1038/s42003-023-04462-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 06/20/2022] [Accepted: 01/11/2023] [Indexed: 01/20/2023]
Abstract
Protein-protein interactions (PPIs) govern cellular pathways and processes by significantly influencing the functional expression of proteins. Therefore, accurate identification of protein-protein interaction binding sites has become a key step in the functional analysis of proteins. However, since most computational methods are designed based on biological features, there are no available protein language models to directly encode amino acid sequences into distributed vector representations to model their characteristics for protein-protein binding events. Moreover, the number of experimentally detected protein interaction sites is much smaller than that of protein-protein interactions or protein sites in protein complexes, resulting in unbalanced data sets that leave room for improvement in their performance. To address these problems, we develop an ensemble deep learning model (EDLM)-based protein-protein interaction (PPI) site identification method (EDLMPPI). Evaluation results show that EDLMPPI outperforms state-of-the-art techniques, including several PPI site prediction models, on three widely used benchmark datasets (Dset_448, Dset_72, and Dset_164), exceeding those PPI site prediction models by nearly 10% in average precision. In addition, the biological and interpretable analyses provide new insights into protein binding site identification and characterization mechanisms from different perspectives. The EDLMPPI webserver is available at http://www.edlmppi.top:5002/.
Affiliation(s)
- Zilong Hou
  - School of Artificial Intelligence, Jilin University, Jilin, China
- Yuning Yang
  - Information Science and Technology, Northeast Normal University, Jilin, China
- Zhiqiang Ma
  - Information Science and Technology, Northeast Normal University, Jilin, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Jilin, China
|
28
|
Peng Y, Zhao S, Zeng Z, Hu X, Yin Z. LGBMDF: A cascade forest framework with LightGBM for predicting drug-target interactions. Front Microbiol 2023; 13:1092467. [PMID: 36687573 PMCID: PMC9849804 DOI: 10.3389/fmicb.2022.1092467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 12/07/2022] [Indexed: 01/07/2023]
Abstract
Prediction of drug-target interactions (DTIs) plays an important role in drug development. However, traditional laboratory methods to determine DTIs require a lot of time and capital. In recent years, many studies have shown that using machine learning methods to predict DTIs can speed up the drug development process and reduce capital costs. An excellent DTI prediction method should have both high prediction accuracy and low computational cost. In this study, we noticed that previous research based on deep forests used XGBoost as the estimator in the cascade. We applied LightGBM instead of XGBoost as the estimator in the cascade forest, and the estimator group was determined experimentally as three LightGBMs and three ExtraTrees. This new model is called LGBMDF. We conducted 5-fold cross-validation on LGBMDF and other state-of-the-art methods using the same dataset and compared their sensitivity (Sn), specificity (Sp), MCC, AUC and AUPR. We found that our method has better performance and faster calculation speed.
|
29
|
Lei S, Lei X, Liu L. Drug repositioning based on heterogeneous networks and variational graph autoencoders. Front Pharmacol 2022; 13:1056605. [PMID: 36618933 PMCID: PMC9812491 DOI: 10.3389/fphar.2022.1056605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 12/13/2022] [Indexed: 12/24/2022]
Abstract
Predicting new therapeutic effects of existing drugs (drug repositioning) plays an important role in drug development. However, traditional wet-lab prediction methods are usually time-consuming and costly. The emergence of more and more artificial intelligence-based drug repositioning methods in the past 2 years has facilitated drug development. In this study we propose a drug repositioning method, VGAEDR, based on a heterogeneous network of multiple drug attributes and a variational graph autoencoder. First, a drug-disease heterogeneous network is established based on three drug attributes, disease semantic information, and known drug-disease associations. Second, low-dimensional feature representations for heterogeneous networks are learned through a variational graph autoencoder module and a multi-layer convolutional module. Finally, the feature representation is fed to a fully connected layer and a Softmax layer to predict new drug-disease associations. Comparative experiments with other baseline methods on three datasets demonstrate the excellent performance of VGAEDR. In the case study, we predicted the top 10 possible anti-COVID-19 drugs from the existing drug and disease data, and six of them were verified by other literature.
|
30
|
Ni J, Cheng X, Ni T, Liang J. Identifying SM-miRNA associations based on layer attention graph convolutional network and matrix decomposition. Front Mol Biosci 2022; 9:1009099. [PMID: 36504714 PMCID: PMC9732030 DOI: 10.3389/fmolb.2022.1009099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/05/2022] [Accepted: 11/03/2022] [Indexed: 11/27/2022]
Abstract
The accurate prediction of potential associations between microRNAs (miRNAs) and small molecule (SM) drugs can enhance our knowledge of how SMs cure endogenous miRNA-related diseases. Given that traditional methods for predicting SM-miRNA associations are time-consuming and arduous, a number of computational models have been proposed to anticipate potential SM-miRNA associations. However, several of these strategies failed to eliminate noise from the known SM-miRNA association information or failed to prioritize the most significant known SM-miRNA associations. Therefore, we proposed a Graph Convolutional Network model with a Layer Attention mechanism for SM-MiRNA Association prediction (GCNLASMMA). First, we obtained new SM-miRNA associations by matrix decomposition. The new SM-miRNA associations, as well as the integrated SM similarity and miRNA similarity, were subsequently incorporated into a heterogeneous network. Finally, a graph convolutional network with an attention mechanism was used to compute the reconstructed SM-miRNA association matrix. Furthermore, four types of cross-validation and two types of case studies were performed to assess the performance of GCNLASMMA. In cross-validation, global Leave-One-Out Cross Validation (LOOCV), miRNA-fixed LOOCV, SM-fixed LOOCV and 5-fold cross-validation achieved excellent performance. Numerous hypothesized associations in the case studies were confirmed by the experimental literature. All of these results confirmed that GCNLASMMA is a trustworthy association inference method.
|
31
|
Tian Z, Peng X, Fang H, Zhang W, Dai Q, Ye Y. MHADTI: predicting drug-target interactions via multiview heterogeneous information network embedding with hierarchical attention mechanisms. Brief Bioinform 2022; 23:6761042. [PMID: 36242566 DOI: 10.1093/bib/bbac434] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/05/2022] [Revised: 08/19/2022] [Accepted: 09/08/2022] [Indexed: 12/14/2022]
Abstract
MOTIVATION: Discovering drug-target interactions (DTIs) is a crucial step in drug development tasks such as the identification of drug side effects and drug repositioning. Since identifying DTIs by wet-lab biological experiments is time-consuming and costly, many computational approaches have been proposed and have become an efficient way to infer potential interactions. Although extensive effort has been invested in this task, prediction accuracy still needs to be improved. In particular, heterogeneous network-based approaches do not fully consider the complex structure and rich semantic information in these heterogeneous networks. Therefore, it is still a challenge to predict DTIs efficiently. RESULTS: In this study, we develop a novel method via Multiview heterogeneous information network embedding with Hierarchical Attention mechanisms to discover potential Drug-Target Interactions (MHADTI). First, MHADTI constructs different similarity networks for drugs and targets by utilizing their multisource information. Combined with the known DTI network, three drug-target heterogeneous information networks (HINs) with different views are established. Second, MHADTI learns embeddings of drugs and targets from the multiview HINs with hierarchical attention mechanisms, which include node-level, semantic-level and graph-level attention. Lastly, MHADTI employs a multilayer perceptron to predict DTIs from the learned deep feature representations. The hierarchical attention mechanisms fully consider the importance of nodes, meta-paths and graphs in learning the feature representations of drugs and targets, which makes their embeddings more comprehensive. Extensive experimental results demonstrate that MHADTI performs better than other SOTA prediction models. Moreover, analysis of prediction results for some drugs and targets of interest further indicates that MHADTI has advantages in discovering DTIs.
AVAILABILITY AND IMPLEMENTATION: https://github.com/pxystudy/MHADTI.
Affiliation(s)
- Zhen Tian
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Xiangyu Peng
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Haichuan Fang
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Wenjie Zhang
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Qiguo Dai
  - School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116600, China
- Yangdong Ye
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
|
32
|
Nussinov R, Zhang M, Liu Y, Jang H. AlphaFold, Artificial Intelligence (AI), and Allostery. J Phys Chem B 2022; 126:6372-6383. [PMID: 35976160 PMCID: PMC9442638 DOI: 10.1021/acs.jpcb.2c04346] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 06/22/2022] [Revised: 08/03/2022] [Indexed: 02/08/2023]
Abstract
AlphaFold has burst into our lives. A powerful algorithm that underscores the strength of biological sequence data and artificial intelligence (AI). AlphaFold has appended projects and research directions. The database it has been creating promises an untold number of applications with vast potential impacts that are still difficult to surmise. AI approaches can revolutionize personalized treatments and usher in better-informed clinical trials. They promise to make giant leaps toward reshaping and revamping drug discovery strategies, selecting and prioritizing combinations of drug targets. Here, we briefly overview AI in structural biology, including in molecular dynamics simulations and prediction of microbiota-human protein-protein interactions. We highlight the advancements accomplished by the deep-learning-powered AlphaFold in protein structure prediction and their powerful impact on the life sciences. At the same time, AlphaFold does not resolve the decades-long protein folding challenge, nor does it identify the folding pathways. The models that AlphaFold provides do not capture conformational mechanisms like frustration and allostery, which are rooted in ensembles, and controlled by their dynamic distributions. Allostery and signaling are properties of populations. AlphaFold also does not generate ensembles of intrinsically disordered proteins and regions, instead describing them by their low structural probabilities. Since AlphaFold generates single ranked structures, rather than conformational ensembles, it cannot elucidate the mechanisms of allosteric activating driver hotspot mutations nor of allosteric drug resistance. However, by capturing key features, deep learning techniques can use the single predicted conformation as the basis for generating a diverse ensemble.
Affiliation(s)
- Ruth Nussinov
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Mingzhen Zhang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
- Yonglan Liu
  - Cancer Innovation Laboratory, National Cancer Institute, Frederick, Maryland 21702, United States
- Hyunbum Jang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
|
33
|
Lian M, Wang X, Du W. Integrated multi-similarity fusion and heterogeneous graph inference for drug-target interaction prediction. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
34
|
Zhang W, Hou J, Liu B. iPiDA-LTR: Identifying piwi-interacting RNA-disease associations based on Learning to Rank. PLoS Comput Biol 2022; 18:e1010404. [PMID: 35969645 PMCID: PMC9410559 DOI: 10.1371/journal.pcbi.1010404] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/27/2022] [Revised: 08/25/2022] [Accepted: 07/18/2022] [Indexed: 12/01/2022]
Abstract
Piwi-interacting RNAs (piRNAs) are regarded as drug targets and biomarkers for the diagnosis and therapy of diseases. However, biological experiments cost substantial time and resources, and the existing computational methods only focus on identifying missing associations between known piRNAs and diseases. With the fast development of biological experiments, more and more piRNAs are being detected. Therefore, identifying piRNA-disease associations for newly detected piRNAs has significant theoretical and practical value for understanding the pathogenesis of diseases. In this study, the iPiDA-LTR predictor is proposed to identify associations between piRNAs and diseases based on Learning to Rank. The iPiDA-LTR predictor not only identifies the missing associations between known piRNAs and diseases, but also detects diseases associated with newly detected piRNAs. Experimental results demonstrate that iPiDA-LTR effectively predicts piRNA-disease associations, outperforming other related methods.
Affiliation(s)
- Wenxiang Zhang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Jialu Hou
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
|
35
|
Pan X, Lin X, Cao D, Zeng X, Yu PS, He L, Nussinov R, Cheng F. Deep learning for drug repurposing: Methods, databases, and applications. WIREs Comput Mol Sci 2022. [DOI: 10.1002/wcms.1597] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Indexed: 12/15/2022]
Affiliation(s)
- Xiaoqin Pan
  - School of Computer Science and Engineering, Hunan University, Changsha, Hunan, China
- Xuan Lin
  - School of Computer Science, Xiangtan University, Xiangtan, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, China
- Xiangxiang Zeng
  - School of Computer Science and Engineering, Hunan University, Changsha, Hunan, China
- Philip S. Yu
  - Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois, USA
- Lifang He
  - Department of Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania, USA
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, USA
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Feixiong Cheng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
|
36
|
Wang H, Huang F, Xiong Z, Zhang W. A heterogeneous network-based method with attentive meta-path extraction for predicting drug-target interactions. Brief Bioinform 2022; 23:6596318. [PMID: 35641162 DOI: 10.1093/bib/bbac184] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 02/22/2022] [Revised: 04/09/2022] [Accepted: 04/23/2022] [Indexed: 11/13/2022]
Abstract
Predicting drug-target interactions (DTIs) is crucial at many phases of drug discovery and repositioning. Many computational methods based on heterogeneous networks (HNs) have proved their potential to predict DTIs by capturing extensive biological knowledge and semantic information from meta-paths. However, existing methods manually customize meta-paths, which depends heavily on specific expertise. Such a strategy severely limits the scalability and flexibility of these models, and even affects their predictive performance. To alleviate this limitation, we propose a novel HN-based method with attentive meta-path extraction for DTI prediction, named HampDTI, which is capable of automatically extracting useful meta-paths through a learnable attention mechanism instead of pre-definition based on domain knowledge. Specifically, by scoring multi-hop connections across various relations in the HN with each relation assigned an attention weight, HampDTI constructs a new trainable graph structure, called a meta-path graph. Such a meta-path graph implicitly measures the importance of every possible meta-path between drugs and targets. To enable HampDTI to extract more diverse meta-paths, we adopt a multi-channel mechanism to generate multiple meta-path graphs. Then, a graph neural network is deployed on the generated meta-path graphs to yield the multi-channel embeddings of drugs and targets. Finally, HampDTI fuses all embeddings from different channels for predicting DTIs. The meta-path graphs are optimized along with the model training such that HampDTI can adaptively extract valuable meta-paths for DTI prediction. The experiments on benchmark datasets not only show the superiority of HampDTI in DTI prediction over several baseline methods, but also, more importantly, demonstrate the effectiveness of the model in discovering important meta-paths.
Affiliation(s)
- Hongzhun Wang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Feng Huang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Zhankun Xiong
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
|
37
|
Shao K, Zhang Y, Wen Y, Zhang Z, He S, Bo X. DTI-HETA: prediction of drug-target interactions based on GCN and GAT on heterogeneous graph. Brief Bioinform 2022; 23:6563180. [PMID: 35380622 DOI: 10.1093/bib/bbac109] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 12/27/2021] [Revised: 02/14/2022] [Accepted: 03/03/2022] [Indexed: 12/19/2022]
Abstract
Drug-target interaction (DTI) prediction plays an important role in drug repositioning, drug discovery and drug design. However, due to the large size of the chemical and genomic spaces and the complex interactions between drugs and targets, experimental identification of DTIs is costly and time-consuming. In recent years, the emerging graph neural network (GNN) has been applied to DTI prediction because DTIs can be represented effectively using graphs. However, some of these methods are only based on homogeneous graphs, and some consist of two decoupled steps that cannot be trained jointly. To further explore GNN-based DTI prediction by integrating heterogeneous graph information, this study regards DTI prediction as a link prediction problem and proposes an end-to-end model based on HETerogeneous graph with Attention mechanism (DTI-HETA). In this model, a heterogeneous graph is first constructed based on the drug-drug and target-target similarity matrices and the DTI matrix. Then, the graph convolutional neural network is utilized to obtain the embedded representation of the drugs and targets. To highlight the contribution of different neighborhood nodes to the central node in aggregating the graph convolution information, a graph attention mechanism is introduced into the node embedding process. Afterward, an inner product decoder is applied to predict DTIs. To evaluate the performance of DTI-HETA, experiments are conducted on two datasets. The experimental results show that our model is superior to the state-of-the-art methods. Also, the identification of novel DTIs indicates that DTI-HETA can serve as a powerful tool for integrating heterogeneous graph information to predict DTIs.
Affiliation(s)
- Yuqi Wen
  - Beijing Institute of Radiation Medicine, Beijing, China
- Song He
  - Beijing Institute of Radiation Medicine, Beijing, China
- Xiaochen Bo
  - Beijing Institute of Radiation Medicine, Beijing, China
|
38
|
Tastan Bishop Ö, Mutemi Musyoka T, Barozi V. Allostery and missense mutations as intermittently linked promising aspects of modern computational drug discovery. J Mol Biol 2022; 434:167610. [DOI: 10.1016/j.jmb.2022.167610] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/01/2022] [Revised: 04/21/2022] [Accepted: 04/22/2022] [Indexed: 12/15/2022]
|
39
|
Affinity2Vec: drug-target binding affinity prediction through representation learning, graph mining, and machine learning. Sci Rep 2022; 12:4751. [PMID: 35306525 PMCID: PMC8934358 DOI: 10.1038/s41598-022-08787-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 06/22/2021] [Accepted: 03/08/2022] [Indexed: 11/21/2022]
Abstract
Drug-target interaction (DTI) prediction plays a crucial role in drug repositioning and virtual drug screening. Most DTI prediction methods cast the problem as a binary classification task to predict if interactions exist or as a regression task to predict continuous values that indicate a drug's ability to bind to a specific target. The regression-based methods provide insight beyond the binary relationship. However, most of these methods require three-dimensional (3D) structural information for the targets, which is still not generally available. Despite this bottleneck, only a few methods address the drug-target binding affinity (DTBA) problem with a non-structure-based approach that avoids the 3D structure limitations. Here we propose Affinity2Vec, a novel regression-based method that formulates the entire task as a graph-based problem. To develop this method, we constructed a weighted heterogeneous graph that integrates data from several sources, including drug-drug similarity, target-target similarity, and drug-target binding affinities. Affinity2Vec further combines several computational techniques from feature representation learning, graph mining, and machine learning to generate or extract features, build the model, and predict the binding affinity between the drug and the target with no 3D structural data. We conducted extensive experiments to evaluate and demonstrate the robustness and efficiency of the proposed method on benchmark datasets used in state-of-the-art non-structure-based drug-target binding affinity studies. Affinity2Vec showed superior and competitive results compared to the state-of-the-art methods based on several evaluation metrics, including mean squared error, r_m^2, concordance index, and area under the precision-recall curve.
|
40
|
Xia LY, Tang L, Huang H, Luo J. Identification of Potential Driver Genes and Pathways Based on Transcriptomics Data in Alzheimer's Disease. Front Aging Neurosci 2022; 14:752858. [PMID: 35401145 PMCID: PMC8985410 DOI: 10.3389/fnagi.2022.752858] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/03/2021] [Accepted: 02/21/2022] [Indexed: 01/16/2023]
Abstract
Alzheimer's disease (AD) is one of the most common neurodegenerative diseases. This study aimed to identify AD-related genes from transcriptomics data and to help develop new drugs to treat AD. First, we obtained differentially expressed gene (DEG)-enriched coexpression networks between AD and normal samples in multiple transcriptomics datasets by weighted gene co-expression network analysis (WGCNA). Then, a convergent genomic approach (CFG) integrating multiple lines of AD-related evidence was used to prioritize potential genes from DEG-enriched modules. Subsequently, we identified candidate genes in the list of potential genes. Lastly, we combined deepDTnet and SAveRUNNER to predict interactions among candidate genes, drugs and AD. Experiments on five datasets show that the CFG score of GJA1 is the highest among all potential driver genes of AD. Moreover, target-drug-disease network prediction indicated that GJA1 interacts with AD. Therefore, the candidate gene GJA1 is the most likely target for AD. In summary, identification of AD-related genes contributes to the understanding of AD pathophysiology and the development of new drugs.
|
41
|
HKAM-MKM: A hybrid kernel alignment maximization-based multiple kernel model for identifying DNA-binding proteins. Comput Biol Med 2022; 145:105395. [PMID: 35334314 DOI: 10.1016/j.compbiomed.2022.105395] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/07/2022] [Revised: 03/08/2022] [Accepted: 03/08/2022] [Indexed: 12/24/2022]
Abstract
The identification of DNA-binding proteins (DBPs) has always been a hot issue in the field of sequence classification. However, because experimental identification is very resource-intensive, the construction of a computational prediction model is worthwhile. This study developed and evaluated a hybrid kernel alignment maximization-based multiple kernel model (HKAM-MKM) for predicting DBPs. First, we collected two datasets and performed feature extraction on the sequences to obtain six feature groups, and then constructed the corresponding kernels. To ensure effective utilisation of the base kernels and avoid ignoring the difference between a sample and its neighbours, we proposed local kernel alignment, which calculates the kernel between each sample and its neighbours, with each sample as the centre. We combined the global and local kernel alignments to develop a hybrid kernel alignment model and balanced the relationship between the two through parameters. By maximising the hybrid kernel alignment value, we obtained the weight of each kernel and then linearly combined the kernels according to these weights. Finally, the fused kernel was input into a support vector machine for training and prediction. On the independent test sets PDB186 and PDB2272, we obtained the highest Matthews correlation coefficient (MCC) (0.768 and 0.5962, respectively) and the highest accuracy (87.1% and 78.43%, respectively), which were superior to the other predictors. Therefore, HKAM-MKM is an efficient prediction tool for DBPs.
|
42
|
Jiao S, Chen Z, Zhang L, Zhou X, Shi L. ATGPred-FL: sequence-based prediction of autophagy proteins with feature representation learning. Amino Acids 2022; 54:799-809. [PMID: 35286461 DOI: 10.1007/s00726-022-03145-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2021] [Accepted: 01/28/2022] [Indexed: 11/26/2022]
Abstract
Autophagy plays an important role in biological evolution and is regulated by many autophagy proteins. Accurate identification of autophagy proteins is crucially important to reveal their biological functions. Because experimental methods are expensive and labor-intensive, there is an urgent need for automated, accurate and reliable sequence-based computational tools that can identify novel autophagy proteins among numerous proteins and peptides. For this purpose, a new predictor named ATGPred-FL was proposed for the efficient identification of autophagy proteins. We investigated various sequence-based feature descriptors and adopted the feature learning method to generate corresponding, more informative probability features. Then, a two-step feature selection strategy based on accuracy was utilized to remove irrelevant and redundant features, leading to the most discriminative 14-dimensional feature set. The final predictor was built using a support vector machine classifier, which performed favorably on both the training and testing sets with accuracy values of 94.40% and 90.50%, respectively. ATGPred-FL is the first ATG machine learning predictor based on protein primary sequences. We envision that ATGPred-FL will be an effective and useful tool for autophagy protein identification. It is freely available at http://lab.malab.cn/~acy/ATGPred-FL. The source code and datasets are accessible at https://github.com/jiaoshihu/ATGPred.
Affiliation(s)
- Shihu Jiao
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Zheng Chen
  - School of Applied Chemistry and Biological Technology, Shenzhen Polytechnic, 7098 Liuxian Street, Shenzhen, 518055, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.4 Block 2 North Jianshe Road, Chengdu, 61005, China
- Lichao Zhang
  - School of Intelligent Manufacturing and Equipment, Shenzhen Institute of Information Technology, Shenzhen, 518172, China
- Xun Zhou
  - Beidahuang Industry Group General Hospital, Harbin, 150001, China
- Lei Shi
  - Department of Spine Surgery, Changzheng Hospital, Naval Medical University, No 415, Fengyang Road, Huangpu District, Shanghai, 210000, China
|
43
|
Wei L, Ye X, Sakurai T, Mu Z, Wei L. ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning. Bioinformatics 2022; 38:1514-1524. [PMID: 34999757 DOI: 10.1093/bioinformatics/btac006] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Received: 10/29/2021] [Revised: 11/29/2021] [Accepted: 01/04/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Recently, peptides have emerged as a promising class of pharmaceuticals for treating various diseases, positioned between traditional small-molecule drugs and therapeutic proteins. However, one of the key bottlenecks preventing their therapeutic use is their toxicity toward human cells, and few available algorithms for predicting toxicity are specially designed for short peptides. RESULTS: We present ToxIBTL, a novel deep learning framework that uses the information bottleneck principle and transfer learning to predict the toxicity of peptides as well as proteins. Specifically, we use evolutionary information and physicochemical properties of peptide sequences and integrate the information bottleneck principle into a feature representation learning scheme, by which relevant information is retained and redundant information is minimized in the obtained features. Moreover, transfer learning is introduced to transfer the common knowledge contained in proteins to peptides, with the aim of improving the feature representation capability. Extensive experimental results demonstrate that ToxIBTL not only achieves higher prediction performance than state-of-the-art methods on the peptide dataset, but also performs competitively on the protein dataset. Furthermore, a user-friendly online web server implementing ToxIBTL has been established. AVAILABILITY AND IMPLEMENTATION: ToxIBTL and the data are freely accessible at http://server.wei-group.net/ToxIBTL. Our source code is available at https://github.com/WLYLab/ToxIBTL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lesong Wei
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Zengchao Mu
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Leyi Wei
  - School of Software, Shandong University, Jinan, China
|
44
|
Nussinov R, Zhang M, Maloney R, Tsai C, Yavuz BR, Tuncbag N, Jang H. Mechanism of activation and the rewired network: New drug design concepts. Med Res Rev 2022; 42:770-799. [PMID: 34693559 PMCID: PMC8837674 DOI: 10.1002/med.21863] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 04/29/2021] [Revised: 07/06/2021] [Accepted: 10/07/2021] [Indexed: 12/13/2022]
Abstract
Precision oncology benefits from effective early phase drug discovery decisions. Recently, drugging inactive protein conformations has shown impressive successes, raising the cardinal questions of which targets can profit and what are the principles of the active/inactive protein pharmacology. Cancer driver mutations have been established to mimic the protein activation mechanism. We suggest that the decision whether to target an inactive (or active) conformation should largely rest on the protein mechanism of activation. We next discuss the recent identification of double (multiple) same-allele driver mutations and their impact on cell proliferation and suggest that like single driver mutations, double drivers also mimic the mechanism of activation. We further suggest that the structural perturbations of double (multiple) in cis mutations may reveal new surfaces/pockets for drug design. Finally, we underscore the preeminent role of the cellular network which is deregulated in cancer. Our structure-based review and outlook updates the traditional Mechanism of Action, informs decisions, and calls attention to the intrinsic activation mechanism of the target protein and the rewired tumor-specific network, ushering innovative considerations in precision medicine.
Affiliation(s)
- Ruth Nussinov
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, Maryland, USA
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Mingzhen Zhang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, Maryland, USA
- Ryan Maloney
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, Maryland, USA
- Chung‐Jung Tsai
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, Maryland, USA
- Bengi Ruken Yavuz
  - Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Nurcan Tuncbag
  - Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
  - Department of Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, Turkey
  - Koc University Research Center for Translational Medicine, School of Medicine, Koc University, Istanbul, Turkey
- Hyunbum Jang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, Maryland, USA
|
45
|
Li J, Wang J, Lv H, Zhang Z, Wang Z. IMCHGAN: Inductive Matrix Completion With Heterogeneous Graph Attention Networks for Drug-Target Interactions Prediction. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:655-665. [PMID: 34115592 DOI: 10.1109/tcbb.2021.3088614] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 06/12/2023]
Abstract
Identification of targets among known drugs plays an important role in drug repurposing and discovery. Computational approaches for predicting drug-target interactions (DTIs) are highly desirable compared with traditional biological experiments because they are fast and inexpensive. Moreover, recent advances in systems biology have generated large-scale, heterogeneous biological information network data, which offer opportunities for machine learning-based identification of DTIs. We present a novel Inductive Matrix Completion with Heterogeneous Graph Attention Network approach (IMCHGAN) for predicting DTIs. IMCHGAN first adopts a two-level neural attention mechanism to learn drug and target latent feature representations from the DTI heterogeneous network. Then, the learned latent features are fed into the Inductive Matrix Completion (IMC) prediction score model, which computes the best projection from drug space onto target space and outputs the DTI score via the inner product of the projected drug and target feature representations. IMCHGAN is an end-to-end neural network learning framework in which the parameters of both the prediction score model and the feature representation learning model are simultaneously optimized via backpropagation under the supervision of the observed known drug-target interaction data. We compare IMCHGAN with other state-of-the-art baselines on two real DTI experimental datasets. The results show that our method is superior to existing methods in terms of AUC and AUPR. Moreover, IMCHGAN also shows strong predictive power for novel (unknown) DTIs. All datasets and code can be obtained from https://github.com/ljatynu/IMCHGAN/.
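The scoring step described in this abstract (project the drug representation onto target space, then take an inner product with the target representation) can be sketched as follows. This is a minimal illustration under stated assumptions: the feature matrices and the projection `W` are fixed placeholder arrays, whereas in IMCHGAN they are learned end to end; the function name `imc_scores` is hypothetical and is not taken from the authors' code.

```python
import numpy as np

def imc_scores(drug_feats, target_feats, W):
    """Score every drug-target pair in the style of inductive matrix completion.

    drug_feats:   (n_drugs, d_drug) latent drug features
    target_feats: (n_targets, d_target) latent target features
    W:            (d_target, d_drug) projection from drug space onto target space
    Returns an (n_drugs, n_targets) matrix whose entry [i, j] is
    (W @ drug_i) . target_j
    """
    projected = drug_feats @ W.T          # drugs mapped into target space
    return projected @ target_feats.T     # inner products with all targets

# Placeholder features (assumed values, for illustration only).
drug_feats = np.array([[1.0, 2.0], [0.0, 1.0]])
target_feats = np.array([[3.0, 4.0], [1.0, 0.0], [0.0, -1.0]])
W = np.eye(2)

scores = imc_scores(drug_feats, target_feats, W)
print(scores)
```

In a trained model, higher scores would be read as more likely interactions; here the values only demonstrate the shape of the computation.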
|
46
|
Roberti A, Chaffey LE, Greaves DR. NF-κB Signaling and Inflammation-Drug Repurposing to Treat Inflammatory Disorders? Biology 2022; 11:372. [PMID: 35336746 PMCID: PMC8945680 DOI: 10.3390/biology11030372] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 01/17/2022] [Revised: 02/12/2022] [Accepted: 02/15/2022] [Indexed: 12/15/2022]
Abstract
NF-κB is a central mediator of inflammation, response to DNA damage and oxidative stress. As a result of its central role in so many important cellular processes, NF-κB dysregulation has been implicated in the pathology of important human diseases. NF-κB activation causes inappropriate inflammatory responses in diseases including rheumatoid arthritis (RA) and multiple sclerosis (MS). Thus, modulation of NF-κB signaling is being widely investigated as an approach to treat chronic inflammatory diseases, autoimmunity and cancer. The emergence of COVID-19 in late 2019, the subsequent pandemic and the huge clinical burden of patients with life-threatening SARS-CoV-2 pneumonia led to a massive scramble to repurpose existing medicines to treat lung inflammation in a wide range of healthcare systems. These efforts continue and have proven to be controversial. Drug repurposing strategies are a promising alternative to de novo drug development, as they minimize drug development timelines and reduce the risk of failure due to unexpected side effects. Different experimental approaches have been applied to identify existing medicines which inhibit NF-κB that could be repurposed as anti-inflammatory drugs.
Affiliation(s)
- David R. Greaves
  - Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK (also A.R. and L.E.C.)
|
47
|
Orlando G, Raimondi D, Duran-Romaña R, Moreau Y, Schymkowitz J, Rousseau F. PyUUL provides an interface between biological structures and deep learning algorithms. Nat Commun 2022; 13:961. [PMID: 35181656 PMCID: PMC8857184 DOI: 10.1038/s41467-022-28327-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/23/2021] [Accepted: 01/18/2022] [Indexed: 11/09/2022] Open
Abstract
Structural bioinformatics suffers from a lack of interfaces connecting biological structures and machine learning methods, making the application of modern neural network architectures impractical. This negatively affects the development of structure-based bioinformatics methods, causing a bottleneck in biological research. Here we present PyUUL (https://pyuul.readthedocs.io/), a library to translate biological structures into 3D tensors, allowing an out-of-the-box application of state-of-the-art deep learning algorithms. The library converts biological macromolecules to data structures typical of computer vision, such as voxels and point clouds, for which extensive machine learning research has been performed. Moreover, PyUUL supports out-of-the-box GPU and sparse calculation. Finally, we demonstrate how PyUUL can be used by researchers to address some typical bioinformatics problems, such as structure recognition and docking.
Affiliation(s)
- Gabriele Orlando
  - Switch Laboratory, VIB-KU Leuven Center for Brain and Disease Research, Herestraat 49, 3000, Leuven, Belgium
  - Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, 3000, Leuven, Belgium
- Ramon Duran-Romaña
  - Switch Laboratory, VIB-KU Leuven Center for Brain and Disease Research, Herestraat 49, 3000, Leuven, Belgium
  - Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, 3000, Leuven, Belgium
- Joost Schymkowitz
  - Switch Laboratory, VIB-KU Leuven Center for Brain and Disease Research, Herestraat 49, 3000, Leuven, Belgium
  - Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, 3000, Leuven, Belgium
- Frederic Rousseau
  - Switch Laboratory, VIB-KU Leuven Center for Brain and Disease Research, Herestraat 49, 3000, Leuven, Belgium
  - Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, 3000, Leuven, Belgium
|
48
|
Wang W, Zhang X, Dai DQ. springD2A: capturing uncertainty in disease-drug association prediction with model integration. Bioinformatics 2022; 38:1353-1360. [PMID: 34864881 DOI: 10.1093/bioinformatics/btab820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Revised: 11/23/2021] [Accepted: 11/30/2021] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION: Drug repositioning, which aims to find new indications for existing drugs, has been an efficient strategy for drug discovery. When only confirmed disease-drug associations are available as positive pairs, previous studies usually construct the negative set from unknown disease-drug pairs, for which it is not known whether the drug and disease can be associated, in order to train a model for disease-drug association prediction (drug repositioning). Drugs and diseases in these negative pairs can potentially be associated, but most studies have ignored this possibility. RESULTS: We present a method, springD2A, to capture the uncertainty in the negative pairs and to discriminate between positive and unknown pairs, because the former are more reliable. In springD2A, we introduce a spring-like penalty for the loss of negative pairs, which is strong if they are too close in a unit sphere, but mild if they are at a moderate distance. We also design a sequential sampling scheme in which the probability of an unknown disease-drug pair being sampled as negative is proportional to its score predicted as positive. Multiple models are learned during sequential sampling, and we adopt parameter- and feature-based ensemble schemes to boost performance. Experiments show that springD2A is an effective tool for drug repositioning. AVAILABILITY AND IMPLEMENTATION: A Python implementation of springD2A and the datasets used in this study are available at https://github.com/wangyuanhao/springD2A. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
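The sampling rule stated in this abstract (an unknown pair is drawn as a negative with probability proportional to its predicted positive score) can be sketched with weighted sampling from the standard library. This follows the abstract's wording literally and is only an illustration: sampling with replacement, the function name `sample_negatives`, and the toy pairs and scores are all assumptions, and the actual springD2A procedure may differ in its details.

```python
import random

def sample_negatives(unknown_pairs, positive_scores, n, seed=None):
    """Draw n unknown disease-drug pairs to serve as negatives.

    Each pair is drawn with probability proportional to its predicted
    positive score, as described in the abstract. Assumption: sampling is
    done with replacement, which the abstract does not specify.
    """
    rng = random.Random(seed)
    return rng.choices(unknown_pairs, weights=positive_scores, k=n)

# Hypothetical unknown pairs and model scores.
unknown_pairs = [("disease_A", "drug_1"), ("disease_B", "drug_2")]
positive_scores = [0.9, 0.1]

negatives = sample_negatives(unknown_pairs, positive_scores, n=2000, seed=0)
share_first = negatives.count(unknown_pairs[0]) / len(negatives)
print(f"share of first pair: {share_first:.2f}")
```

Repeating this draw after each retraining round, with scores from the latest model, would give the sequence of models that the abstract says are combined in an ensemble.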
Affiliation(s)
- Weiwen Wang
  - Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, Guangzhou 510000, China
- Xiwen Zhang
  - Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, Guangzhou 510000, China
- Dao-Qing Dai
  - Intelligent Data Center, School of Mathematics, Sun Yat-Sen University, Guangzhou 510000, China
|
49
|
Interpreting neural networks for biological sequences by learning stochastic masks. Nat Mach Intell 2022; 4:41-54. [DOI: 10.1038/s42256-021-00428-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
|
50
|
Zhai Y, Zhang J, Zhang T, Gong Y, Zhang Z, Zhang D, Zhao Y. AOPM: Application of Antioxidant Protein Classification Model in Predicting the Composition of Antioxidant Drugs. Front Pharmacol 2022; 12:818115. [PMID: 35115948 PMCID: PMC8803896 DOI: 10.3389/fphar.2021.818115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 12/20/2021] [Indexed: 11/18/2022] Open
Abstract
Antioxidant proteins not only balance oxidative stress in the body but are also an important component of antioxidant drugs. Accurate identification of antioxidant proteins is essential to help humans fight diseases and develop new drugs. In this paper, we developed a user-friendly method, AOPM, to identify antioxidant proteins. 188D and the Composition of k-spaced Amino Acid Pairs were adopted as the feature extraction methods. In addition, the Max-Relevance-Max-Distance algorithm (MRMD) and random forest were used for feature selection and classification, respectively. We used 5-fold cross-validation and an independent test dataset to evaluate our model. On the test dataset, AOPM outperformed the state-of-the-art methods. The sensitivity, specificity, accuracy, Matthews correlation coefficient and area under the curve reached 87.3%, 94.2%, 92.0%, 0.815 and 0.972, respectively. In addition, AOPM also performs well in predicting the catalytic enzymes of antioxidant drugs. This work demonstrated the feasibility of virtual drug screening based on sequence information and provided new ideas and solutions for drug development.
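One of the feature descriptors named in this abstract, the Composition of k-spaced Amino Acid Pairs, can be sketched as below for a single gap size `k`. It shows the standard definition (frequency of each ordered residue pair separated by `k` intervening positions); the gap sizes actually used by AOPM are not given in the abstract, and the function name `cksaap` and the example sequence are illustrative assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def cksaap(sequence, k):
    """Composition of k-spaced amino acid pairs for one gap size k.

    Returns a dict mapping each of the 400 ordered residue pairs to its
    frequency among all residue pairs separated by k intervening positions.
    Pairs containing non-standard residues are not counted but still
    contribute to the denominator.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = len(sequence) - k - 1  # number of k-spaced pairs in the sequence
    if total <= 0:
        return {p: 0.0 for p in pairs}
    for i in range(total):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:
            counts[pair] += 1
    return {p: c / total for p, c in counts.items()}

features = cksaap("ACAC", k=1)
print(features["AA"], features["CC"])  # 0.5 0.5
```

Concatenating the 400-dimensional vectors for several gap sizes gives the full descriptor that a feature selector and classifier would then consume.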
Affiliation(s)
- Yixiao Zhai
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Jingyu Zhang
  - Department of Neurology, the Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Tianjiao Zhang
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Yue Gong
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Zixiao Zhang
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
- Dandan Zhang
  - Department of Obstetrics and Gynecology, the First Affiliated Hospital of Harbin Medical University, Harbin, China
- Yuming Zhao
  - College of Information and Computer Engineering, Northeast Forestry University, Harbin, China
*Correspondence: Dandan Zhang; Yuming Zhao
|