1. Tang X, Hou Y, Meng Y, Wang Z, Lu C, Lv J, Hu X, Xu J, Yang J. CDPMF-DDA: contrastive deep probabilistic matrix factorization for drug-disease association prediction. BMC Bioinformatics 2025; 26:5. [PMID: 39773275 PMCID: PMC11708303 DOI: 10.1186/s12859-024-06032-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2024] [Accepted: 12/27/2024] [Indexed: 01/11/2025] Open
Abstract
New drug development is a complex process, whereas drug-disease association (DDA) prediction aims to identify new therapeutic uses for existing medications. However, existing graph contrastive learning approaches typically rely on single-view contrastive learning, which struggles to fully capture drug-disease relationships. We therefore introduce a novel multi-view contrastive learning framework, named CDPMF-DDA, which enhances the model's ability to capture drug-disease associations by incorporating diverse information representations from different views. First, we decompose the original drug-disease association matrix into drug and disease feature matrices, which are then used to reconstruct the drug-disease association network, as well as the drug-drug and disease-disease similarity networks. This process effectively reduces noise in the data, establishing a reliable foundation for the networks produced. Next, we generate multiple contrastive views from both the original and generated networks. These views effectively capture hidden feature associations, significantly enhancing the model's ability to represent complex relationships. Extensive cross-validation experiments on three standard datasets show that CDPMF-DDA achieves an average AUC of 0.9475 and an AUPR of 0.5009, outperforming existing models. Additionally, case studies on Alzheimer's disease and epilepsy further validate the model's effectiveness, demonstrating its high accuracy and robustness in drug-disease association prediction. Built on a multi-view contrastive learning framework, CDPMF-DDA can integrate multi-source information and effectively capture complex drug-disease associations, making it a powerful tool for drug repositioning and the discovery of new therapeutic strategies.
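The decomposition step in this abstract can be illustrated with a minimal sketch. It uses plain regularized matrix factorization trained by stochastic gradient descent, which is a simplification and not the deep probabilistic model of CDPMF-DDA. The function names, the rank, the learning rate, and the toy matrix are all assumptions made for illustration.

```python
import random


def factorize(assoc, rank=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Factor a drug-by-disease 0/1 matrix into drug and disease feature matrices."""
    rng = random.Random(seed)
    n_drugs, n_diseases = len(assoc), len(assoc[0])
    # Small random starting values for both feature matrices.
    drugs = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_drugs)]
    diseases = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_diseases)]
    for _ in range(epochs):
        for i in range(n_drugs):
            for j in range(n_diseases):
                pred = sum(drugs[i][k] * diseases[j][k] for k in range(rank))
                err = assoc[i][j] - pred
                for k in range(rank):
                    d, s = drugs[i][k], diseases[j][k]
                    # Gradient step on squared error with L2 regularization.
                    drugs[i][k] += lr * (err * s - reg * d)
                    diseases[j][k] += lr * (err * d - reg * s)
    return drugs, diseases


def reconstruct(drugs, diseases):
    """Rebuild association scores as dot products of the learned features."""
    return [[sum(a * b for a, b in zip(d, s)) for s in diseases] for d in drugs]


# Toy association matrix (hypothetical): rows are drugs, columns are diseases.
assoc = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
drug_features, disease_features = factorize(assoc)
scores = reconstruct(drug_features, disease_features)
```

In this toy setting the reconstructed scores sit close to the original entries. In a real setting, the interest lies in unobserved pairs that nevertheless receive a high score.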
Affiliation(s)
- Xianfang Tang
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Yawen Hou
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Yajie Meng
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Zhaojing Wang
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Changcheng Lu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Juan Lv
  - College of Traditional Chinese Medicine, Changsha Medical University, Changsha, 410000, China
- Xinrong Hu
  - School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, China
- Junlin Xu
  - School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, Hubei, China
2. Van Norden M, Mangione W, Falls Z, Samudrala R. Strategies for robust, accurate, and generalizable benchmarking of drug discovery platforms. bioRxiv: the preprint server for biology 2024:2024.12.10.627863. [PMID: 39764006 PMCID: PMC11702551 DOI: 10.1101/2024.12.10.627863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2025]
Abstract
Benchmarking is an important step in the improvement, assessment, and comparison of the performance of drug discovery platforms and technologies. We revised the existing benchmarking protocols in our Computational Analysis of Novel Drug Opportunities (CANDO) multiscale therapeutic discovery platform to improve utility and performance. We optimized multiple parameters used in drug candidate prediction and assessment with these updated benchmarking protocols. Using these optimized parameters, CANDO ranked 7.4% of known drugs in the top 10 compounds for their respective diseases/indications, based on drug-indication associations/mappings obtained from the Comparative Toxicogenomics Database (CTD). This increased to 12.1% when drug-indication mappings were obtained from the Therapeutic Targets Database (TTD). Performance on an indication was weakly correlated (Spearman correlation coefficient >0.3) with indication size (number of drugs associated with an indication) and moderately correlated (correlation coefficient >0.5) with compound chemical similarity. There was also moderate correlation between our new and original benchmarking protocols when assessing performance per indication using each protocol. Benchmarking results also depended on the source of the drug-indication mapping used: a higher proportion of indication-associated drugs were recalled in the top 100 compounds when using the TTD, which only includes FDA-approved drug-indication associations (in contrast to the CTD, which includes associations drawn from the literature). We also created compbench, a publicly available head-to-head benchmarking protocol that allows consistent assessment and comparison of different drug discovery platforms. Using this protocol, we compared two pipelines for drug repurposing within CANDO; our primary pipeline outperformed another similarity-based pipeline still in development that clusters signatures based on their associated Gene Ontology terms. Our study sets a precedent for the complete, comprehensive, and comparable benchmarking of drug discovery platforms, resulting in more accurate drug candidate predictions.
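The headline figures in this abstract (the share of known drugs ranked in the top 10 for their indication) are top-k hit rates. The sketch below shows one generic way to compute such a rate. It is not the CANDO or compbench protocol, and the function name, data layout, and toy identifiers are assumptions made for illustration.

```python
def top_k_hit_rate(rankings, known, k=10):
    """Fraction of known drug-indication pairs whose drug is in the top k for that indication.

    rankings: indication -> list of compounds, best-ranked first
    known:    indication -> list of drugs known to be associated with it
    """
    hits = 0
    total = 0
    for indication, drugs in known.items():
        top = set(rankings.get(indication, [])[:k])
        for drug in drugs:
            total += 1
            if drug in top:
                hits += 1
    return hits / total if total else 0.0


# Hypothetical toy data.
rankings = {"asthma": ["a", "b", "c", "d"], "gout": ["x", "y", "z"]}
known = {"asthma": ["b", "d"], "gout": ["z"]}
rate_at_2 = top_k_hit_rate(rankings, known, k=2)
rate_at_3 = top_k_hit_rate(rankings, known, k=3)
```

Swapping the `known` mapping while keeping `rankings` fixed mirrors the abstract's observation that the measured rate depends on which drug-indication source is used.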
Affiliation(s)
- Melissa Van Norden
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- William Mangione
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- Zackary Falls
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- Ram Samudrala
  - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
3. Han X, Xie X, Zhao R, Li Y, Ma P, Li H, Chen F, Zhao Y, Tang Z. Calculating the similarity between prescriptions to find their new indications based on graph neural network. Chin Med 2024; 19:124. [PMID: 39261848 PMCID: PMC11391787 DOI: 10.1186/s13020-024-00994-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Accepted: 09/01/2024] [Indexed: 09/13/2024] Open
Abstract
BACKGROUND: Drug repositioning has the potential to reduce costs and accelerate drug development, with highly promising applications. The development of artificial intelligence has provided the field with fast and efficient computing power. Nevertheless, the repositioning of traditional Chinese medicine (TCM) is still in its infancy, and establishing a reasonable and effective research method is a pressing issue. Using a graph neural network (GNN) to compute the similarity between TCM prescriptions, as a way of finding their new indications, is an innovative attempt.
METHODS: This paper focused on TCM prescriptions containing ephedra, with 20 prescriptions for treating external cough and asthma taken as target prescriptions. The remaining 67 prescriptions containing ephedra were taken as to-be-matched prescriptions. Data on the prescriptions, including diseases, disease targets, symptoms, and various types of information on herbs, were gathered from a range of literature sources, such as Chinese medicine databases. Cosine similarity and the Jaccard coefficient were then calculated to characterize the similarity between prescriptions, using a graph convolutional network (GCN) with a self-supervised learning method such as deep graph infomax (DGI).
RESULTS: A total of 1340 values were obtained for each of the two indicators. A total of 68 prescription pairs were identified after screening with 0.77 as the threshold for cosine similarity. After false positives were removed, 12 prescription pairs were deemed to have further research value. A total of 5 prescription pairs were retained using a threshold of 0.50 for the Jaccard coefficient. However, these results did not show significant value for further use, which may be attributed to the excessive variety of information in the dataset.
CONCLUSIONS: The proposed method can provide a reference for finding new indications of target prescriptions by quantifying the similarity between prescriptions. It is expected to offer new insights for developing a scientific and systematic research methodology for TCM repositioning.
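The two similarity measures and the threshold screening described above can be sketched as follows. The 1340 values per indicator are consistent with pairing each of the 20 target prescriptions with each of the 67 to-be-matched prescriptions (20 × 67 = 1340). The thresholds 0.77 and 0.50 come from the abstract; everything else is an assumption made for illustration, including the inclusive comparison, the toy vectors standing in for GCN/DGI embeddings, and the function names.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def screen(targets, candidates, score, threshold):
    """Return (target, candidate, score) triples whose score meets the threshold."""
    kept = []
    for t_name, t_val in targets.items():
        for c_name, c_val in candidates.items():
            s = score(t_val, c_val)
            if s >= threshold:  # inclusive comparison is an assumption
                kept.append((t_name, c_name, s))
    return kept


# Hypothetical embeddings standing in for learned prescription representations.
target_vecs = {"T1": [1.0, 0.0, 1.0]}
candidate_vecs = {"C1": [1.0, 0.1, 0.9], "C2": [0.0, 1.0, 0.0]}
cosine_hits = screen(target_vecs, candidate_vecs, cosine_similarity, 0.77)

# Number of target/to-be-matched pairs implied by the abstract's counts.
n_pairs = 20 * 67
```

The same `screen` helper can be called with `jaccard` and a threshold of 0.50 on set-valued descriptions of each prescription.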
Affiliation(s)
- Xingxing Han
  - State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-Di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Xiaoxia Xie
  - National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Ranran Zhao
  - State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-Di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Yu Li
  - National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Pengzhen Ma
  - National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Huan Li
  - State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-Di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Fengming Chen
  - State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-Di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Yufeng Zhao
  - National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
- Zhishu Tang
  - State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-Di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, People's Republic of China
  - Beijing University of Chinese Medicine, Beijing, 100029, People's Republic of China
4. Tayebi J, BabaAli B. EKGDR: An End-to-End Knowledge Graph-Based Method for Computational Drug Repurposing. J Chem Inf Model 2024; 64:1868-1881. [PMID: 38483449 DOI: 10.1021/acs.jcim.3c01925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
The lengthy and expensive process of developing new drugs from scratch, coupled with a high failure rate, has prompted the emergence of drug repurposing/repositioning as a more efficient and cost-effective approach. This approach involves identifying new therapeutic applications for existing approved drugs, leveraging the extensive drug-related data already gathered. However, the diversity and heterogeneity of data, along with the limited availability of known drug-disease interactions, pose significant challenges to computational drug design. To address these challenges, this study introduces EKGDR, an end-to-end knowledge graph-based approach for computational drug repurposing. EKGDR utilizes the power of a drug knowledge graph, a comprehensive repository of drug-related information that encompasses known drug interactions and various categorization information, as well as structural molecular descriptors of drugs. EKGDR employs graph neural networks, a cutting-edge graph representation learning technique, to embed the drug knowledge graph (nodes and relations) in an end-to-end manner. By doing so, EKGDR can effectively learn the underlying causes (intents) behind drug-disease interactions and recursively aggregate and combine relational messages between nodes along different multihop neighborhood paths (relational paths). This process generates representations of disease and drug nodes, enabling EKGDR to predict the interaction probability for each drug-disease pair in an end-to-end manner. The obtained results demonstrate that EKGDR outperforms previous models in all three evaluation metrics: area under the receiver operating characteristic curve (AUROC = 0.9475), area under the precision-recall curve (AUPRC = 0.9490), and recall at the top-200 recommendations (Recall@200 = 0.8315). To further validate EKGDR's effectiveness, we evaluated the top-20 candidate drugs suggested for each of Alzheimer's and Parkinson's diseases.
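The three evaluation metrics named in this abstract can be computed from labels and predicted scores as in the sketch below. The abstract does not say how EKGDR computed them, so this is a generic illustration: AUPRC is approximated here by average precision, which is one common estimator, and the function names and toy data are assumptions.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive item."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def recall_at_k(labels, scores, k):
    """Share of all positives that appear among the k highest-scored items."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    found = sum(labels[i] for i in order[:k])
    return found / n_pos if n_pos else 0.0


# Hypothetical toy predictions for five drug-disease pairs.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_auroc = auroc(labels, scores)
toy_ap = average_precision(labels, scores)
toy_recall = recall_at_k(labels, scores, k=2)
```

On this toy input, five of the six positive/negative pairs are ordered correctly, and one of the two positives appears in the top two.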
Affiliation(s)
- Javad Tayebi
  - School of Mathematics, Statistics and Computer Science, University of Tehran, Tehran 141556455, Iran
- Bagher BabaAli
  - School of Mathematics, Statistics and Computer Science, University of Tehran, Tehran 141556455, Iran
5. Zhao BW, Su XR, Yang Y, Li DX, Li GD, Hu PW, Zhao YG, Hu L. Drug-disease association prediction using semantic graph and function similarity representation learning over heterogeneous information networks. Methods 2023; 220:106-114. [PMID: 37972913 DOI: 10.1016/j.ymeth.2023.10.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 10/13/2023] [Accepted: 10/28/2023] [Indexed: 11/19/2023] Open
Abstract
Discovering new indications for existing drugs is a promising development strategy at various stages of drug research and development. However, most existing methods complete this task by constructing a variety of heterogeneous networks without considering the higher-order connectivity patterns available in heterogeneous biological information networks, which are believed to be useful for improving the accuracy of new drug discovery. To this end, we propose a computational model, called SFRLDDA, for drug-disease association prediction using semantic graph and function similarity representation learning. Specifically, SFRLDDA first integrates drug-disease, drug-protein, and protein-disease associations and their biological knowledge into a heterogeneous information network (HIN). Second, different representation learning strategies are applied to obtain the feature representations of drugs and diseases from different perspectives over the constructed semantic graph and function similarity graphs, respectively. Finally, SFRLDDA incorporates a Random Forest classifier to discover potential drug-disease associations (DDAs). Experimental results demonstrate that SFRLDDA achieves the best performance when compared with other state-of-the-art models on three benchmark datasets. Moreover, case studies also indicate that the simultaneous consideration of semantic graph and function similarity of drugs and diseases in the HIN allows SFRLDDA to precisely predict DDAs in a more comprehensive manner.
Affiliation(s)
- Bo-Wei Zhao
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Xiao-Rui Su
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Yue Yang
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Dong-Xu Li
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Guo-Dong Li
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Peng-Wei Hu
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Yong-Gang Zhao
  - Department of Orthopaedic Surgery (hand and foot trauma), People's Hospital of Dongxihu, Wuhan 420100, China
- Lun Hu
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
6. Askr H, Elgeldawi E, Aboul Ella H, Elshaier YAMM, Gomaa MM, Hassanien AE. Deep learning in drug discovery: an integrative review and future challenges. Artif Intell Rev 2022; 56:5975-6037. [PMID: 36415536 PMCID: PMC9669545 DOI: 10.1007/s10462-022-10306-1] [Citation(s) in RCA: 86] [Impact Index Per Article: 28.7] [Accepted: 10/24/2022] [Indexed: 11/18/2022]
Abstract
Recently, the use of artificial intelligence (AI) in drug discovery has received much attention because it significantly reduces the time and cost of developing new drugs. Deep learning (DL)-based approaches are increasingly being used in all stages of drug development as DL technology advances and drug-related data grows. Therefore, this paper presents a systematic literature review (SLR) that integrates recent DL technologies and applications in drug discovery, including drug-target interactions (DTIs), drug-drug similarity interactions (DDIs), drug sensitivity and responsiveness, and drug-side effect predictions. We present a review of more than 300 articles published between 2000 and 2022. The benchmark data sets, the databases, and the evaluation measures are also presented. In addition, this paper provides an overview of how explainable AI (XAI) supports drug discovery problems. Drug dosing optimization and success stories are discussed as well. Finally, digital twinning (DT) and open issues are suggested as future research challenges for drug discovery problems. Challenges to be addressed and future research directions are identified, and an extensive bibliography is included.
Affiliation(s)
- Heba Askr
  - Faculty of Computers and Artificial Intelligence, University of Sadat City, Sadat City, Egypt
- Enas Elgeldawi
  - Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Heba Aboul Ella
  - Faculty of Pharmacy and Drug Technology, Chinese University in Egypt (CUE), Cairo, Egypt
- Mamdouh M. Gomaa
  - Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Aboul Ella Hassanien
  - Faculty of Computers and Artificial Intelligence, Cairo University, Cairo, Egypt
7. Lu L, Qin J, Chen J, Wu H, Zhao Q, Miyano S, Zhang Y, Yu H, Li C. DDIT: An Online Predictor for Multiple Clinical Phenotypic Drug-Disease Associations. Front Pharmacol 2022; 12:772026. [PMID: 35126114 PMCID: PMC8809407 DOI: 10.3389/fphar.2021.772026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/07/2021] [Accepted: 11/19/2021] [Indexed: 12/20/2022] Open
Abstract
Background: Drug repurposing provides an effective method for high-speed, low-risk drug development. Clinical phenotype-based screening has outperformed target-based approaches in discovering first-in-class small-molecule drugs. However, most of these approaches predict only binary phenotypic associations between drugs and diseases; the types of drugs and diseases have not been well exploited. Principally, the clinical phenotypes of a known drug can be divided into indications (Is), side effects (SEs), and contraindications (CIs). Incorporating these different clinical phenotypes of drug–disease associations (DDAs) can improve the prediction accuracy of the DDAs.
Methods: We develop Drug Disease Interaction Type (DDIT), a user-friendly online predictor that supports drug repositioning from known Is, SEs, and CIs submitted for a target drug of interest. The datasets for Is, SEs, and CIs were extracted from PREDICT, SIDER, and MED-RT, respectively. To unify the names of the drugs and diseases, we mapped their names to the Unified Medical Language System (UMLS) ontology using the REST API. We then integrated multiple clinical phenotypes into a conditional restricted Boltzmann machine (RBM), enabling the identification of different phenotypes of drug–disease associations, including the prediction of as yet unknown DDAs in the input.
Results: By 10-fold cross-validation, we demonstrate that DDIT can effectively capture the latent features of the drug–disease association network and achieves improvements of more than 0.217 in AUC and more than 0.072 in AUPR for predicting the clinical phenotypes of DDAs, compared with the classic K-nearest neighbors method (KNN, including drug-based KNN and disease-based KNN), Random Forest, and XGBoost. In leave-one-drug-class-out cross-validation, DDIT improved AUC by 0.135 and AUPR by 0.075 compared to any of the other four methods. Within the top 10 predicted indications, side effects, and contraindications, 7/10, 9/10, and 9/10, respectively, matched known drug–disease associations. Overall, DDIT is a useful tool for predicting multiple clinical phenotypic types of drug–disease associations.
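The leave-one-drug-class-out evaluation mentioned in the Results can be sketched as a grouped split in which every drug of one class is held out together. This shows the splitting idea only, not DDIT's actual pipeline; the function name and the class labels are assumptions made for illustration.

```python
from collections import defaultdict


def leave_one_class_out(classes):
    """Yield (held_out_class, train_indices, test_indices) for each distinct class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(classes):
        by_class[label].append(idx)
    for label in sorted(by_class):
        test = by_class[label]
        # Every sample from any other class goes into the training fold.
        train = [i for i, other in enumerate(classes) if other != label]
        yield label, train, test


# Hypothetical class label for each of five drugs.
drug_classes = ["statin", "statin", "nsaid", "opioid", "nsaid"]
folds = list(leave_one_class_out(drug_classes))
```

Unlike a random 10-fold split, no drug in a test fold shares a class with any training drug, which makes the evaluation harder.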
Affiliation(s)
- Lu Lu
  - Department of Human Genetics, Department of Ultrasound and Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Jiale Qin
  - Department of Human Genetics, Department of Ultrasound and Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Hangzhou, China
- Jiandong Chen
  - School of Public Health, Undergraduate School of Zhejiang University, Hangzhou, China
- Hao Wu
  - Department of Human Genetics, Department of Ultrasound and Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Qiang Zhao
  - Department of Human Genetics, Department of Ultrasound and Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Satoru Miyano
  - M&D Data Science Center, Tokyo Medical and Dental University, Tokyo, Japan
- Yaozhong Zhang
  - The Institute of Medical Science, the University of Tokyo, Tokyo, Japan
- Hua Yu
  - Department of Basic Medical Sciences, Zhejiang University School of Medicine, Hangzhou, China
- Chen Li
  - Department of Human Genetics, Department of Ultrasound and Women’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Hangzhou, China
  - Department of Basic Medical Sciences, Zhejiang University School of Medicine, Hangzhou, China
*Correspondence: Yaozhong Zhang; Hua Yu; Chen Li