1
Chen D, Zhang T, Cui H, Gu J, Xuan P. KNDM: A Knowledge Graph Transformer and Node Category Sensitive Contrastive Learning Model for Drug and Microbe Association Prediction. J Chem Inf Model 2025; 65:4714-4728. [PMID: 40267287 DOI: 10.1021/acs.jcim.5c00186] [Indexed: 04/25/2025]
Abstract
It has been proven that the microbiome in human bodies can promote or inhibit the treatment effects of the drugs by affecting their toxicities and activities. Therefore, identifying drug-related microbes helps in understanding how drugs exert their functions under the influence of these microbes. Most recent methods for drug-related microbe prediction are developed based on graph learning. However, those methods fail to fully utilize the diverse characteristics of drug and microbe entities from the perspective of a knowledge graph, as well as the contextual relationships among multiple meta-paths from the meta-path perspective. Moreover, previous methods overlook the consistency between the entity features derived from the knowledge graph and the node semantic features extracted from the meta-paths. To address these limitations, we propose a knowledge-graph transformer and node category-sensitive contrastive learning-based drug and microbe association prediction model (KNDM). This model learns the diverse features of drug and microbe entities, encodes the contextual relationships across multiple meta-paths, and integrates the feature consistency. First, we construct a knowledge graph consisting of drug and microbe entities, which aids in revealing similarities and associations between any two entities. Second, considering the heterogeneity of entities in the knowledge graph, we propose an entity category-sensitive transformer to integrate the diversity of multiple entity types and the various relationships among them. Third, multiple meta-paths are constructed to capture and embed the semantic relationships based on similarities and associations among drug and microbe nodes. A meta-path semantic feature learning strategy with recursive gating is proposed to capture specific semantic features of individual meta-paths while fusing contextual relationships among multiple meta-paths. 
Finally, we develop a node-category-sensitive contrastive learning strategy to enhance the consistency between entity features and node semantic features. Extensive experiments demonstrate that KNDM outperforms eight state-of-the-art drug-microbe association prediction models, while ablation studies validate the effectiveness of its key innovations. Additionally, case studies on candidate microbes associated with three drugs (curcumin, epigallocatechin gallate, and ciprofloxacin) further showcase KNDM's capability to identify potential drug-microbe associations.
Affiliation(s)
- Dongliang Chen
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Tiangang Zhang
  - School of Cyberspace Security, Hainan University, Haikou 570228, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, Victoria 3083, Australia
- Jing Gu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Ping Xuan
  - School of Cyberspace Security, Hainan University, Haikou 570228, China
2
Wan Z, Sun X, Li Y, Chu T, Hao X, Cao Y, Zhang P. Applications of Artificial Intelligence in Drug Repurposing. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2025; 12:e2411325. [PMID: 40047357 PMCID: PMC11984889 DOI: 10.1002/advs.202411325] [Received: 09/15/2024] [Revised: 12/12/2024] [Indexed: 04/12/2025]
Abstract
Drug repurposing identifies new therapeutic uses for existing drugs originally developed for different indications, capitalizing on the established safety and efficacy profiles of known drugs. It thereby bypasses the early stages of drug development and reduces the time and cost of bringing new therapies to market. Traditional experimental methods are often time-consuming and expensive, making artificial intelligence (AI) a promising alternative due to its lower cost, computational advantages, and ability to uncover hidden patterns. This review focuses on the availability of AI algorithms in drug development and their specific roles in revealing repurposing opportunities for existing drugs, especially when integrated with virtual screening. Existing AI algorithms are shown to excel at analyzing large-scale datasets, identifying complicated patterns of drug responses in these datasets, and making predictions for potential drug repurposing. Challenges remain in developing efficient AI algorithms, and directions for future research include integrating drug-related data across databases for better repurposing, enhancing AI computational efficiency, and advancing personalized medicine.
Affiliation(s)
- Zhaoman Wan
  - State Key Laboratory of Common Mechanism Research for Major Diseases, Suzhou Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Suzhou, Jiangsu 215123, China
- Xinran Sun
  - Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
- Yi Li
  - Hunan Agriculture University College of Plant Protection, Changsha, Hunan 410128, China
- Tianyao Chu
  - Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, MOE Key Laboratory of Major Diseases in Children, Rare Disease Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China
- Xueyu Hao
  - Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, MOE Key Laboratory of Major Diseases in Children, Rare Disease Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China
- Yang Cao
  - College of Life Sciences, Sichuan University, Chengdu, Sichuan 610041, China
- Peng Zhang
  - Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, MOE Key Laboratory of Major Diseases in Children, Rare Disease Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China
3
Li Y, Zhao H, Wang J. MPEMDA: A multi-similarity integration approach with pre-completion and error correction for predicting microbe-drug associations. Methods 2025; 235:1-9. [PMID: 39863140 DOI: 10.1016/j.ymeth.2024.12.013] [Received: 10/29/2024] [Revised: 12/14/2024] [Accepted: 12/25/2024] [Indexed: 01/27/2025]
Abstract
Exploring the associations between microbes and drugs offers valuable insights into their underlying mechanisms. Traditional wet lab experiments, while reliable, are often time-consuming and labor-intensive, making computational approaches an attractive alternative. Existing similarity-based machine learning models for predicting microbe-drug associations typically rely on integrated similarities as input, neglecting the unique contributions of individual similarities, which can compromise predictive accuracy. To overcome these limitations, we develop MPEMDA, a novel method that pre-completes the microbe-drug association matrix using various similarity combinations and employs a label propagation algorithm with error correction to predict microbe-drug associations. Compared with existing methods, MPEMDA simultaneously utilizes the integrated and individual similarities obtained through the Similarity Network Fusion (SNF) method to pre-complete the known drug-microbe association matrix, followed by error correction to optimize the predictive scores generated by the label propagation algorithm. Experimental results on three benchmark datasets show that MPEMDA outperforms state-of-the-art methods in both the 5-fold cross-validation and de novo test. Additionally, case studies on drugs and microbes highlight the method's strong potential to identify novel microbe-drug associations. The MPEMDA code is available at https://github.com/lyx8527/MPEMDA.
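The label propagation step named in the abstract can be illustrated with a small sketch. This is not the MPEMDA implementation (see the authors' repository for that): it assumes the common update rule F <- alpha * S * F + (1 - alpha) * Y over a row-normalized similarity matrix, and it omits the pre-completion and error-correction stages. The function name, the `alpha` value, and the toy matrices are illustrative assumptions.

```python
def label_propagation(similarity, labels, alpha=0.5, iterations=100):
    """Iterate F <- alpha * S_norm * F + (1 - alpha) * Y on lists of rows."""
    n = len(similarity)
    m = len(labels[0])
    # Row-normalize the similarity matrix so that each row sums to 1.
    norm = []
    for row in similarity:
        total = sum(row)
        norm.append([v / total if total else 0.0 for v in row])
    scores = [row[:] for row in labels]
    for _ in range(iterations):
        scores = [
            [
                alpha * sum(norm[i][k] * scores[k][j] for k in range(n))
                + (1 - alpha) * labels[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return scores


# Toy data (hypothetical): 3 drugs, 2 microbes.
drug_similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
known = [
    [1.0, 0.0],  # drug 0 is associated with microbe 0
    [0.0, 0.0],  # drug 1 has no known association
    [0.0, 1.0],  # drug 2 is associated with microbe 1
]
scores = label_propagation(drug_similarity, known)
```

In this toy case drug 1 has no known association, so its scores come entirely from its neighbors; because it is more similar to drug 0 than to drug 2, microbe 0 ends up ranked above microbe 1 for it.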
Affiliation(s)
- Yuxiang Li
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Haochen Zhao
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
4
Liang M, Liu X, Li J, Chen Q, Zeng B, Wang Z, Li J, Wang L. BANNMDA: a computational model for predicting potential microbe-drug associations based on bilinear attention networks and nuclear norm minimization. Front Microbiol 2025; 15:1497886. [PMID: 39911712 PMCID: PMC11794793 DOI: 10.3389/fmicb.2024.1497886] [Received: 09/18/2024] [Accepted: 12/31/2024] [Indexed: 02/07/2025]
Abstract
Introduction: Predicting potential associations between microbes and drugs is crucial for advancing pharmaceutical research and development. In this manuscript, we introduced a computational model named BANNMDA that integrates Bilinear Attention Networks (BAN) with Nuclear Norm Minimization (NNM) to uncover hidden connections between microbes and drugs. Methods: In BANNMDA, we first constructed a heterogeneous microbe-drug network by combining multiple drug and microbe similarity metrics with known microbe-drug relationships. We then applied both BAN and NNM to compute predicted scores of potential microbe-drug associations. Finally, we used a 5-fold cross-validation framework to evaluate the prediction performance of BANNMDA. Results and discussion: The experimental results indicated that BANNMDA outperformed state-of-the-art competing methods. To further evaluate BANNMDA, we conducted case studies on the well-known drugs amoxicillin and ceftazidime and on the pathogens Bacillus cereus and Influenza A virus; 9 of the top 10 predicted drugs, along with 8 and 9 of the top 10 predicted microbes, were corroborated by the relevant literature. These findings underscored the capability of BANNMDA to achieve commendable predictive accuracy.
Affiliation(s)
- Mingmin Liang
  - School of Intelligent Equipment, Hunan Vocational College of Electronic and Technology, Changsha, China
- Xianzhi Liu
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, China
- Juncai Li
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, China
- Qijia Chen
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, China
- Bin Zeng
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, China
- Zhong Wang
  - School of Humanities and Education, Hunan Vocational College of Electronic and Technology, Changsha, China
- Jing Li
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, China
- Lei Wang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China
5
Liang J, Sun Y, Ling J. GRL-PUL: predicting microbe-drug association based on graph representation learning and positive unlabeled learning. Mol Omics 2025; 21:38-50. [PMID: 39540771 DOI: 10.1039/d4mo00117f] [Indexed: 11/16/2024]
Abstract
Extensive research has confirmed the widespread presence of microorganisms in the human body and their crucial impact on human health, with drugs being an effective method of regulation. Hence it is essential to identify potential microbe-drug associations (MDAs). Owing to the limitations of wet experiments, such as high costs and long durations, computational methods for binary classification tasks have become valuable alternatives to traditional experimental approaches. Since validated negative MDAs are absent in existing datasets, most methods randomly sample negatives from unlabeled data, which evidently leads to false negative issues. In this manuscript, we propose a novel model based on graph representation learning and positive-unlabeled learning (GRL-PUL) to infer potential MDAs. First, we screen reliable negative samples by applying weighted matrix factorization and the PU-bagging strategy to the known microbe-drug bipartite network. Then, we combine multi-model attributes and construct a microbe-drug heterogeneous network. After that, a graph attention auto-encoder module, an encoder combining graph convolutional networks and graph attention networks, is introduced to extract informative embeddings based on the microbe-drug heterogeneous network. Lastly, we adopt a modified random forest as the final classifier. Comparison experiments with five baseline models on three benchmark datasets show that our model surpasses other methods in terms of AUC, AUPR, ACC, F1-score and MCC. Moreover, several case studies show that GRL-PUL can predict latent MDAs. Notably, we further verify the effectiveness of the reliable negative sample selection module by migrating it to other state-of-the-art models, and the experimental results demonstrate its ability to substantially improve their prediction performance.
Affiliation(s)
- Jinqing Liang
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
- Yuping Sun
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
- Jie Ling
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
6
Ye S, Wang J, Zhu M, Yuan S, Zhuo L, Chen T, Gao J. MKAN-MMI: empowering traditional medicine-microbe interaction prediction with masked graph autoencoders and KANs. Front Pharmacol 2024; 15:1484639. [PMID: 39512819 PMCID: PMC11540998 DOI: 10.3389/fphar.2024.1484639] [Received: 08/22/2024] [Accepted: 10/08/2024] [Indexed: 11/15/2024]
Abstract
The growing microbial resistance to traditional medicines necessitates in-depth analysis of medicine-microbe interactions (MMIs) to develop new therapeutic strategies. Widely used artificial intelligence models are limited by sparse observational data and prevalent noise, leading to over-reliance on specific data for feature extraction and reduced generalization ability. To address these limitations, we integrate Kolmogorov-Arnold Networks (KANs), independent subspaces, and collaborative decoding techniques into the masked graph autoencoder (Mask GAE) framework, creating an innovative MMI prediction model with enhanced accuracy, generalization, and interpretability. First, we apply Bernoulli distribution to randomly mask parts of the medicine-microbe graph, advancing self-supervised training and reducing noise impact. Additionally, the independent subspace technique enables graph neural networks (GNNs) to learn weights independently across different feature subspaces, enhancing feature expression. Fusing the multi-layer outputs of GNNs effectively reduces information loss caused by masking. Moreover, using KANs for advanced nonlinear mapping enhances the learnability and interpretability of weights, deepening the understanding of complex MMIs. These measures significantly enhanced the accuracy, generalization, and interpretability of our model in MMI prediction tasks. We validated our model on three public datasets with results showing that our model outperformed existing leading models. The relevant data and code are publicly accessible at: https://github.com/zhuoninnin1992/MKAN-MMI.
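The Bernoulli masking step described in the abstract can be sketched as follows. This is only an illustration of the general idea, in which each edge of the medicine-microbe graph is hidden independently with a fixed probability and the hidden edges become reconstruction targets; it is not taken from the MKAN-MMI code. The function name, the `mask_prob` value, and the toy edge list are assumptions.

```python
import random


def bernoulli_mask_edges(edges, mask_prob, seed=0):
    """Split edges into (visible, masked) using independent Bernoulli draws."""
    rng = random.Random(seed)
    visible, masked = [], []
    for edge in edges:
        # Each edge is masked independently with probability mask_prob.
        if rng.random() < mask_prob:
            masked.append(edge)
        else:
            visible.append(edge)
    return visible, masked


# Toy medicine-microbe graph (hypothetical node names).
edges = [("medicine_%d" % i, "microbe_%d" % j) for i in range(4) for j in range(3)]
visible, masked = bernoulli_mask_edges(edges, mask_prob=0.3, seed=42)
```

In a masked graph autoencoder, the encoder would see only `visible`, while `masked` supplies the self-supervised targets for the decoder.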
Affiliation(s)
- Sheng Ye
  - The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China
  - School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Jue Wang
  - Department of Clinical Laboratory, Shandong Provincial Third Hospital, Shandong University, Jinan, Shandong, China
- Mingmin Zhu
  - School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Sisi Yuan
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- Linlin Zhuo
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, Zhejiang, China
- Tiancong Chen
  - Department of Rehabilitation, The Wenzhou Third Clinical Institute Affiliated to Wenzhou Medical University, Wenzhou People's Hospital, Wenzhou Maternal and Child Health Care Hospital, Wenzhou, Zhejiang, China
- Jinjian Gao
  - The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China
7
Xuan P, Guan C, Chen S, Gu J, Wang X, Nakaguchi T, Zhang T. Gating-Enhanced Hierarchical Structure Learning in Hyperbolic Space and Multi-scale Neighbor Topology Learning in Euclidean Space for Prediction of Microbe-Drug Associations. J Chem Inf Model 2024; 64:7806-7815. [PMID: 39324410 DOI: 10.1021/acs.jcim.4c01340] [Indexed: 09/27/2024]
Abstract
Identifying drug-related microbes may help us explore how the microbes affect the functions of drugs by promoting or inhibiting their effects. Most previous methods for the prediction of microbe-drug associations focused on integrating the attributes and topologies of microbe and drug nodes in Euclidean space. The heterogeneous network composed of microbes and drugs has a hierarchical structure, and the hyperbolic space is helpful for reflecting the structure. However, the previous methods did not fully exploit the structure. We propose a multi-space feature learning enhanced microbe-drug association prediction method, MFLP, to fuse the hierarchical structure of microbe and drug nodes in hyperbolic space and the multiscale neighbor topologies in Euclidean space. First, we project the nodes of the microbe-drug heterogeneous network on the sphere in hyperbolic space and then construct a topology which implies hierarchical structure and forms a hierarchical attribute embedding. The node information from multiple types of neighbor nodes with the new topological structure in the tangent plane space of a sphere is aggregated by the designed gating-enhanced hyperbolic graph neural network. Second, the gate at the node feature level is constructed to adaptively fuse the hierarchical features of microbe and drug nodes from two adjacent graph neural encoding layers. Third, multiple neighbor topological embeddings for each microbe and drug node are formed by neighborhood random walks on the microbe-drug heterogeneous network, and they cover neighborhood topologies with multiple scales, respectively. Finally, as each scale of topological embedding contains its specific neighborhood topology, we establish an independent graph convolutional neural network for the topology and form the topological representations of microbe and drug nodes in Euclidean space. 
The comparison experiments based on cross-validation showed that MFLP outperformed several advanced prediction methods, and the ablation experiments verified the effectiveness of MFLP's major innovations. The case studies on three drugs further demonstrated MFLP's ability to discover potential candidate microbes for the given drugs.
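One way to picture the multi-scale neighbor topologies mentioned in the abstract is through k-step random walk transition probabilities. The sketch below makes that assumption explicitly: it treats the scale-k embedding of a node as its row in the k-th power of the row-normalized adjacency matrix. The abstract does not give the exact walk formulation used by MFLP, so the function names, the chosen scales, and the toy network are all hypothetical.

```python
def row_normalize(adjacency):
    """Turn an adjacency matrix into one-step transition probabilities."""
    normalized = []
    for row in adjacency:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def multiscale_walk_features(adjacency, scales=(1, 2, 3)):
    """Return {k: k-step transition matrix} for each requested scale k."""
    transition = row_normalize(adjacency)
    current = transition
    features = {}
    for step in range(1, max(scales) + 1):
        if step > 1:
            current = matmul(current, transition)
        if step in scales:
            features[step] = current
    return features


# Toy heterogeneous network (hypothetical): nodes 0-1 are microbes, 2-3 are drugs.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
walks = multiscale_walk_features(adjacency)
```

Row i of `walks[k]` describes which nodes a walker starting at node i can reach in exactly k steps, so larger k covers a wider neighborhood. In a design like the one the abstract describes, each scale would then be passed to its own graph convolutional encoder.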
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Chunhong Guan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Sentao Chen
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Jing Gu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Xiuju Wang
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
8
Xuan P, Xu Z, Cui H, Gu J, Liu C, Zhang T, Wu P. Dynamic category-sensitive hypergraph inferring and homo-heterogeneous neighbor feature learning for drug-related microbe prediction. Bioinformatics (Oxford, England) 2024; 40:btae562. [PMID: 39292557 PMCID: PMC11441325 DOI: 10.1093/bioinformatics/btae562] [Received: 07/09/2024] [Revised: 08/29/2024] [Accepted: 09/17/2024] [Indexed: 09/20/2024]
Abstract
MOTIVATION: The microbes in the human body play a crucial role in influencing the functions of drugs, as they can regulate the activities and toxicities of drugs. Most recent methods for predicting drug-microbe associations are based on graph learning. However, the relationships among multiple drugs and microbes are complex, diverse, and heterogeneous, and existing methods often fail to fully model them. In addition, the attributes of drug-microbe pairs exhibit long-distance spatial correlations, which previous methods have not integrated effectively. RESULTS: We propose a new prediction method named DHDMP, which is designed to encode the relationships among multiple drugs and microbes and to integrate the attributes of various neighbor nodes along with the pairwise long-distance correlations. First, we construct a hypergraph with dynamic topology, where each hyperedge represents a specific relationship among multiple drug nodes and microbe nodes. Considering the heterogeneity of node attributes across different categories, we develop a node category-sensitive hypergraph convolution network to encode these diverse relationships. Second, we construct homogeneous graphs for drugs and microbes respectively, as well as a drug-microbe heterogeneous graph, facilitating the integration of features from both homogeneous and heterogeneous neighbors of each target node. Third, we introduce a graph convolutional network with cross-graph feature propagation ability to transfer node features from homogeneous to heterogeneous graphs for enhanced neighbor feature representation learning. The propagation strategy aids in the deep fusion of features from both types of neighbors. Finally, we design spatial cross-attention to encode the attributes of drug-microbe pairs, revealing long-distance correlations among multiple pairwise attribute patches. The comprehensive comparison experiments showed that our method outperformed state-of-the-art methods for drug-microbe association prediction. 
The ablation studies demonstrated the effectiveness of the node category-sensitive hypergraph convolution network, the graph convolutional network with cross-graph feature propagation, and spatial cross-attention. Case studies on three drugs further showed DHDMP's potential for discovering reliable candidate microbes for drugs of interest. AVAILABILITY AND IMPLEMENTATION: Source code and supplementary materials are available at https://github.com/pingxuan-hlju/DHDMP.
Affiliation(s)
- Ping Xuan
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Zelong Xu
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, VIC 3083, Australia
  - Australian Centre for AI in Medical Innovation, La Trobe University, Melbourne 3083, Australia
- Jing Gu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Cheng Liu
  - Department of Computer Science and Technology, Shantou University, Shantou 515063, China
- Tiangang Zhang
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
- Peiliang Wu
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
9
Meng Z, Liu S, Liang S, Jani B, Meng Z. Heterogeneous biomedical entity representation learning for gene-disease association prediction. Brief Bioinform 2024; 25:bbae380. [PMID: 39154194 PMCID: PMC11330343 DOI: 10.1093/bib/bbae380] [Received: 01/31/2024] [Revised: 05/29/2024] [Accepted: 07/22/2024] [Indexed: 08/19/2024]
Abstract
Understanding the genetic basis of disease is a fundamental aspect of medical research, as genes are the classic units of heredity and play a crucial role in biological function. Identifying associations between genes and diseases is critical for diagnosis, prevention, prognosis, and drug development. Genes that encode proteins with similar sequences are often implicated in related diseases, as proteins causing identical or similar diseases tend to show limited variation in their sequences. Predicting gene-disease association (GDA) requires time-consuming and expensive experiments on a large number of potential candidate genes. Although methods have been proposed to predict associations between genes and diseases using traditional machine learning algorithms and graph neural networks, these approaches struggle to capture the deep semantic information within the genes and diseases and are dependent on training data. To alleviate this issue, we propose a novel GDA prediction model named FusionGDA, which utilizes a pre-training phase with a fusion module to enrich the gene and disease semantic representations encoded by pre-trained language models. Multi-modal representations are generated by the fusion module, which includes rich semantic information about two heterogeneous biomedical entities: protein sequences and disease descriptions. Subsequently, the pooling aggregation strategy is adopted to compress the dimensions of the multi-modal representation. In addition, FusionGDA employs a pre-training phase leveraging a contrastive learning loss to extract potential gene and disease features by training on a large public GDA dataset. To rigorously evaluate the effectiveness of the FusionGDA model, we conduct comprehensive experiments on five datasets and compare our proposed model with five competitive baseline models on the DisGeNet-Eval dataset. Notably, our case study further demonstrates the ability of FusionGDA to discover hidden associations effectively. 
The complete code and datasets of our experiments are available at https://github.com/ZhaohanM/FusionGDA.
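The abstract mentions a contrastive learning loss without stating its form. As a hedged illustration, the sketch below uses an InfoNCE-style loss, a common choice for pairing two kinds of embeddings, under the assumption that gene i and disease i form the positive pair and every other disease in the batch acts as a negative. It is not the FusionGDA loss; the function names, the temperature, and the toy vectors are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def info_nce(gene_vecs, disease_vecs, temperature=0.1):
    """Mean InfoNCE loss where gene i and disease i form the positive pair."""
    n = len(gene_vecs)
    total = 0.0
    for i in range(n):
        logits = [cosine(gene_vecs[i], disease_vecs[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings (hypothetical): two genes and two diseases in two dimensions.
genes = [[1.0, 0.0], [0.0, 1.0]]
aligned_diseases = [[1.0, 0.0], [0.0, 1.0]]
shuffled_diseases = [[0.0, 1.0], [1.0, 0.0]]
aligned_loss = info_nce(genes, aligned_diseases)
shuffled_loss = info_nce(genes, shuffled_diseases)
```

The loss is small when each gene embedding sits closest to its own disease embedding and grows when the pairing is scrambled, which is the behavior a contrastive pre-training objective relies on.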
Affiliation(s)
- Zhaohan Meng
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
- Siwei Liu
  - School of Natural and Computing Science, University of Aberdeen King’s College, Aberdeen, AB24 3FX, UK
- Shangsong Liang
  - Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence, Building 1B, Masdar City, Abu Dhabi 000000, UAE
- Bhautesh Jani
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
- Zaiqiao Meng
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
10
Zhao J, Kuang L, Hu A, Zhang Q, Yang D, Wang C. OGNNMDA: a computational model for microbe-drug association prediction based on ordered message-passing graph neural networks. Front Genet 2024; 15:1370013. [PMID: 38689654 PMCID: PMC11058190 DOI: 10.3389/fgene.2024.1370013] [Received: 01/13/2024] [Accepted: 03/14/2024] [Indexed: 05/02/2024]
Abstract
In recent years, many excellent computational models have emerged in microbe-drug association prediction, but their performance still has room for improvement. This paper proposed the OGNNMDA framework, which applied an ordered message-passing mechanism to distinguish the different neighbor information in each message propagation layer and achieved a better embedding ability through deeper network layers. First, the method calculates four similarity matrices based on microbe functional similarity, drug chemical structure similarity, and their respective Gaussian interaction profile kernel similarities. After integrating these similarity matrices, it concatenates the integrated similarity matrix with the known association matrix to obtain the microbe-drug heterogeneous matrix. Second, it uses a multi-layer ordered message-passing graph neural network encoder to encode the heterogeneous network and the adjacency matrix of known associations, thereby obtaining the final embedding features of microbes and drugs. Finally, it feeds the embedding features into a bilinear decoder to obtain the final prediction results. OGNNMDA was evaluated through comparative experiments, ablation experiments, and case studies on the aBiofilm, MDAD and DrugVirus datasets using 5-fold cross-validation. The experimental results showed that OGNNMDA achieved the strongest prediction performance on aBiofilm and MDAD and obtained sub-optimal results on DrugVirus. In addition, the case studies on well-known drugs and microbes also support the effectiveness of the OGNNMDA method. Source codes and data are available at: https://github.com/yyzg/OGNNMDA.
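The Gaussian interaction profile (GIP) kernel similarity named in the abstract is usually computed from the rows of the known association matrix. The sketch below uses the commonly cited form K(i, j) = exp(-gamma * ||A_i - A_j||^2), with gamma set to a bandwidth parameter divided by the mean squared norm of the profiles. It is an illustration under that assumption rather than the OGNNMDA code, and the function name, `gamma_prime`, and the toy matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of an association matrix."""
    n = len(profiles)
    # Normalize the bandwidth by the mean squared norm of the interaction profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    kernel = []
    for i in range(n):
        row = []
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            row.append(math.exp(-gamma * sq_dist))
        kernel.append(row)
    return kernel


# Toy association matrix (hypothetical): rows are microbes, columns are drugs.
association = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
microbe_similarity = gip_kernel(association)
```

Applying the same function to the transposed matrix would give the drug-side kernel. Microbes 0 and 1 share a drug, so they come out more similar to each other than microbes 0 and 2, which share none.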
Affiliation(s)
- Jiabao Zhao
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Linai Kuang
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- An Hu
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Qi Zhang
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Dinghai Yang
  - School of Computer Science and School of Cyberspace Science, Xiangtan University, Xiangtan, China
- Chunxiang Wang
  - Hunan Institute of Engineering, College of Textile and Clothing, Xiangtan, China
11
|
Tian Z, Han C, Xu L, Teng Z, Song W. MGCNSS: miRNA-disease association prediction with multi-layer graph convolution and distance-based negative sample selection strategy. Brief Bioinform 2024; 25:bbae168. [PMID: 38622356 PMCID: PMC11018511 DOI: 10.1093/bib/bbae168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 03/14/2024] [Accepted: 03/31/2024] [Indexed: 04/17/2024] Open
Abstract
Identifying disease-associated microRNAs (miRNAs) could help understand the deep mechanisms of diseases, which promotes the development of new medicines. Recently, network-based approaches have been widely proposed for inferring the potential associations between miRNAs and diseases. However, these approaches ignore the importance of different relations in meta-paths when learning the embeddings of miRNAs and diseases. Besides, they pay little attention to screening out reliable negative samples, which is crucial for improving prediction accuracy. In this study, we propose a novel approach named MGCNSS that combines multi-layer graph convolution with a high-quality negative sample selection strategy. Specifically, MGCNSS first constructs a comprehensive heterogeneous network by integrating miRNA and disease similarity networks with their known association relationships. Then, we employ multi-layer graph convolution to automatically capture meta-path relations of different lengths in the heterogeneous network and learn discriminative representations of miRNAs and diseases. After that, MGCNSS establishes a highly reliable negative sample set from the unlabeled sample set with the distance-based negative sample selection strategy. Finally, we train MGCNSS in an unsupervised manner and predict the potential associations between miRNAs and diseases. The experimental results demonstrate that MGCNSS outperforms all baseline methods on both balanced and imbalanced datasets. More importantly, we conduct case studies on colon neoplasms and esophageal neoplasms, further confirming the ability of MGCNSS to detect potential candidate miRNAs. The source code is publicly available on GitHub at https://github.com/15136943622/MGCNSS/tree/master.
Affiliation(s)
- Zhen Tian
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Chenguang Han
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Lewen Xu
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhixia Teng
  - College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
- Wei Song
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
|
12
|
Kuang H, Zhang Z, Zeng B, Liu X, Zuo H, Xu X, Wang L. A novel microbe-drug association prediction model based on graph attention networks and bilayer random forest. BMC Bioinformatics 2024; 25:78. [PMID: 38378437 PMCID: PMC10877932 DOI: 10.1186/s12859-024-05687-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 01/31/2024] [Indexed: 02/22/2024] Open
Abstract
BACKGROUND In recent years, the extensive use of drugs and antibiotics has led to increasing microbial resistance. It has therefore become crucial to explore the deep connections between drugs and microbes. However, traditional biological experiments are very expensive and time-consuming, so it is meaningful to develop efficient computational models to forecast potential microbe-drug associations. RESULTS In this manuscript, we propose a novel prediction model called GARFMDA, which combines graph attention networks and a bilayer random forest to infer probable microbe-drug correlations. In GARFMDA, we first construct two different microbe-drug networks by integrating different microbe-drug-disease correlation indices. Then, based on multiple measures of similarity, we construct a unique feature matrix for drugs and for microbes. Next, we feed these newly obtained microbe-drug networks, together with the feature matrices, into the graph attention network to extract low-dimensional feature representations for drugs and microbes separately. Thereafter, these low-dimensional feature representations, along with the feature matrices, are fed into the first layer of the bilayer random forest model to obtain the contribution values of all features. After removing features with low contribution values, these contribution values are fed into the second layer of the bilayer random forest to detect potential links between microbes and drugs. CONCLUSIONS Experimental results and case studies show that GARFMDA achieves better prediction performance than state-of-the-art approaches, which means that GARFMDA may be a useful tool in the field of microbe-drug association prediction in the future. The source code of GARFMDA is available at https://github.com/KuangHaiYue/GARFMDA.git.
Affiliation(s)
- Haiyue Kuang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Zhen Zhang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Bin Zeng
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Xin Liu
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Hao Zuo
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Xingye Xu
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
- Lei Wang
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
|
13
|
Xuan P, Gu J, Cui H, Wang S, Toshiya N, Liu C, Zhang T. Multi-scale topology and position feature learning and relationship-aware graph reasoning for prediction of drug-related microbes. Bioinformatics 2024; 40:btae025. [PMID: 38269610 PMCID: PMC10868329 DOI: 10.1093/bioinformatics/btae025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 12/26/2023] [Accepted: 01/22/2024] [Indexed: 01/26/2024] Open
Abstract
MOTIVATION The human microbiome may impact the effectiveness of drugs by modulating their activities and toxicities. Predicting candidate microbes for drugs can facilitate the exploration of the therapeutic effects of drugs. Most recent methods concentrate on constructing prediction models based on graph reasoning. They fail to sufficiently exploit the topology and position information, the heterogeneity of multiple types of nodes and connections, and the long-distance correlations among nodes in the microbe-drug heterogeneous graph. RESULTS We propose a new microbe-drug association prediction model, NGMDA, to encode the position and topological features of microbe (drug) nodes and to fuse the different types of features from neighbors and from the whole heterogeneous graph. First, we formulate the position and topology features of microbe (drug) nodes by t-step random walks; these features reveal the topological neighborhoods at multiple scales and the position of each node. Second, as the features of nodes are high-dimensional and sparse, we design an embedding enhancement strategy based on supervised fully connected autoencoders to form embeddings with representative features and more discriminative node distributions. Third, we propose an adaptive neighbor feature fusion module, which fuses the features of neighbors using the constructed position- and topology-sensitive heterogeneous graph neural networks. A novel self-attention mechanism is developed to estimate the importance of the position and topology of each neighbor to a target node. Finally, a heterogeneous graph feature fusion module is constructed to learn the long-distance correlations among the nodes in the whole heterogeneous graph with a relationship-aware graph transformer. The relationship-aware graph transformer includes a strategy for encoding the connection relationship types among the nodes, which helps integrate the diverse semantics of these connections.
Extensive comparison experiments demonstrate NGMDA's superior performance over five state-of-the-art prediction methods. The ablation experiment shows the contributions of the multi-scale topology and position feature learning, the embedding enhancement strategy, the neighbor feature fusion, and the heterogeneous graph feature fusion. Case studies on three drugs further indicate that NGMDA is able to discover potential drug-related microbes. AVAILABILITY AND IMPLEMENTATION Source codes and Supplementary Material are available at https://github.com/pingxuan-hlju/NGMDA.
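The idea of multi-scale features from t-step random walks can be sketched as follows. The abstract does not give NGMDA's exact construction, so this sketch assumes one plausible reading: a node's feature vector is the concatenation of its rows in the transition-matrix powers P^1 to P^t. The names `random_walk_features`, `row_normalize` and `matmul`, and the toy graph, are hypothetical.

```python
def row_normalize(adj):
    """Turn an adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def random_walk_features(adj, max_steps):
    """Concatenate each node's visiting probabilities after 1..max_steps steps."""
    transition = row_normalize(adj)
    current = transition
    scales = [current]
    for _ in range(max_steps - 1):
        current = matmul(current, transition)  # P^(t+1) = P^t * P
        scales.append(current)
    return [sum((scale[i] for scale in scales), []) for i in range(len(adj))]


# Toy path graph 0 - 1 - 2 (made-up data, not from the paper).
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
features = random_walk_features(adjacency, max_steps=2)
```

Small step counts describe a node's immediate neighborhood, while larger ones reflect its position in the wider graph, which is the multi-scale intuition the abstract describes.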
Affiliation(s)
- Ping Xuan
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Jing Gu
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, VIC 3083, Australia
- Shuai Wang
  - School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Nakaguchi Toshiya
  - Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Cheng Liu
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Tiangang Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  - School of Mathematical Science, Heilongjiang University, Harbin 150080, China
|
14
|
Liang M, Liu X, Chen Q, Zeng B, Wang L. NMGMDA: a computational model for predicting potential microbe-drug associations based on minimize matrix nuclear norm and graph attention network. Sci Rep 2024; 14:650. [PMID: 38182635 PMCID: PMC10770326 DOI: 10.1038/s41598-023-50793-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 12/26/2023] [Indexed: 01/07/2024] Open
Abstract
The prediction of potential microbe-drug associations is of great value for drug research and development; in particular, deep learning-based methods have achieved significant improvements in biomedicine. In this manuscript, we propose a novel computational model named NMGMDA, based on nuclear norm minimization (NNM) and a graph attention network (GAT), to infer latent microbe-drug associations. First, we create a heterogeneous microbe-drug network in NMGMDA by fusing the drug and microbe similarities with the established drug-microbe associations. Next, we use GAT and NNM to calculate the prediction scores. Lastly, we adopt a fivefold cross-validation framework to assess the performance of NMGMDA. According to the simulation results, NMGMDA outperforms some of the most advanced methods, with a reliable AUC of 0.9946 on both the MDAD and aBiofilm databases. Furthermore, case studies on Ciprofloxacin, Moxifloxacin, HIV-1 and Mycobacterium tuberculosis were carried out to further assess the effectiveness of NMGMDA. The experimental results demonstrated that, following the removal of known correlations from the database, 16 and 14 medications as well as 19 and 17 microbes in the top 20 predictions were validated by pertinent literature. This demonstrates the potential of our new model, NMGMDA, to reach acceptable prediction performance.
Affiliation(s)
- Mingmin Liang
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, 410000, China
- Xianzhi Liu
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, 410000, China
- Qijia Chen
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, 410000, China
- Bin Zeng
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, 410000, China
- Lei Wang
  - School of Information Engineering, Hunan Vocational College of Electronic and Technology, Changsha, 410000, China
  - Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, 410022, China
|
15
|
Zhou Z, Zhuo L, Fu X, Zou Q. Joint deep autoencoder and subgraph augmentation for inferring microbial responses to drugs. Brief Bioinform 2023; 25:bbad483. [PMID: 38171927 PMCID: PMC10764208 DOI: 10.1093/bib/bbad483] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 09/22/2023] [Revised: 10/25/2023] [Accepted: 11/30/2023] [Indexed: 01/05/2024] Open
Abstract
Exploring microbial stress responses to drugs is crucial for the advancement of new therapeutic methods. While current artificial intelligence methodologies have expedited our understanding of potential microbial responses to drugs, the models are constrained by the imprecise representation of microbes and drugs. To this end, we combine deep autoencoder and subgraph augmentation technology for the first time to propose a model called JDASA-MRD, which can identify the potential indistinguishable responses of microbes to drugs. In the JDASA-MRD model, we begin by feeding the established similarity matrices of microbes and drugs into the deep autoencoder, enabling the extraction of robust initial features for both microbes and drugs. Subsequently, we employ the MinHash and HyperLogLog algorithms to estimate the intersections and cardinalities of microbe and drug subgraphs, thus deeply extracting the multi-hop neighborhood information of nodes. Finally, by integrating the initial node features with subgraph topological information, we leverage graph neural network technology to predict the microbes' responses to drugs, offering a more effective solution to the 'over-smoothing' challenge. Comparative analyses on multiple public datasets confirm that the JDASA-MRD model's performance surpasses that of current state-of-the-art models. This research aims to offer a more profound insight into the adaptability of microbes to drugs and to furnish pivotal guidance for drug treatment strategies. Our data and code are publicly available at: https://github.com/ZZCrazy00/JDASA-MRD.
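To make the MinHash step concrete, the sketch below estimates the Jaccard overlap between two neighbor sets from fixed-size signatures. It is a generic illustration of MinHash only (HyperLogLog is not shown) and is not taken from the JDASA-MRD code. The seeded BLAKE2b hash family, the function names and the node labels are all assumptions made for the example.

```python
import hashlib


def _seeded_hash(seed, item):
    """64-bit hash of an item under a given seed (one member of the hash family)."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=128):
    """Keep, for each seeded hash function, the minimum hash over the set."""
    if not items:
        raise ValueError("cannot sketch an empty set")
    return [min(_seeded_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of |A & B| / |A | B|."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Hypothetical multi-hop neighbor sets of a microbe node and a drug node.
microbe_neighbors = {"n1", "n2", "n3", "n4"}
drug_neighbors = {"n3", "n4", "n5", "n6"}
estimate = estimate_jaccard(
    minhash_signature(microbe_neighbors),
    minhash_signature(drug_neighbors),
)
# The exact Jaccard similarity here is 2 / 6; the estimate fluctuates around it.
```

Because the signatures have a fixed length, overlaps between large multi-hop neighborhoods can be compared without storing or intersecting the full sets.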
Affiliation(s)
- Zhecheng Zhou
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000, Wenzhou, China
- Linlin Zhuo
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325000, Wenzhou, China
- Xiangzheng Fu
  - College of Computer Science and Electronic Engineering, Hunan University, 410012, Changsha, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 611730, Chengdu, China
|
16
|
Chen L, Zhao X. PCDA-HNMP: Predicting circRNA-disease association using heterogeneous network and meta-path. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:20553-20575. [PMID: 38124565 DOI: 10.3934/mbe.2023909] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/23/2023]
Abstract
Increasing amounts of experimental studies have shown that circular RNAs (circRNAs) play important regulatory roles in human diseases through interactions with related microRNAs (miRNAs). CircRNAs have become new potential disease biomarkers and therapeutic targets. Predicting circRNA-disease association (CDA) is of great significance for exploring the pathogenesis of complex diseases, which can improve the diagnosis level of diseases and promote the targeted therapy of diseases. However, determination of CDAs through traditional clinical trials is usually time-consuming and expensive. Computational methods are now alternative ways to predict CDAs. In this study, a new computational method, named PCDA-HNMP, was designed. For obtaining informative features of circRNAs and diseases, a heterogeneous network was first constructed, which defined circRNAs, mRNAs, miRNAs and diseases as nodes and associations between them as edges. Then, a deep analysis was conducted on the heterogeneous network by extracting meta-paths connecting to circRNAs (diseases), thereby mining hidden associations between various circRNAs (diseases). These associations constituted the meta-path-induced networks for circRNAs and diseases. The features of circRNAs and diseases were derived from the aforementioned networks via mashup. On the other hand, miRNA-disease associations (mDAs) were employed to improve the model's performance. miRNA features were yielded from the meta-path-induced networks on miRNAs and circRNAs, which were constructed from the meta-paths connecting miRNAs and circRNAs in the heterogeneous network. A concatenation operation was adopted to build the features of CDAs and mDAs. Such representations of CDAs and mDAs were fed into XGBoost to set up the model. The five-fold cross-validation yielded an area under the curve (AUC) of 0.9846, which was better than those of some existing state-of-the-art methods. 
The employment of mDAs can really enhance the model's performance, and the importance analysis on meta-path-induced networks showed that networks produced by the meta-paths containing validated CDAs provided the most important contributions.
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Xiaoyu Zhao
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
|
17
|
Qu J, Song Z, Cheng X, Jiang Z, Zhou J. A new integrated framework for the identification of potential virus-drug associations. Front Microbiol 2023; 14:1179414. [PMID: 37675432 PMCID: PMC10478006 DOI: 10.3389/fmicb.2023.1179414] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/04/2023] [Accepted: 07/31/2023] [Indexed: 09/08/2023] Open
Abstract
Introduction With the increasingly serious problem of antiviral drug resistance, drug repurposing offers a time-efficient and cost-effective way to find potential therapeutic agents for disease. Computational models can quickly predict potential reusable drug candidates to treat diseases. Methods In this study, two matrix decomposition-based methods, i.e., Matrix Decomposition with Heterogeneous Graph Inference (MDHGI) and Bounded Nuclear Norm Regularization (BNNR), were integrated to predict antiviral drugs. Moreover, global leave-one-out cross-validation (LOOCV), local LOOCV, and 5-fold cross-validation were implemented to evaluate the performance of the proposed model on the DrugVirus dataset, which consists of 933 known associations between 175 drugs and 95 viruses. Results The results showed that the areas under the receiver operating characteristic curve (AUC) for global LOOCV and local LOOCV are 0.9035 and 0.8786, respectively. The average AUC and standard deviation of the 5-fold cross-validation on the DrugVirus dataset are 0.8856 ± 0.0032. We further implemented cross-validation on MDAD and aBiofilm, respectively, to evaluate the performance of the model. In particular, the MDAD (aBiofilm) dataset contains 2,470 (2,884) known associations between 1,373 (1,470) drugs and 173 (140) microbes. In addition, two types of case studies were carried out to further verify the effectiveness of the model on the DrugVirus and MDAD datasets. The results of the case studies supported the effectiveness of MHBVDA in identifying potential virus-drug associations as well as predicting potential drugs for new microbes.
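The 5-fold cross-validation over known associations can be sketched as follows. The abstract does not describe how test pairs are hidden or how negatives are handled, so this is only one common protocol: shuffle the known pairs, split them into k folds, and zero out each fold in turn in a copy of the association matrix. `k_fold_association_splits`, the seed and the toy matrix are hypothetical.

```python
import random


def k_fold_association_splits(assoc, k=5, seed=0):
    """Yield (training_matrix, held_out_pairs) for each of the k folds.

    assoc is a 0/1 matrix (list of lists); only its known associations (the
    ones) are split. The held-out pairs are set to 0 in the training copy.
    """
    known = [(i, j) for i, row in enumerate(assoc) for j, v in enumerate(row) if v == 1]
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(known)
    folds = [known[f::k] for f in range(k)]
    for held_out in folds:
        train = [row[:] for row in assoc]  # copy so the input is not modified
        for i, j in held_out:
            train[i][j] = 0
        yield train, held_out


# Toy virus-drug matrix with 10 known associations (made-up data).
toy_assoc = [
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0],
]
for train_matrix, test_pairs in k_fold_association_splits(toy_assoc, k=5):
    # A model would be fitted on train_matrix and scored on test_pairs here.
    pass
```

Leave-one-out cross-validation corresponds to the limiting case in which every fold holds exactly one known association.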
Affiliation(s)
- Jia Qu
  - School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zihao Song
  - School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Xiaolong Cheng
  - School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China
- Zhibin Jiang
  - School of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
- Jie Zhou
  - School of Computer Science and Engineering, Shaoxing University, Shaoxing, Zhejiang, China
|
18
|
Chen L, Chen K, Zhou B. Inferring drug-disease associations by a deep analysis on drug and disease networks. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:14136-14157. [PMID: 37679129 DOI: 10.3934/mbe.2023632] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 09/09/2023]
Abstract
Drugs, which treat various diseases, are essential for human health. However, developing new drugs is quite laborious, time-consuming, and expensive. Although investments in drug development have greatly increased over the years, the number of drug approvals each year remains quite low. Drug repositioning is deemed an effective means to accelerate drug development because it can discover novel effects of existing drugs. Numerous computational methods have been proposed for drug repositioning, some of which were designed as binary classifiers that predict drug-disease associations (DDAs). Negative sample selection is a common weakness of such methods. In this study, a novel reliable negative sample selection scheme, named RNSS, is presented, which can screen out reliable pairs of drugs and diseases with low probabilities of being actual DDAs. This scheme considers information from the k-neighbors of a drug in a drug network, including their associations with diseases and with the drug. Then, a scoring system is set up to evaluate pairs of drugs and diseases. To test the utility of the RNSS, three classic classification algorithms (random forest, Bayes network and nearest neighbor algorithm) were employed to build classifiers using negative samples selected by the RNSS. The cross-validation results suggested that such classifiers provided nearly perfect performance and were significantly superior to those using some traditional and previous negative sample selection schemes.
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Kaiyu Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Bo Zhou
  - Shanghai University of Medicine & Health Sciences, Shanghai 201318, China
|