1. Shang Y, Wang Z, Chen Y, Yang X, Ren Z, Zeng X, Xu L. HNF-DDA: subgraph contrastive-driven transformer-style heterogeneous network embedding for drug-disease association prediction. BMC Biol 2025; 23:101. [PMID: 40241152 PMCID: PMC12004644 DOI: 10.1186/s12915-025-02206-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2024] [Accepted: 04/03/2025] [Indexed: 04/18/2025] Open
Abstract
BACKGROUND Drug-disease association (DDA) prediction aims to identify potential links between drugs and diseases, facilitating the discovery of new therapeutic potentials and reducing the cost and time associated with traditional drug development. However, existing DDA prediction methods often overlook the global relational information provided by other biological entities and the complex association structure between drugs and diseases, limiting the potential correlations of drug and disease embeddings. RESULTS In this study, we propose HNF-DDA, a subgraph contrastive-driven transformer-style heterogeneous network embedding model for DDA prediction. Specifically, HNF-DDA adopts an all-pairs message passing strategy to capture the global structure of the network, fully integrating multi-omics information. HNF-DDA also introduces subgraph contrastive learning to capture the local structure of drug-disease subgraphs, learning the high-order semantic information of nodes. Experimental results on two benchmark datasets demonstrate that HNF-DDA outperforms several state-of-the-art methods. Additionally, it shows superior performance across different dataset splitting schemes, indicating HNF-DDA's capability to generalize to novel drug and disease categories. Case studies for breast cancer and prostate cancer reveal that 9 out of the top 10 predicted candidate drugs for breast cancer and 8 out of the top 10 for prostate cancer have documented therapeutic effects. CONCLUSIONS HNF-DDA incorporates all-pairs message passing and subgraph capture strategies into heterogeneous network embedding, enabling effective learning of drug and disease representations enriched with heterogeneous information, while also demonstrating significant potential for applications in drug repositioning.
Affiliation(s)
- Yifan Shang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Zixu Wang
  - Department of Computer Science, University of Tsukuba, Tsukuba, 305-8577, Japan
- Yangyang Chen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Xinyu Yang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Zhonghao Ren
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China
2. Quan L, Wu J, Jiang Y, Pan D, Qiang L. DTA-GTOmega: Enhancing Drug-Target Binding Affinity Prediction with Graph Transformers Using OmegaFold Protein Structures. J Mol Biol 2025; 437:168843. [PMID: 39481634 DOI: 10.1016/j.jmb.2024.168843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2024] [Revised: 10/05/2024] [Accepted: 10/24/2024] [Indexed: 11/02/2024]
Abstract
Understanding drug-protein interactions is crucial for elucidating drug mechanisms and optimizing drug development. However, existing methods have limitations in representing the three-dimensional structure of targets and capturing the complex relationships between drugs and targets. This study proposes a new method, DTA-GTOmega, for predicting drug-target binding affinity. DTA-GTOmega utilizes OmegaFold to predict protein three-dimensional structure and construct target graphs, while processing drug SMILES sequences with RDKit to generate drug graphs. By employing multi-layer graph transformer modules and co-attention modules, this method effectively integrates atomic-level features of drugs and residue-level features of targets, accurately modeling the complex interactions between drugs and targets, thereby significantly improving the accuracy of binding affinity predictions. Our method outperforms existing techniques on benchmark datasets such as KIBA, Davis, and BindingDB_Kd under cold-start setting. Moreover, DTA-GTOmega demonstrates competitive performance in real-world DTI scenarios involving DrugBank data and drug-target interactions related to cardiovascular and nervous system-related diseases, highlighting its robust generalization capabilities. Additionally, the introduced DTI evaluation metrics further validate DTA-GTOmega's potential in handling imbalanced data.
Affiliation(s)
- Lijun Quan
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu 210000, China
- Jian Wu
  - China Mobile (Suzhou) Software Technology Co., Ltd., Suzhou 215000, China
- Yelu Jiang
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Deng Pan
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China
- Lyu Qiang
  - School of Computer Science and Technology, Soochow University, Jiangsu 215006, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu 210000, China
3. Meng W, Xu X, Xiao Z, Gao L, Yu L. Cancer Drug Sensitivity Prediction Based on Deep Transfer Learning. Int J Mol Sci 2025; 26:2468. [PMID: 40141112 PMCID: PMC11942577 DOI: 10.3390/ijms26062468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2025] [Revised: 02/27/2025] [Accepted: 03/06/2025] [Indexed: 03/28/2025] Open
Abstract
In recent years, many approved drugs have been discovered using phenotypic screening, which elaborates the exact mechanisms of action or molecular targets of drugs. Drug susceptibility prediction is an important type of phenotypic screening. Large-scale pharmacogenomics studies have provided us with large amounts of drug sensitivity data. By analyzing these data using computational methods, we can effectively build models to predict drug susceptibility. However, due to the differences in data distribution among databases, researchers cannot directly utilize data from multiple sources. In this study, we propose a deep transfer learning model. We integrate the genomic characterization of cancer cell lines with chemical information on compounds, combining the Cancer Cell Line Encyclopedia (CCLE) and the Genomics of Drug Sensitivity in Cancer (GDSC) datasets through a domain adaptation approach, and predict the half-maximal inhibitory concentrations (IC50 values). Afterward, the validity of the prediction results of our model is verified. This study effectively addresses the challenge of cross-database distribution discrepancies in drug sensitivity prediction by integrating multi-source heterogeneous data and constructing a deep transfer learning model. This model serves as a reliable computational tool for precision drug development. Its widespread application can facilitate the optimization of therapeutic strategies in personalized medicine while also providing technical support for high-throughput drug screening and the discovery of new drug targets.
Affiliation(s)
- Weijun Meng
  - School of Computer Science and Technology, Xi’an University of Posts & Telecommunications, Xi’an 710071, China
- Xinyu Xu
  - School of Computer Science and Technology, Xidian University, Xi’an 710071, China
- Zhichao Xiao
  - School of Computer Science and Technology, Xidian University, Xi’an 710071, China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi’an 710071, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi’an 710071, China
4. Wang Z, Chen Y, Shang Y, Yang X, Pan W, Ye X, Sakurai T, Zeng X. MultiCycPermea: accurate and interpretable prediction of cyclic peptide permeability using a multimodal image-sequence model. BMC Biol 2025; 23:63. [PMID: 40016695 PMCID: PMC11866622 DOI: 10.1186/s12915-025-02166-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Accepted: 02/17/2025] [Indexed: 03/01/2025] Open
Abstract
BACKGROUND Cyclic peptides, known for their high binding affinity and low toxicity, show potential as innovative drugs for targeting "undruggable" proteins. However, their therapeutic efficacy is often hindered by poor membrane permeability. Over the past decade, the FDA has approved an average of one macrocyclic peptide drug per year, with romidepsin being the only one targeting an intracellular site. Biological experiments to measure permeability are time-consuming and labor-intensive. Rapid assessment of cyclic peptide permeability is crucial for their development. RESULTS In this work, we proposed a novel deep learning model, dubbed as MultiCycPermea, for predicting cyclic peptide permeability. MultiCycPermea extracts features from both the image information (2D structural information) and sequence information (1D structural information) of cyclic peptides. Additionally, we proposed a substructure-constrained feature alignment module to align the two types of features. MultiCycPermea has made a leap in predictive accuracy. In the in-distribution setting of the CycPeptMPDB dataset, MultiCycPermea reduced the mean squared error (MSE) by approximately 44.83% compared to the latest model Multi_CycGT (0.29 vs 0.16). By leveraging visual analysis tools, MultiCycPermea can reveal the relationship between peptide modification structures and membrane permeability, providing insights to improve the membrane permeability of cyclic peptides. CONCLUSIONS MultiCycPermea provides an effective tool that accurately predicts the permeability of cyclic peptides, offering valuable insights for improving the membrane permeability of cyclic peptides. This work paves a new path for the application of artificial intelligence in assisting the design of membrane-permeable cyclic peptides.
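Note on the figures quoted in this abstract: the stated reduction of about 44.83% follows directly from the two MSE values (0.29 for Multi_CycGT, 0.16 for MultiCycPermea). A minimal check, in which the helper name `relative_reduction` is illustrative and not taken from the paper:

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Return the percentage by which `new` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - new) / baseline * 100.0


# MSE values quoted in the abstract: 0.29 (Multi_CycGT) vs 0.16 (MultiCycPermea).
reduction = relative_reduction(0.29, 0.16)
print(f"{reduction:.2f}%")  # prints 44.83%
```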
Affiliation(s)
- Zixu Wang
  - Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki, 3058577, Japan
- Yangyang Chen
  - Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki, 3058577, Japan
- Yifan Shang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, People's Republic of China
- Xiulong Yang
  - School of Computer Science, Central China Normal University, Wuhan, 430079, People's Republic of China
- Wenqiong Pan
  - Department of Clinical Pharmacy, Jilin University, Changchun, Jilin, 130025, People's Republic of China
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki, 3058577, Japan
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki, 3058577, Japan
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, People's Republic of China
5. Yang B, Liu Y, Wu J, Bai F, Zheng M, Zheng J. GENNDTI: Drug-Target Interaction Prediction Using Graph Neural Network Enhanced by Router Nodes. IEEE J Biomed Health Inform 2024; 28:7588-7598. [PMID: 40030413 DOI: 10.1109/jbhi.2024.3402529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Identifying drug-target interactions (DTI) is crucial in drug discovery and repurposing, and in silico techniques for DTI prediction are becoming increasingly important for reducing time and cost. Most interaction-based DTI models rely on the guilt-by-association principle that "similar drugs can interact with similar targets". However, such methods utilize precomputed similarity matrices and cannot dynamically discover intricate correlations. Meanwhile, some methods enrich DTI networks by incorporating additional networks such as DDI and PPI networks, adding biological signals to enhance DTI prediction. While these approaches have achieved promising performance in DTI prediction, such coarse-grained association data do not explain the specific biological mechanisms underlying DTIs. In this work, we propose GENNDTI, which constructs biologically meaningful routers to represent and integrate the salient properties of drugs and targets. Similar drugs or targets connect to more of the same router nodes, capturing shared properties. In addition, heterogeneous encoders are designed to distinguish different types of interactions, modeling both real and constructed interactions. This strategy enriches graph topology and also enhances prediction efficiency. We evaluate the proposed method on benchmark datasets, demonstrating comparative performance over existing methods. We specifically analyze router nodes to validate their efficacy in improving predictions and providing biological explanations.
6. Ru X, Zhao S, Zou Q, Xu L. Identify potential drug candidates within a high-quality compound search space. Brief Bioinform 2024; 26:bbaf024. [PMID: 39853109 PMCID: PMC11758506 DOI: 10.1093/bib/bbaf024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 12/10/2024] [Accepted: 01/14/2025] [Indexed: 01/26/2025] Open
Abstract
The identification of potential effective drug candidates is a fundamental step in new drug discovery, with profound implications for pharmaceutical research and the healthcare sector. While many computational methods have been developed for such predictions and have yielded promising results, two challenges persist: (i) The cold start problem of new drugs, which increases the difficulty of prediction due to lack of historical data or prior knowledge. (ii) The vastness of the compound search space for potential drug candidates. In this study, we present a promising method that not only enhances the accuracy of identifying potential novel drug candidates but also refines the search space. Drawing inspiration from solutions to the cold start problem in recommender systems, we apply 'learning to rank' techniques to the field of new drug discovery. Furthermore, we propose using three similarity metrics to condense the compound search space into compact yet high-quality spaces, allowing for more efficient screening of potential drug candidates. Experimental results from two widely used datasets demonstrate that our method outperforms other state-of-the-art approaches in the new drug cold-start scenario. Additionally, we have verified that it is feasible to identify potential drug candidates within these high-quality compound search spaces. To our knowledge, this study is the first to address drug cold-start problem in such a confined space, potentially providing valuable insights and guidance for drug screening.
Affiliation(s)
- Xiaoqing Ru
  - The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, No. 100, Minjiang Avenue, Smart New Town, Quzhou, Zhejiang Province, 324000, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No. 1, Chengdian Road, Kecheng District, Quzhou, Zhejiang Province, 324003, China
- Shulin Zhao
  - The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, No. 100, Minjiang Avenue, Smart New Town, Quzhou, Zhejiang Province, 324000, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No. 1, Chengdian Road, Kecheng District, Quzhou, Zhejiang Province, 324003, China
- Lifeng Xu
  - The Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People's Hospital, No. 100, Minjiang Avenue, Smart New Town, Quzhou, Zhejiang Province, 324000, China
7. Xu Y, Zhou J, Ying H, Chen J, Chen W, Chen DZ, Wu J. A Protein-Context Enhanced Master Slave Framework for Zero-Shot Drug Target Interaction Prediction. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:2359-2370. [PMID: 39331551 DOI: 10.1109/tcbb.2024.3468434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/29/2024]
Abstract
Drug Target Interaction (DTI) prediction plays a crucial role in in-silico drug discovery, especially for deep learning (DL) models. Along this line, existing methods usually first extract features from drugs and target proteins, and then use drug-target pairs to train DL models. However, these DL-based methods essentially rely on similar structures and patterns defined by homologous proteins from a large amount of data. When few drug-target interactions are known for a newly discovered protein and its homologous proteins, prediction performance can drop notably. In this paper, we propose a novel Protein-Context enhanced Master/Slave Framework (PCMS) for zero-shot DTI prediction. This framework facilitates the efficient discovery of ligands for newly discovered target proteins, addressing the challenge of predicting interactions without prior data. Specifically, the PCMS framework consists of two main components: a Master Learner and a Slave Learner. The Master Learner first learns the target protein context information, and then adaptively generates the corresponding parameters for the Slave Learner. The Slave Learner then performs zero-shot DTI prediction in different protein contexts. Extensive experiments verify the effectiveness of our PCMS compared to state-of-the-art methods in various metrics on two public datasets.
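Illustrative note: the idea of one learner generating the parameters of another can be shown with a toy example. The sketch below is an assumption-laden illustration of that general pattern only. It is not the PCMS architecture; the linear generator, the feature sizes, and every name in it are hypothetical.

```python
import math


def master_learner(protein_context, generator):
    """Map a protein-context vector to weights for the slave.

    `generator` is a hypothetical fixed matrix with one row per drug feature;
    in a real model it would be learned.
    """
    return [sum(g * c for g, c in zip(row, protein_context)) for row in generator]


def slave_learner(drug_features, weights):
    """Score a drug with the context-specific weights and squash to (0, 1)."""
    score = sum(w * x for w, x in zip(weights, drug_features))
    return 1.0 / (1.0 + math.exp(-score))


# Toy setup: 3 drug features, 2 protein-context dimensions (all values made up).
generator = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
drug = [1.0, 0.0, -1.0]

# The same drug receives different predictions under different protein contexts,
# because the slave's weights are regenerated for each context.
p_context_a = slave_learner(drug, master_learner([1.0, 0.0], generator))
p_context_b = slave_learner(drug, master_learner([0.0, 1.0], generator))
print(p_context_a, p_context_b)
```

The point of the pattern is that no interaction data for the new protein is needed at prediction time: only its context vector is required to produce the slave's parameters.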
8. Yang X, Yang G, Chu J. GraphCL-DTA: A Graph Contrastive Learning With Molecular Semantics for Drug-Target Binding Affinity Prediction. IEEE J Biomed Health Inform 2024; 28:4544-4552. [PMID: 38190664 DOI: 10.1109/jbhi.2024.3350666] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 01/10/2024]
Abstract
Drug-target binding affinity prediction plays an important role in the early stages of drug discovery, which can infer the strength of interactions between new drugs and new targets. However, the performance of previous computational models is limited by the following drawbacks. The learning of drug representation relies only on supervised data without considering the information in the molecular graph itself. Moreover, most previous studies tended to design complicated representation learning modules, while uniformity used to measure representation quality is ignored. In this study, we propose GraphCL-DTA, a graph contrastive learning with molecular semantics for drug-target binding affinity prediction. This graph contrastive learning framework replaces the dropout-based data augmentation strategy by performing data augmentation in the embedding space, thereby better preserving the semantic information of the molecular graph. A more essential and effective drug representation can be learned through this graph contrastive framework without additional supervised data. Next, we design a new loss function that can be directly used to adjust the uniformity of drug and target representations. By directly optimizing the uniformity of representations, the representation quality of drugs and targets can be improved. The effectiveness of the above innovative elements is verified on two real datasets, KIBA and Davis. Compared with the GraphDTA model, the relative improvement of the GraphCL-DTA model on the two datasets is 2.7% and 4.5%. The graph contrastive learning framework and uniformity function in the GraphCL-DTA model can be embedded into other computational models as independent modules to improve their generalization capability.
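Illustrative note: "uniformity" of representations is often quantified with the measure popularized by Wang and Isola (2020), the log of the mean Gaussian potential between pairs of normalized embeddings. The sketch below implements that generic measure as an assumption; the abstract does not state the exact form of the GraphCL-DTA loss, and the sample vectors are made up.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def uniformity(embeddings, t=2.0):
    """log of the mean of exp(-t * ||u - v||^2) over all distinct pairs.

    Lower values mean the normalized embeddings are spread more evenly
    over the unit hypersphere.
    """
    unit = [l2_normalize(v) for v in embeddings]
    total, count = 0.0, 0
    for u, v in combinations(unit, 2):
        sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
        total += math.exp(-t * sq_dist)
        count += 1
    return math.log(total / count)


clustered = [[1.0, 0.0], [0.99, 0.1], [0.98, 0.2]]      # nearly collapsed
spread = [[1.0, 0.0], [-0.5, 0.866], [-0.5, -0.866]]    # roughly 120 degrees apart
print(uniformity(clustered), uniformity(spread))
```

Minimizing such a term pushes representations apart, which is one way a loss can "directly adjust" uniformity as the abstract describes.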
9. Kalemati M, Zamani Emani M, Koohi S. DCGAN-DTA: Predicting drug-target binding affinity with deep convolutional generative adversarial networks. BMC Genomics 2024; 25:411. [PMID: 38724911 PMCID: PMC11080241 DOI: 10.1186/s12864-024-10326-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/19/2024] [Indexed: 05/13/2024] Open
Abstract
BACKGROUND In recent years, there has been a growing interest in utilizing computational approaches to predict drug-target binding affinity, aiming to expedite the early drug discovery process. To address the limitations of experimental methods, such as cost and time, several machine learning-based techniques have been developed. However, these methods encounter certain challenges, including the limited availability of training data, reliance on human intervention for feature selection and engineering, and a lack of validation approaches for robust evaluation in real-life applications. RESULTS To mitigate these limitations, in this study, we propose a method for drug-target binding affinity prediction based on deep convolutional generative adversarial networks. Additionally, we conducted a series of validation experiments and implemented adversarial control experiments using straw models. These experiments serve to demonstrate the robustness and efficacy of our predictive models. We conducted a comprehensive evaluation of our method by comparing it to baselines and state-of-the-art methods. Two recently updated datasets, namely the BindingDB and PDBBind, were used for this purpose. Our findings indicate that our method outperforms the alternative methods in terms of three performance measures when using warm-start data splitting settings. Moreover, when considering physiochemical-based cold-start data splitting settings, our method demonstrates superior predictive performance, particularly in terms of the concordance index. CONCLUSION The results of our study affirm the practical value of our method and its superiority over alternative approaches in predicting drug-target binding affinity across multiple validation sets. This highlights the potential of our approach in accelerating drug repurposing efforts, facilitating novel drug discovery, and ultimately enhancing disease treatment. 
The data and source code for this study were deposited in the GitHub repository, https://github.com/mojtabaze7/DCGAN-DTA . Furthermore, the web server for our method is accessible at https://dcgan.shinyapps.io/bindingaffinity/ .
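Illustrative note: the concordance index mentioned in this abstract is a standard ranking metric for affinity prediction. It is the fraction of pairs with different true affinities whose predicted order matches the true order, with predicted ties counted as half. The following is a generic pairwise implementation, not the authors' evaluation code, and the sample values are made up.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs that the predictions rank correctly."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(n):
            if y_true[i] > y_true[j]:          # pair with a defined true order
                comparable += 1
                if y_pred[i] > y_pred[j]:
                    concordant += 1.0          # predicted order agrees
                elif y_pred[i] == y_pred[j]:
                    concordant += 0.5          # tie in predictions
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


true_affinity = [5.0, 6.2, 7.1, 8.3]
predicted = [5.1, 6.0, 7.5, 7.4]               # last two are swapped
ci = concordance_index(true_affinity, predicted)
print(ci)                                      # 5 of 6 pairs are ordered correctly
```

A value of 1.0 means perfect ranking and 0.5 corresponds to random ordering, which is why the metric is useful in cold-start settings where absolute errors are large but ranking may still be informative.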
Affiliation(s)
- Mahmood Kalemati
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Mojtaba Zamani Emani
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Somayyeh Koohi
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
10. Dehghan A, Abbasi K, Razzaghi P, Banadkuki H, Gharaghani S. CCL-DTI: contributing the contrastive loss in drug-target interaction prediction. BMC Bioinformatics 2024; 25:48. [PMID: 38291364 PMCID: PMC11264960 DOI: 10.1186/s12859-024-05671-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 09/13/2023] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND Drug-Target Interaction (DTI) prediction uses a drug molecule and a protein sequence as inputs to predict the binding affinity value. In recent years, deep learning-based models have received increasing attention. These methods have two modules: the feature extraction module and the task prediction module. In most deep learning-based approaches, a simple task prediction loss (i.e., categorical cross entropy for the classification task and mean squared error for the regression task) is used to train the model. In machine learning, contrastive-based loss functions have been developed to learn a more discriminative feature space. In a deep learning-based model, extracting a more discriminative feature space leads to performance improvement for the task prediction module. RESULTS In this paper, we have used multimodal knowledge as input and proposed an attention-based fusion technique to combine this knowledge. Also, we investigate how utilizing a contrastive loss function alongside the task prediction loss could help the approach learn a more powerful model. Four contrastive loss functions are considered: (1) max-margin contrastive loss function, (2) triplet loss function, (3) Multi-class N-pair Loss Objective, and (4) NT-Xent loss function. The proposed model is evaluated using four well-known datasets: Wang et al. dataset, Luo's dataset, Davis, and KIBA datasets. CONCLUSIONS Accordingly, after reviewing the state-of-the-art methods, we developed a multimodal feature extraction network by combining protein sequences and drug molecules, along with protein-protein interaction networks and drug-drug interaction networks. The results show it performs significantly better than the comparable state-of-the-art approaches.
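Illustrative note: two of the four contrastive losses named in this abstract have short textbook definitions. The sketch below uses common forms of the max-margin contrastive loss and the triplet loss on plain Euclidean distances. The exact formulations, margins, and weighting used in CCL-DTI may differ, and all names and values here are illustrative.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def max_margin_contrastive(u, v, same_class, margin=1.0):
    """Pull same-class pairs together; push different-class pairs past `margin`."""
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Require the positive to be closer to the anchor than the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the anchor
far = [3.0, 4.0]    # distance 5 from the anchor

print(triplet_loss(anchor, near, far))   # well-separated triplet, loss is 0.0
print(triplet_loss(anchor, far, near))   # violated triplet, loss is 5.0
print(max_margin_contrastive(anchor, [0.0, 0.5], same_class=False))  # 0.25
```

In a setup like the one the abstract describes, such a term would be added to the task prediction loss (cross entropy or mean squared error) so that the feature extractor is trained on both objectives.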
Affiliation(s)
- Alireza Dehghan
  - Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, 1417614411, Iran
- Karim Abbasi
  - Laboratory of System Biology, Bioinformatics and Artificial Intelligence in Medicine (LBB&AI), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, 1417614411, Iran
- Parvin Razzaghi
  - Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 4513766731, Iran
- Hossein Banadkuki
  - Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran
- Sajjad Gharaghani
  - Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran
11. Chen Y, Wang J, Wang C, Zou Q. AutoEdge-CCP: A novel approach for predicting cancer-associated circRNAs and drugs based on automated edge embedding. PLoS Comput Biol 2024; 20:e1011851. [PMID: 38289973 PMCID: PMC10857569 DOI: 10.1371/journal.pcbi.1011851] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 08/17/2023] [Revised: 02/09/2024] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open
Abstract
The unique expression patterns of circRNAs linked to the advancement and prognosis of cancer underscore their considerable potential as valuable biomarkers. Repurposing existing drugs for new indications can significantly reduce the cost of cancer treatment. Computational prediction of circRNA-cancer and drug-cancer relationships is crucial for precise cancer therapy. However, prior computational methods fail to analyze the interactions among circRNAs, drugs, and cancer at the systematic level. A method that uncovers more valuable information is needed to achieve cancer-centered multi-association prediction. In this paper, we present a novel computational method, AutoEdge-CCP, to unveil cancer-associated circRNAs and drugs. We abstract the complex relationships among circRNAs, drugs, and cancer into a multi-source heterogeneous network. In this network, each molecule is represented by two types of information: the intrinsic attribute information of molecular features, and the link information explicitly modeled by autoGNN, which searches for information both within and across the layers of a message passing neural network. The significant performance on multi-scenario applications and case studies establishes AutoEdge-CCP as a potent and promising association prediction tool.
Affiliation(s)
- Yaojia Chen
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Jiacheng Wang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
12. Zhang L, Xiao K, Kong L. A computational method for small molecule-RNA binding sites identification by utilizing position specificity and complex network information. Biosystems 2024; 235:105094. [PMID: 38056591 DOI: 10.1016/j.biosystems.2023.105094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2023] [Revised: 11/23/2023] [Accepted: 11/24/2023] [Indexed: 12/08/2023]
Abstract
Several computational methods have been proposed for identifying small molecule-RNA binding sites, a task that plays a significant role in revealing biological function. However, designing an accurate model remains challenging, especially as measured by the Matthews correlation coefficient (MCC). We designed a feature extraction technique based on two aspects: position specificity and complex network information. Specifically, a complex network was used to represent spatial topological structure and sequence position information in order to improve prediction. The features fusing position specificity and complex network information were then fed into a random forest classifier to build the model. AUCs of 88.22%, 77.92% and 81.46% were obtained on three independent datasets (RB19, CS71, RB78). Our method achieved the best MCC on all three datasets, 8.19%, 0.59% and 4.35% higher than the state-of-the-art prediction methods, respectively. This performance shows that our method is a powerful tool for identifying RNA binding sites, which can help in the design of RNA-targeting small molecule drugs. The data and source code are available at https://github.com/Kangxiaoneuq/PCN_RNAsite.
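Illustrative note: the MCC figures in this abstract refer to the Matthews correlation coefficient, which is computed from the four cells of a binary confusion matrix. The sketch below is a generic implementation with made-up labels, not the authors' evaluation code. Returning 0.0 when the denominator is zero is a common convention and is an assumption here.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels encoded as 0 (non-binding) and 1 (binding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


labels = [1, 1, 1, 0, 0, 0, 0, 0]           # binding sites are the minority class
predictions = [1, 1, 0, 0, 0, 0, 0, 1]      # one missed site, one false alarm
all_negative = [0] * len(labels)            # trivial "never binds" predictor

mcc = matthews_corrcoef(labels, predictions)
print(mcc)                                          # 7 / 15
print(matthews_corrcoef(labels, all_negative))      # 0.0
```

The trivial predictor reaches 62.5% accuracy on these labels but an MCC of 0, which illustrates why MCC is a demanding metric when binding sites are rare.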
Affiliation(s)
- Lichao Zhang
- School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, 066000, PR China; Hebei Innovation Center for Smart Perception and Applied Technology of Agricultural Data, Qinhuangdao, 066000, PR China.
- Kang Xiao
- School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, 066000, PR China.
- Liang Kong
- Hebei Innovation Center for Smart Perception and Applied Technology of Agricultural Data, Qinhuangdao, 066000, PR China; School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao, 066000, PR China.
13
He S, Ye X, Dou L, Sakurai T. FIAMol-AB: A feature fusion and attention-based deep learning method for enhanced antibiotic discovery. Comput Biol Med 2024; 168:107762. [PMID: 38056212 DOI: 10.1016/j.compbiomed.2023.107762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 10/31/2023] [Accepted: 11/21/2023] [Indexed: 12/08/2023]
Abstract
Antibiotic resistance continues to be a growing concern for global health, accentuating the need for novel antibiotic discoveries. Traditional methodologies in this field have relied heavily on extensive experimental screening, which is often time-consuming and costly. In contrast, computer-assisted drug screening offers rapid, cost-effective solutions. In this work, we propose FIAMol-AB, a deep learning model that combines graph neural networks, text convolutional networks and molecular fingerprint techniques. The method also uses an attention mechanism to fuse multiple forms of information within the model. The experiments show that FIAMol-AB may offer potential advantages in antibiotic discovery tasks over some existing methods. We analyzed our model's results, which helps highlight the potential significance of certain features in the model's predictive performance. Compared with other models, ours demonstrates promising results, indicating potential robustness and versatility. This suggests that by integrating multi-view information and attention mechanisms, FIAMol-AB might better learn complex molecular structures, potentially improving the precision and efficiency of antibiotic discovery. We hope that FIAMol-AB can serve as a useful method in the ongoing fight against antibiotic resistance.
Affiliation(s)
- Shida He
- Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki, 305-8577, Japan
- Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki, 305-8577, Japan.
- Lijun Dou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland, OH, 44106, USA
- Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki, 305-8577, Japan
14
Jiang M, Shao Y, Zhang Y, Zhou W, Pang S. A deep learning method for drug-target affinity prediction based on sequence interaction information mining. PeerJ 2023; 11:e16625. [PMID: 38099302 PMCID: PMC10720480 DOI: 10.7717/peerj.16625] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/09/2023] [Accepted: 11/16/2023] [Indexed: 12/17/2023] Open
Abstract
Background A critical aspect of in silico drug discovery involves the prediction of drug-target affinity (DTA). Conducting wet lab experiments to determine affinity is both expensive and time-consuming, making it necessary to find alternative approaches. In recent years, deep learning has emerged as a promising technique for DTA prediction, leveraging the substantial computational power of modern computers. Methods We proposed a novel sequence-based approach, named KC-DTA, for predicting DTA. In this approach, we converted the target sequence into two distinct matrices and represented the molecular compound as a graph. The proposed method utilized k-mers analysis and Cartesian product calculation to capture the interactions and evolutionary information among various residues, enabling the creation of the two matrices for the target sequence. The molecule was represented by constructing a molecular graph in which atoms serve as nodes and chemical bonds serve as edges. Subsequently, the obtained target matrices and molecular graph were used as inputs for convolutional neural networks (CNNs) and graph neural networks (GNNs) to extract hidden features, which were further used for the prediction of binding affinity. Results To evaluate the effectiveness of the proposed method, we conducted several experiments and made a comprehensive comparison with state-of-the-art approaches using multiple evaluation metrics. The results of our experiments demonstrated that the KC-DTA method achieves high performance in predicting DTA. The findings of this research underscore the significance of the KC-DTA method as a valuable tool in the field of in silico drug discovery, offering promising opportunities for accelerating the drug development process. All the data and code are available at https://github.com/syc2017/KCDTA.
Affiliation(s)
- Mingjian Jiang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China
- Yunchang Shao
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China
- Yuanyuan Zhang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China
- Wei Zhou
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong, China
- Shunpeng Pang
- School of Computer Engineering, WeiFang University, Weifang, Shandong, China
15
Li H, Wang S, Zheng W, Yu L. Multi-dimensional search for drug-target interaction prediction by preserving the consistency of attention distribution. Comput Biol Chem 2023; 107:107968. [PMID: 37844375 DOI: 10.1016/j.compbiolchem.2023.107968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 09/27/2023] [Accepted: 10/05/2023] [Indexed: 10/18/2023]
Abstract
Predicting drug-target interaction (DTI) is a crucial step in the process of drug repurposing and new drug development. Although the attention mechanism has been widely used to capture the interactions between drugs and targets, it mainly uses the Simplified Molecular Input Line Entry System (SMILES) and two-dimensional (2D) molecular graph features of drugs. In this paper, we propose a neural network model called MdDTI for DTI prediction. The model searches for binding sites that may interact with the target from the multiple dimensions of drug structure, namely the 2D substructures and the three-dimensional (3D) spatial structure. For the 2D substructures, we have developed a novel substructure decomposition strategy based on drug molecular graphs and compared its performance with the SMILES-based decomposition method. For the 3D spatial structure of drugs, we constructed spatial feature representation matrices for drugs based on the Cartesian coordinates of heavy atoms (without hydrogen atoms) in each drug. Finally, to ensure the search results of the model are consistent across multiple dimensions, we construct a consistency loss function. We evaluate MdDTI on four drug-target interaction datasets and three independent compound-protein affinity test sets. The results indicate that our model surpasses a series of state-of-the-art models. Case studies demonstrate that our model is capable of capturing the potential binding regions between drugs and targets, and it shows efficacy in drug repurposing. Our code is available at https://github.com/lhhu1999/MdDTI.
Affiliation(s)
- Huaihu Li
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China; The Key Lab of Intelligent Systems and Computing of Yunnan Province, Yunnan University, Kunming, Yunnan, China.
- Weihua Zheng
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Li Yu
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
16
Dehghan A, Razzaghi P, Abbasi K, Gharaghani S. TripletMultiDTI: Multimodal representation learning in drug-target interaction prediction with triplet loss function. Expert Systems with Applications 2023; 232:120754. [DOI: 10.1016/j.eswa.2023.120754] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 01/03/2025]
17
Ru X, Zou Q, Lin C. Optimization of drug-target affinity prediction methods through feature processing schemes. Bioinformatics 2023; 39:btad615. [PMID: 37812388 PMCID: PMC10636279 DOI: 10.1093/bioinformatics/btad615] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/15/2023] [Revised: 09/19/2023] [Accepted: 10/07/2023] [Indexed: 10/10/2023] Open
Abstract
MOTIVATION Numerous high-accuracy drug-target affinity (DTA) prediction models, whose performance is heavily reliant on the drug and target feature information, are developed at the expense of complexity and interpretability. Feature extraction and optimization constitute a critical step that significantly influences model performance, robustness, and interpretability. Many existing studies aim to comprehensively characterize drugs and targets by extracting features from multiple perspectives; however, this approach has drawbacks: (i) an abundance of redundant or noisy features; and (ii) feature sets that often suffer from high dimensionality. RESULTS In this study, to obtain a model with high accuracy and strong interpretability, we utilize various traditional and cutting-edge feature selection and dimensionality reduction techniques to process self-associated features and adjacent associated features. These optimized features are then fed into learning to rank to achieve efficient DTA prediction. Extensive experimental results on two commonly used datasets indicate that, among various feature optimization methods, the regression tree-based feature selection method is most beneficial for constructing models with good performance and strong robustness. Then, by utilizing Shapley Additive Explanations values and the incremental feature selection approach, we find that the high-quality feature subset consists of the top 150D features and that the top 20D features have a breakthrough impact on the DTA prediction. In conclusion, our study thoroughly validates the importance of feature optimization in DTA prediction and serves as inspiration for constructing high-performance and highly interpretable models. AVAILABILITY AND IMPLEMENTATION https://github.com/RUXIAOQING964914140/FS_DTA.
Affiliation(s)
- Xiaoqing Ru
- Department of Computer Science, University of Tsukuba, Tsukuba, Japan
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- Chen Lin
- Department of Computer Science and Technology, School of Informatics, Xiamen University, Xiamen, Fujian, 361005, China
18
Tian H, Xiao S, Jiang X, Tao P. PASSerRank: Prediction of allosteric sites with learning to rank. J Comput Chem 2023; 44:2223-2229. [PMID: 37561047 PMCID: PMC11127606 DOI: 10.1002/jcc.27193] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/02/2023] [Revised: 06/19/2023] [Accepted: 07/10/2023] [Indexed: 08/11/2023]
Abstract
Allostery plays a crucial role in regulating protein activity, making it a highly sought-after target in drug development. One of the major challenges in allosteric drug research is the identification of allosteric sites. In recent years, many computational models have been developed for accurate allosteric site prediction. Most of these models focus on designing a general rule that can be applied to pockets of proteins from various families. In this study, we present a new approach using the concept of Learning to Rank (LTR). The LTR model ranks pockets based on their relevance to allosteric sites, that is, how well a pocket meets the characteristics of known allosteric sites. After the training and validation on two datasets, the Allosteric Database (ASD) and CASBench, the LTR model was able to rank an allosteric pocket in the top three positions for 83.6% and 80.5% of test proteins, respectively. The model outperforms other common machine learning models with higher F1 scores (0.662 in ASD and 0.608 in CASBench) and Matthews correlation coefficients (0.645 in ASD and 0.589 in CASBench). The trained model is available on the PASSer platform (https://passer.smu.edu) to aid in drug discovery research.
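The headline figure in the abstract above (an allosteric pocket ranked in the top three for 83.6% and 80.5% of test proteins) is a top-k success rate. The sketch below shows one way such a rate could be computed from per-protein rankings. The function name and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def top_k_success_rate(ranked_labels, k=3):
    """Fraction of proteins with at least one true allosteric pocket in the top k.

    ranked_labels: one list per protein, ordered from best-ranked pocket to worst,
    where each entry is True if that pocket is a known allosteric site.
    """
    if not ranked_labels:
        raise ValueError("need at least one protein")
    hits = sum(1 for labels in ranked_labels if any(labels[:k]))
    return hits / len(ranked_labels)

# Hypothetical rankings for four proteins (not data from the paper).
rankings = [
    [True, False, False, False],   # allosteric pocket ranked 1st: hit
    [False, False, True, False],   # ranked 3rd: hit
    [False, False, False, True],   # ranked 4th: miss for k=3
    [False, True, False],          # ranked 2nd: hit
]
print(top_k_success_rate(rankings, k=3))
```

Scoring per protein rather than per pocket matches the ranking framing: the model only needs to order the pockets within each protein correctly, not assign calibrated probabilities across proteins.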
Affiliation(s)
- Hao Tian
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, USA
- Sian Xiao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, USA
- Xi Jiang
- Department of Statistics, Southern Methodist University, Dallas, Texas, USA
- Peng Tao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas, USA
19
Brahma R, Shin JM, Cho KH. KinScan: AI-based rapid profiling of activity across the kinome. Brief Bioinform 2023; 24:bbad396. [PMID: 37985454 DOI: 10.1093/bib/bbad396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Revised: 09/22/2023] [Accepted: 10/14/2023] [Indexed: 11/22/2023] Open
Abstract
Kinases play a vital role in regulating essential cellular processes, including cell cycle progression, growth, apoptosis, and metabolism, by catalyzing the transfer of phosphate groups from adenosine triphosphate to substrates. Their dysregulation has been closely associated with numerous diseases, including cancer development, making them attractive targets for drug discovery. However, accurately predicting the binding affinity between chemical compounds and kinase targets remains challenging due to the highly conserved structural similarities across the kinome. To address this limitation, we present KinScan, a novel computational approach that leverages large-scale bioactivity data and integrates the Multi-Scale Context Aware Transformer framework to construct a virtual profiling model encompassing 391 protein kinases. The developed model demonstrates exceptional prediction capability, distinguishing between kinases by utilizing structurally aligned kinase binding site features derived from multiple sequence alignment for fast and accurate predictions. Through extensive validation and benchmarking, KinScan demonstrated its robust predictive power and generalizability for large-scale kinome-wide profiling and selectivity, uncovering associations with specific diseases and providing valuable insights into kinase activity profiles of compounds. Furthermore, we deployed a web platform for end-to-end profiling and selectivity analysis, accessible at https://kinscan.drugonix.com/softwares/kinscan.
Affiliation(s)
- Rahul Brahma
- School of Systems Biomedical Science, Soongsil University, Seoul, Republic of Korea
- Jae-Min Shin
- AzothBio, Rm. DA724 Hyundai Knowledge Industry Center, Hanam-si, Gyeonggi-do, Republic of Korea
- Kwang-Hwi Cho
- School of Systems Biomedical Science, Soongsil University, Seoul, Republic of Korea
20
Kim SK. New Sight: Enzymes as Targets for Drug Development. Curr Issues Mol Biol 2023; 45:7650-7652. [PMID: 37754266 PMCID: PMC10528434 DOI: 10.3390/cimb45090482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 09/13/2023] [Indexed: 09/28/2023] Open
Abstract
In the dynamic realm of medical research, a resounding chord is struck by recent studies that have propelled drug discovery to new horizons across a spectrum of disciplines [...].
Affiliation(s)
- Sung-Kun Kim
- Department of Natural Sciences, Northeastern State University, Broken Arrow, OK 74014, USA
21
Zhu Y, Zhao L, Wen N, Wang J, Wang C. DataDTA: a multi-feature and dual-interaction aggregation framework for drug-target binding affinity prediction. Bioinformatics 2023; 39:btad560. [PMID: 37688568 PMCID: PMC10516524 DOI: 10.1093/bioinformatics/btad560] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/03/2022] [Revised: 05/09/2023] [Accepted: 09/07/2023] [Indexed: 09/11/2023] Open
Abstract
MOTIVATION Accurate prediction of drug-target binding affinity (DTA) is crucial for drug discovery. The increase in the publication of large-scale DTA datasets enables the development of various computational methods for DTA prediction. Numerous deep learning-based methods have been proposed to predict affinities, some of which only utilize original sequence information or complex structures, but the effective combination of various information and protein-binding pockets has not been fully explored. Therefore, a new method that integrates available key information is urgently needed to predict DTA and accelerate the drug discovery process. RESULTS In this study, we propose a novel deep learning-based predictor termed DataDTA to estimate the affinities of drug-target pairs. DataDTA utilizes descriptors of predicted pockets and sequences of proteins, as well as low-dimensional molecular features and SMILES strings of compounds as inputs. Specifically, the pockets were predicted from the three-dimensional structure of proteins and their descriptors were extracted as the partial input features for DTA prediction. The molecular representation of compounds based on algebraic graph features was collected to supplement the input information of targets. Furthermore, to ensure effective learning of multiscale interaction features, a dual-interaction aggregation neural network strategy was developed. DataDTA was compared with state-of-the-art methods on different datasets, and the results showed that DataDTA is a reliable prediction tool for affinity estimation. Specifically, the concordance index (CI) of DataDTA is 0.806 and the Pearson correlation coefficient (R) value is 0.814 on the test dataset, both higher than those of other methods. AVAILABILITY AND IMPLEMENTATION The codes and datasets of DataDTA are available at https://github.com/YanZhu06/DataDTA.
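The abstract above reports a concordance index (CI) of 0.806. For readers unfamiliar with the metric, here is a minimal sketch of the usual pairwise definition: among all pairs with different true affinities, the fraction whose predicted order matches the true order, with predicted ties counted as one half. The function name and the toy values are hypothetical, and the DataDTA code may compute CI differently.

```python
def concordance_index(y_true, y_pred):
    """Pairwise concordance index; O(n^2), intended for illustration only."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity are not comparable
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5  # a tie in the prediction counts as half
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical affinities (for example pKd values) and predictions.
true_affinity = [5.0, 6.2, 7.1, 8.4]
predicted = [5.3, 6.0, 7.5, 7.4]
print(concordance_index(true_affinity, predicted))
```

A CI of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, so CI rewards getting the order of affinities right even when the absolute values are off.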
Affiliation(s)
- Yan Zhu
- Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Lingling Zhao
- Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
- Naifeng Wen
- School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, China
- Junjie Wang
- Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chunyu Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
22
Zhang W, Liu B. iSnoDi-MDRF: Identifying snoRNA-Disease Associations Based on Multiple Biological Data by Ranking Framework. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3013-3019. [PMID: 37030816 DOI: 10.1109/tcbb.2023.3258448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Accumulating evidence indicates that the dysregulation of small nucleolar RNAs (snoRNAs) is associated with diseases. Identifying snoRNA-disease associations by computational methods is desirable for biologists because it can save considerable cost and time compared with biological experiments. However, it still faces the following challenges: (i) Many snoRNAs have been detected in recent years, but only a few have been proved to be associated with diseases; (ii) Computational predictors trained with only a few known snoRNA-disease associations fail to accurately identify the snoRNA-disease associations. In this study, we propose a ranking framework, called iSnoDi-MDRF, to identify potential snoRNA-disease associations based on multiple biological data, which has the following highlights: (i) iSnoDi-MDRF integrates a ranking framework, which can identify not only potential associations between known snoRNAs and diseases but also diseases associated with new snoRNAs. (ii) Known gene-disease associations are employed to help train a mature model for predicting snoRNA-disease associations. Experimental results illustrate that iSnoDi-MDRF is well suited to identifying potential snoRNA-disease associations. The web server of the iSnoDi-MDRF predictor is freely available at http://bliulab.net/iSnoDi-MDRF/.
23
Wang Y, Zhang Y, Wang J, Xie F, Zheng D, Zou X, Guo M, Ding Y, Wan J, Han K. Prediction of drug-target interactions via neural tangent kernel extraction feature matrix factorization model. Comput Biol Med 2023; 159:106955. [PMID: 37094465 DOI: 10.1016/j.compbiomed.2023.106955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 04/04/2023] [Accepted: 04/16/2023] [Indexed: 04/26/2023]
Abstract
Drug discovery is a complex and lengthy process that often requires years of research and development. Therefore, drug research and development require a lot of investment and resource support, as well as professional knowledge, technology, skills, and other elements. Predicting drug-target interactions (DTIs) is an important part of drug development. If machine learning is used to predict DTIs, the cost and time of drug development can be significantly reduced. Currently, machine learning methods are widely used to predict DTIs. In this study, we use a neighborhood regularized logistic matrix factorization method based on features extracted from a neural tangent kernel (NTK) to predict DTIs. First, the potential feature matrix of drugs and targets is extracted from the NTK model, then the corresponding Laplacian matrix is constructed according to the feature matrix. Next, the Laplacian matrix of the drugs and targets is used as the condition for matrix factorization to obtain two low-dimensional matrices. Finally, the matrix of the predicted DTIs is obtained by multiplying these two low-dimensional matrices. On the four gold standard datasets, the present method is significantly better than the other methods it is compared with, indicating that the automatic feature extraction method using the deep learning model is competitive with the manual feature selection method.
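Two of the steps named in the abstract above, building a Laplacian matrix and multiplying two low-dimensional matrices to obtain predicted DTIs, can be illustrated with a small sketch. It assumes the common unnormalized Laplacian L = D - W over a similarity matrix W and a sigmoid link on the factor product, as in generic logistic matrix factorization. The function names, the similarity matrix, and the factor values are all hypothetical; the paper's NTK feature extraction and training procedure are not reproduced here.

```python
import math

def graph_laplacian(similarity):
    """Return L = D - W for a symmetric similarity matrix W given as nested lists."""
    n = len(similarity)
    laplacian = []
    for i in range(n):
        degree = sum(similarity[i])  # row sum, i.e. the i-th diagonal entry of D
        laplacian.append(
            [(degree if i == j else 0.0) - similarity[i][j] for j in range(n)]
        )
    return laplacian

def predict_interactions(drug_factors, target_factors):
    """Return P with P[i][j] = sigmoid(u_i . v_j) for drug row u_i and target row v_j."""
    scores = []
    for u in drug_factors:
        row = []
        for v in target_factors:
            dot = sum(a * b for a, b in zip(u, v))
            row.append(1.0 / (1.0 + math.exp(-dot)))
        scores.append(row)
    return scores

# Hypothetical toy inputs: 3 drugs, 2 targets, rank-2 factors.
drug_similarity = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
U = [[0.9, 0.1], [0.8, 0.2], [-0.5, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
L = graph_laplacian(drug_similarity)
P = predict_interactions(U, V)
print(P)
```

In neighborhood-regularized factorization the Laplacian typically enters the training loss as a penalty that pulls the factors of similar drugs (or targets) toward each other; only the final prediction step is shown here.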
Affiliation(s)
- Yu Wang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
- Yu Zhang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
- Jianchun Wang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
- Fang Xie
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
- Dequan Zheng
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
- Xiang Zou
- Pharmaceutical Engineering Technology Research Center, Harbin University of Commerce, Harbin, 150076, China
- Mian Guo
- Department of Neurosurgery, The Second Affiliated Hospital of Harbin Medical University, 150086, China
- Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China.
- Jie Wan
- Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin, 150001, China.
- Ke Han
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China; Pharmaceutical Engineering Technology Research Center, Harbin University of Commerce, Harbin, 150076, China.
24
Wen J, Gan H, Yang Z, Zhou R, Zhao J, Ye Z. Mutual-DTI: A mutual interaction feature-based neural network for drug-target protein interaction prediction. Mathematical Biosciences and Engineering : MBE 2023; 20:10610-10625. [PMID: 37322951 DOI: 10.3934/mbe.2023469] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/17/2023]
Abstract
The prediction of drug-target protein interaction (DTI) is a crucial task in the development of new drugs in modern medicine. Accurately identifying DTI through computer simulations can significantly reduce development time and costs. In recent years, many sequence-based DTI prediction methods have been proposed, and introducing attention mechanisms has improved their forecasting performance. However, these methods have some shortcomings. For example, inappropriate dataset partitioning during data preprocessing can lead to overly optimistic prediction results. Additionally, only single non-covalent intermolecular interactions are considered in the DTI simulation, ignoring the complex interactions between their internal atoms and amino acids. In this paper, we propose a network model called Mutual-DTI that predicts DTI based on the interaction properties of sequences and a Transformer model. We use multi-head attention to extract the long-distance interdependent features of the sequence and introduce a module to extract the sequence's mutual interaction features in mining complex reaction processes of atoms and amino acids. We evaluate the experiments on two benchmark datasets, and the results show that Mutual-DTI outperforms the latest baseline significantly. In addition, we conduct ablation experiments on a label-inversion dataset that is split more rigorously. The results show that there is a significant improvement in the evaluation metrics after introducing the extracted sequence interaction feature module. This suggests that Mutual-DTI may contribute to modern medical drug development research. The experimental results show the effectiveness of our approach. The code for Mutual-DTI can be downloaded from https://github.com/a610lab/Mutual-DTI.
Affiliation(s)
- Jiahui Wen
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- Haitao Gan
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei University, Wuhan 430062, China
- Zhi Yang
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei University, Wuhan 430062, China
- Ran Zhou
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
- Jing Zhao
- State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei University, Wuhan 430062, China
- Zhiwei Ye
- School of Computer Science, Hubei University of Technology, Wuhan 430068, China
25
Voitsitskyi T, Stratiichuk R, Koleiev I, Popryho L, Ostrovsky Z, Henitsoi P, Khropachov I, Vozniak V, Zhytar R, Nechepurenko D, Yesylevskyy S, Nafiiev A, Starosyla S. 3DProtDTA: a deep learning model for drug-target affinity prediction based on residue-level protein graphs. RSC Adv 2023; 13:10261-10272. [PMID: 37006369 PMCID: PMC10065141 DOI: 10.1039/d3ra00281k] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/14/2023] [Accepted: 03/26/2023] [Indexed: 04/03/2023] Open
Abstract
Accurate prediction of the drug-target affinity (DTA) in silico is of critical importance for modern drug discovery. Computational methods of DTA prediction, applied in the early stages of drug development, are able to speed it up and cut its cost significantly. A wide range of approaches based on machine learning were recently proposed for DTA assessment. The most promising of them are based on deep learning techniques and graph neural networks to encode molecular structures. The recent breakthrough in protein structure prediction achieved by AlphaFold has made an unprecedented number of proteins without experimentally defined structures accessible for computational DTA prediction. In this work, we propose a new deep learning DTA model, 3DProtDTA, which utilises AlphaFold structure predictions in conjunction with the graph representation of proteins. The model is superior to its rivals on common benchmarking datasets and has potential for further improvement.
Affiliation(s)
- Taras Voitsitskyi
- Receptor.AI Inc. 20-22 Wenlock Road London N1 7GU UK
- Department of Physics of Biological Systems, Institute of Physics of The National Academy of Sciences of Ukraine Nauky Ave. 46 03038 Kyiv Ukraine
| | - Roman Stratiichuk
- Receptor.AI Inc. 20-22 Wenlock Road London N1 7GU UK
- Department of Biophysics and Medical Informatics, Educational and Scientific Centre "Institute of Biology and Medicine", Taras Shevchenko National University of Kyiv 64 Volodymyrska Str. 01601 Kyiv Ukraine
| | - Ihor Koleiev
- Receptor.AI Inc. 20-22 Wenlock Road London N1 7GU UK
| | | | | | | | | | | | - Roman Zhytar
- Receptor.AI Inc. 20-22 Wenlock Road London N1 7GU UK
| | | | - Semen Yesylevskyy
- Receptor.AI Inc. 20-22 Wenlock Road London N1 7GU UK
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences CZ-166 10 Prague 6 Czech Republic
- Department of Physics of Biological Systems, Institute of Physics of The National Academy of Sciences of Ukraine Nauky Ave. 46 03038 Kyiv Ukraine
- Alan Nafiiev
- Receptor.AI Inc. 20-22 Wenlock Road London N1 7GU UK

26
Guo B, Zheng H, Jiang H, Li X, Guan N, Zuo Y, Zhang Y, Yang H, Wang X. Enhanced compound-protein binding affinity prediction by representing protein multimodal information via a coevolutionary strategy. Brief Bioinform 2023; 24:6995409. [PMID: 36682005 DOI: 10.1093/bib/bbac628] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/15/2022] [Revised: 12/12/2022] [Accepted: 12/25/2022] [Indexed: 01/23/2023] Open
Abstract
Due to the lack of a method to efficiently represent the multimodal information of a protein, including its structure and sequence information, predicting compound-protein binding affinity (CPA) still suffers from low accuracy when applying machine-learning methods. To overcome this limitation, in a novel end-to-end architecture (named FeatNN), we develop a coevolutionary strategy to jointly represent the structure and sequence features of proteins and ultimately optimize the mathematical models for predicting CPA. Furthermore, from a data-driven perspective, we propose a rational method that can utilize both high- and low-quality databases to optimize the accuracy and generalization ability of FeatNN in CPA prediction tasks. Notably, we visually interpret the feature interaction process between sequence and structure in the rationally designed architecture. As a result, FeatNN considerably outperforms the state-of-the-art (SOTA) baseline in virtual drug evaluation tasks, indicating the feasibility of this approach for practical use. FeatNN provides an outstanding method for higher CPA prediction accuracy and better generalization ability by efficiently representing multimodal information of proteins via a coevolutionary strategy.
Affiliation(s)
- Binjie Guo
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Hanyu Zheng
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Haohan Jiang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Xiaodan Li
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Naiyu Guan
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Yanming Zuo
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Yicheng Zhang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Hengfu Yang
- School of Computer Science, Hunan First Normal University, Changsha, 410205 Hunan, China
- Xuhua Wang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310058, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou 310058, China
- Co-innovation Center of Neuroregeneration, Nantong University, Nantong, 226001 Jiangsu, China

27
Kalemati M, Zamani Emani M, Koohi S. BiComp-DTA: Drug-target binding affinity prediction through complementary biological-related and compression-based featurization approach. PLoS Comput Biol 2023; 19:e1011036. [PMID: 37000857 PMCID: PMC10096306 DOI: 10.1371/journal.pcbi.1011036] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 06/05/2022] [Revised: 04/12/2023] [Accepted: 03/20/2023] [Indexed: 04/03/2023] Open
Abstract
Drug-target binding affinity prediction plays a key role in the early stage of drug discovery. Numerous experimental and data-driven approaches have been developed for predicting drug-target binding affinity. However, experimental methods rely heavily on the limited structure-related information from drug-target pairs, domain knowledge, and time-consuming assays. On the other hand, learning-based methods have shown acceptable prediction performance. However, most of them use several simple and complex types of protein and drug compound data, ranging from protein sequences to the topology of a graph representation of drug compounds, and employ multiple deep neural networks for encoding and feature extraction, which leads to computational overhead. In this study, we propose a unified measure for protein sequence encoding, named BiComp, which provides compression-based and evolutionary-related features from the protein sequences. Specifically, we employ Normalized Compression Distance and Smith-Waterman measures for capturing complementary information from the algorithmic information theory and biological domains, respectively. We utilize the proposed measure to encode the input proteins feeding a new deep neural network-based method for drug-target binding affinity prediction, named BiComp-DTA. BiComp-DTA is evaluated utilizing four benchmark datasets for drug-target binding affinity prediction. Compared to the state-of-the-art methods, which employ complex models for protein encoding and feature extraction, BiComp-DTA provides superior efficiency in terms of accuracy, runtime, and the number of trainable parameters. The latter achievement allows BiComp-DTA to run quickly on a normal desktop computer. As a comparative study, we evaluate BiComp's efficiency against its components for drug-target binding affinity prediction.
The results have shown superior accuracy of BiComp due to the orthogonality and complementary nature of Smith-Waterman and Normalized Compression Distance measures for protein sequences. Such a protein sequence encoding provides efficient representation with no need for multiple sources of information, deep domain knowledge, and complex neural networks.
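The Normalized Compression Distance mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is a compressed length, and it uses zlib as the compressor purely for illustration; the abstract does not state which compressor BiComp uses, and the function names here are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (illustrative compressor choice)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two sequences.

    Values near 0 suggest the sequences share most of their information;
    values near 1 suggest they share little.
    """
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

With a real compressor the measure is only approximate (it is not guaranteed to be symmetric or exactly zero for identical inputs), which is one reason a sequence-alignment score such as Smith-Waterman can carry complementary information.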
Affiliation(s)
- Mahmood Kalemati
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Mojtaba Zamani Emani
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Somayyeh Koohi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran

28
Chen L, Yu L, Gao L. Potent antibiotic design via guided search from antibacterial activity evaluations. Bioinformatics 2023; 39:btad059. [PMID: 36707990 PMCID: PMC9897189 DOI: 10.1093/bioinformatics/btad059] [Citation(s) in RCA: 51] [Impact Index Per Article: 25.5] [Received: 09/26/2022] [Revised: 01/14/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
MOTIVATION The emergence of drug-resistant bacteria makes the discovery of new antibiotics an urgent issue, but finding new molecules with the desired antibacterial activity is an extremely difficult task. To address this challenge, we established a framework, MDAGS (Molecular Design via Attribute-Guided Search), to optimize and generate potent antibiotic molecules. RESULTS By designing the antibacterial activity latent space and guiding the optimization of functional compounds based on this space, the model MDAGS can generate novel compounds with desirable antibacterial activity without the need for extensive expensive and time-consuming evaluations. Compared with existing antibiotics, candidate antibacterial compounds generated by MDAGS always possessed significantly better antibacterial activity and ensured high similarity. Furthermore, although without explicit constraints on similarity to known antibiotics, these candidate antibacterial compounds all exhibited the highest structural similarity to antibiotics of expected function in the DrugBank database query. Overall, our approach provides a viable solution to the problem of bacterial drug resistance. AVAILABILITY AND IMPLEMENTATION Code of the model and datasets can be downloaded from GitHub (https://github.com/LiangYu-Xidian/MDAGS). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lu Chen
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China

29
Liang Q, Zhang W, Wu H, Liu B. LncRNA-disease association identification using graph auto-encoder and learning to rank. Brief Bioinform 2023; 24:6955271. [PMID: 36545805 DOI: 10.1093/bib/bbac539] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/20/2022] [Revised: 10/18/2022] [Accepted: 11/08/2022] [Indexed: 12/24/2022] Open
Abstract
Discovering the relationships between long non-coding RNAs (lncRNAs) and diseases is significant in the treatment, diagnosis and prevention of diseases. However, the currently identified lncRNA-disease associations are insufficient because wet laboratory experiments are expensive and labor-intensive. Therefore, it is important to develop an efficient computational method for predicting potential lncRNA-disease associations. Previous methods showed that combining the lncRNA-disease associations predicted by different classification methods via a Learning to Rank (LTR) algorithm can be effective for predicting potential lncRNA-disease associations. However, when the classification results are incorrect, the ranking results will inevitably be affected. We propose the GraLTR-LDA predictor, based on biological knowledge graphs and a ranking framework, for predicting potential lncRNA-disease associations. First, homogeneous and heterogeneous graphs are constructed by integrating multi-source biological information. Then, GraLTR-LDA integrates a graph auto-encoder and an attention mechanism to extract embedded features from the constructed graphs. Finally, GraLTR-LDA incorporates the embedded features into the LTR via feature crossing statistical strategies to predict the priority order of diseases associated with query lncRNAs. Experimental results demonstrate that GraLTR-LDA outperforms the other state-of-the-art predictors and can effectively detect potential lncRNA-disease associations. Availability and implementation: Datasets and source codes are available at http://bliulab.net/GraLTR-LDA.
Affiliation(s)
- Qi Liang
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Wenxiang Zhang
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Hao Wu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China

30
Hyperbolic matrix factorization improves prediction of drug-target associations. Sci Rep 2023; 13:959. [PMID: 36653463 PMCID: PMC9849222 DOI: 10.1038/s41598-023-27995-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/24/2022] [Accepted: 01/11/2023] [Indexed: 01/19/2023] Open
Abstract
Past research in computational systems biology has focused more on the development and applications of advanced statistical and numerical optimization techniques and much less on understanding the geometry of the biological space. By representing biological entities as points in a low dimensional Euclidean space, state-of-the-art methods for drug-target interaction (DTI) prediction implicitly assume the flat geometry of the biological space. In contrast, recent theoretical studies suggest that biological systems exhibit tree-like topology with a high degree of clustering. As a consequence, embedding a biological system in a flat space leads to distortion of distances between biological objects. Here, we present a novel matrix factorization methodology for drug-target interaction prediction that uses hyperbolic space as the latent biological space. When benchmarked against classical, Euclidean methods, hyperbolic matrix factorization exhibits superior accuracy while lowering embedding dimension by an order of magnitude. We see this as additional evidence that the hyperbolic geometry underpins large biological networks.
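To make the contrast between flat and hyperbolic latent spaces concrete, the sketch below computes distances in the Poincaré ball, one common model of hyperbolic space. This is an assumption for illustration only: the abstract does not say which hyperbolic model or scoring function the method uses, and `poincare_distance` is a hypothetical helper name.

```python
import math
from typing import Sequence


def poincare_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Hyperbolic distance between two points strictly inside the unit ball.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(c * c for c in u)
    sq_norm_v = sum(c * c for c in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))
```

Distances grow rapidly toward the boundary of the ball, so tree-like hierarchies can be laid out with little distortion in few dimensions, which is consistent with the lower embedding dimension reported above.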

31
Lin S, Zhang G, Wei DQ, Xiong Y. DeepPSE: Prediction of polypharmacy side effects by fusing deep representation of drug pairs and attention mechanism. Comput Biol Med 2022; 149:105984. [PMID: 35994933 DOI: 10.1016/j.compbiomed.2022.105984] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/22/2022] [Revised: 07/17/2022] [Accepted: 08/14/2022] [Indexed: 11/18/2022]
Abstract
Polypharmacy (the use of multiple drugs) is an effective strategy for combating complex or co-existing diseases. However, a major consequence of polypharmacy is a higher risk of adverse side effects due to drug-drug interactions, which are rare and observed in relatively small clinical testing. Thus, identification of polypharmacy side effects remains challenging. Here, we propose a deep learning-based method, DeepPSE, to predict polypharmacy side effects end to end. DeepPSE is composed of two main modules. First, multiple types of neural networks are constructed and fused to learn the deep representation of a drug pair. Second, a transformer encoder block with a self-attention mechanism is built to extract latent features, which are then fed into a fully connected layer to predict polypharmacy side effects of drug pairs. Further, DeepPSE is compared with five baseline or state-of-the-art methods on a benchmark dataset of 964 types of polypharmacy side effects across 63473 drug pairs. Experimental results demonstrate that DeepPSE outperforms all five methods. The source code and data are available at https://github.com/ShenggengLin/DeepPSE.
Affiliation(s)
- Shenggeng Lin
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Guangwei Zhang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China; Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nayang, Henan, 473006, China; Peng Cheng National Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong, 518055, China.
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China.