Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Long Y, Wu M, Liu Y, Fang Y, Kwoh CK, Chen J, Luo J, Li X. Pre-training graph neural networks for link prediction in biomedical networks. Bioinformatics 2022;38:2254-2262. [PMID: 35171981 DOI: 10.1093/bioinformatics/btac100] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 10/20/2021] [Revised: 01/15/2022] [Accepted: 02/14/2022] [Indexed: 02/03/2023] Open

Cited by Other Article(s)

Wang Z, Meng J, Li H, Dai Q, Lin X, Luan Y. Attention-augmented multi-domain cooperative graph representation learning for molecular interaction prediction. Neural Netw 2025;186:107265. [PMID: 39987715 DOI: 10.1016/j.neunet.2025.107265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2024] [Revised: 01/23/2025] [Accepted: 02/07/2025] [Indexed: 02/25/2025]

Chen X, Cai R, Huang Z, Li Z, Zheng J, Wu M. Interpretable high-order knowledge graph neural network for predicting synthetic lethality in human cancers. Brief Bioinform 2025;26:bbaf142. [PMID: 40194555 PMCID: PMC11975366 DOI: 10.1093/bib/bbaf142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Revised: 02/21/2025] [Accepted: 03/07/2025] [Indexed: 04/09/2025] Open

Zhang X, Liu Q. A graph neural network approach for hierarchical mapping of breast cancer protein communities. BMC Bioinformatics 2025;26:23. [PMID: 39838298 PMCID: PMC11749236 DOI: 10.1186/s12859-024-06015-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Accepted: 12/16/2024] [Indexed: 01/23/2025] Open

Abstract

BACKGROUND

Comprehensively mapping the hierarchical structure of breast cancer protein communities and identifying potential biomarkers from them is a promising direction for breast cancer research. Existing approaches are subjective and do not take information from protein sequences into account. Deep learning can automatically learn features from protein sequences and protein-protein interactions for hierarchical clustering.

RESULTS

Using a large amount of publicly available proteomics data, we created a hierarchical tree for breast cancer protein communities using a novel hierarchical graph neural network, with the supervision of gene ontology terms and assistance of a pre-trained deep contextual language model. Then, a group-lasso algorithm was applied to identify protein communities that are under both mutation burden and survival burden, undergo significant alterations when targeted by specific drug molecules, and show cancer-dependent perturbations. The resulting hierarchical map of protein communities shows how gene-level mutations and survival information converge on protein communities at different scales. Internal validity of the model was established through the convergence on BRCA2 as a breast cancer hotspot. Further overlaps with breast cancer cell dependencies revealed SUPT6H and RAD21, along with their respective protein systems, HOST:37 and HOST:861, as potential biomarkers. Using gene-level perturbation data of the HOST:37 and HOST:861 gene sets, three FDA-approved drugs with high therapeutic value were selected as potential treatments to be further evaluated. These drugs include mercaptopurine, pioglitazone, and colchicine.

CONCLUSION

The proposed graph neural network approach to analyzing breast cancer protein communities in a hierarchical structure provides a novel perspective on breast cancer prognosis and treatment. By targeting entire gene sets, we were able to evaluate the prognostic and therapeutic value of genes (or gene sets) at different levels, from gene-level to system-level biology. Cancer-specific gene dependencies provide additional context for pinpointing cancer-related systems and drug-induced alterations can highlight potential therapeutic targets. These identified protein communities, in conjunction with other protein communities under strong mutation and survival burdens, can potentially be used as clinical biomarkers for breast cancer.

Sisakht M, Shahrestanaki MK, Fallahi J, Razban V. PyComp: A Versatile Tool for Efficient Data Extraction, Conversion, and Management in High-throughput Virtual Drug Screening. Curr Comput Aided Drug Des 2025;21:479-486. [PMID: 38192133 DOI: 10.2174/0115734099274495231218150611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 11/29/2023] [Accepted: 12/02/2023] [Indexed: 01/10/2024]

Lin CX, Li HD, Wang J. LIMO-GCN: a linear model-integrated graph convolutional network for predicting Alzheimer disease genes. Brief Bioinform 2024;26:bbae611. [PMID: 39592152 PMCID: PMC11596108 DOI: 10.1093/bib/bbae611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 10/02/2024] [Accepted: 11/11/2024] [Indexed: 11/28/2024] Open

Marandon A, Rebafka T, Sokolovska N, Soula H. Conformal novelty detection for multiple metabolic networks. BMC Bioinformatics 2024;25:358. [PMID: 39550534 PMCID: PMC11569617 DOI: 10.1186/s12859-024-05971-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Accepted: 10/25/2024] [Indexed: 11/18/2024] Open

Abstract

BACKGROUND

Graphical representations are useful to model complex data in general and biological interactions in particular. Our main motivation is the comparison of metabolic networks in the wider context of developing noninvasive accurate diagnostic tools. However, comparison and classification of graphs is still extremely challenging, although a number of highly efficient methods such as graph neural networks were developed in the recent decade. Important aspects are still lacking in graph classification: interpretability and guarantees on classification quality, i.e., control of the risk level or false discovery rate control.

RESULTS

In our contribution, we introduce a statistically sound approach to control the false discovery rate in a classification task for graphs in a semi-supervised setting. Our procedure identifies novelties in a dataset, where a graph is considered to be a novelty when its topology is significantly different from those in the reference class. Notably, the procedure is a conformal prediction approach: it makes no distributional assumptions about the data and can be seen as a wrapper around traditional machine learning models, so that it takes full advantage of existing methods. The performance of the proposed method is assessed on several standard benchmarks. It is also adapted and applied to the difficult task of classifying metabolic networks, where each graph is a representation of all metabolic reactions of a bacterium, and to a real task from a cancer data repository.

CONCLUSIONS

Our approach efficiently controls the false discovery rate in highly complex data while maximizing the true discovery rate to get the most reasonable predictive performance. This contribution is focused on confident classification of complex data, which can be further used to explore complex human pathologies and their mechanisms.
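
The general recipe behind conformal novelty detection with false discovery rate control can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the novelty scores are plain numbers chosen for the example (in the paper they would come from a model applied to graphs), higher scores are assumed to mean "more novel", and the function names `conformal_p_values` and `benjamini_hochberg` are hypothetical.

```python
def conformal_p_values(cal_scores, test_scores):
    """Conformal p-value of each test score against reference-class scores.

    Assumes a higher score means 'more novel'. A test point gets a small
    p-value when few calibration scores are at least as large as its own.
    """
    n = len(cal_scores)
    return [(1 + sum(c >= t for c in cal_scores)) / (n + 1) for t in test_scores]


def benjamini_hochberg(p_values, alpha):
    """Indices rejected by the Benjamini-Hochberg step-up rule at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k_max = rank  # largest rank whose p-value clears its threshold
    return sorted(order[:k_max])


# Illustrative numbers only: 19 calibration scores from the reference class
# and 4 test scores, two of which lie well outside the calibration range.
calibration = list(range(1, 20))
test = [25, 30, 7, 12]

p_vals = conformal_p_values(calibration, test)
novelties = benjamini_hochberg(p_vals, alpha=0.2)
print(p_vals)
print(novelties)
```

The scoring function is deliberately left outside the sketch, which mirrors the "wrapper" idea in the abstract: any model that assigns a novelty score can be plugged in without changing the p-value or rejection steps.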

Feng Y, Long Y, Wang H, Ouyang Y, Li Q, Wu M, Zheng J. Benchmarking machine learning methods for synthetic lethality prediction in cancer. Nat Commun 2024;15:9058. [PMID: 39428397 PMCID: PMC11491473 DOI: 10.1038/s41467-024-52900-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 09/23/2024] [Indexed: 10/22/2024] Open

Tian Z, Yu Y, Ni F, Zou Q. Drug-target interaction prediction with collaborative contrastive learning and adaptive self-paced sampling strategy. BMC Biol 2024;22:216. [PMID: 39334132 PMCID: PMC11437672 DOI: 10.1186/s12915-024-02012-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Accepted: 09/06/2024] [Indexed: 09/30/2024] Open

Cingiz MÖ. Ensemble decision of local similarity indices on the biological network for disease related gene prediction. PeerJ 2024;12:e17975. [PMID: 39247551 PMCID: PMC11380840 DOI: 10.7717/peerj.17975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/05/2024] [Indexed: 09/10/2024] Open

Abstract

Link prediction (LP) is the task of identifying potential, missing and spurious links in complex networks. Protein-protein interaction (PPI) networks are important for understanding the underlying biological mechanisms of diseases. Many complex networks have been constructed using LP methods; however, only a limited number of studies focus on disease-related gene prediction and evaluate the predicted genes using various evaluation criteria. The main objective of the study is to investigate the effect of a simple ensemble method on disease-related gene prediction. Disease-related gene predictions based on local similarity indices (LSIs) were integrated by a simple ensemble decision method, simple majority voting (SMV), on the PPI network to detect disease-related genes accurately. The human PPI network was used to discover potential disease-related genes with four LSIs. The LSIs discovered potential links between disease-related genes, which were obtained from the OMIM database for gastric, colorectal, breast, prostate and lung cancers. LSI-based disease-related genes were ranked by their LSI scores in descending order to retrieve the top 10, 50 and 100 disease-related genes. SMV integrated the four LSI-based predictions to obtain the SMV-based top 10, 50 and 100 disease-related genes. The performance of the LSI-based and SMV-based genes was evaluated separately by overlap analyses, which were performed with the GeneCard disease-gene relation dataset and Gene Ontology (GO) terms. The GO terms were used for biological assessment of the gene lists inferred by the LSIs and SMV on all cancer types. The Adamic-Adar (AA), Resource Allocation Index (RAI), and SMV-based gene lists generally achieved good performance on all cancers in both overlap analyses. SMV also outperformed on breast cancer data. Increasing the number of top-ranked disease-related genes selected also improved the performance of SMV.
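
To make the idea of combining local similarity indices by simple majority voting concrete, here is a minimal sketch on a toy graph. Several things are assumptions for illustration: the abstract names only Adamic-Adar and the Resource Allocation Index, so common neighbors and Jaccard are stand-ins for the other two indices; the graph is a made-up five-node example rather than PPI data; the sketch votes on candidate links rather than on genes; and "majority" is taken to mean "in the top k of more than half of the indices".

```python
import math
from itertools import combinations

# Toy undirected graph as adjacency sets (hypothetical nodes, not PPI data).
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D", "E"},
    "D": {"A", "C", "E"},
    "E": {"C", "D"},
}


def common_neighbors(g, u, v):
    return len(g[u] & g[v])


def jaccard(g, u, v):
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


def adamic_adar(g, u, v):
    # A shared neighbor is adjacent to both u and v, so its degree is >= 2.
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v])


def resource_allocation(g, u, v):
    return sum(1.0 / len(g[z]) for z in g[u] & g[v])


def top_k(g, score, k):
    """Top-k non-adjacent node pairs under one similarity index."""
    candidates = [(u, v) for u, v in combinations(sorted(g), 2) if v not in g[u]]
    ranked = sorted(candidates, key=lambda p: score(g, *p), reverse=True)
    return set(ranked[:k])


def simple_majority_vote(g, scorers, k):
    """Keep pairs that appear in the top-k list of more than half the indices."""
    votes = {}
    for score in scorers:
        for pair in top_k(g, score, k):
            votes[pair] = votes.get(pair, 0) + 1
    return {pair for pair, n in votes.items() if n > len(scorers) / 2}


SCORERS = [common_neighbors, jaccard, adamic_adar, resource_allocation]
selected = simple_majority_vote(graph, SCORERS, k=2)
print(sorted(selected))
```

On this toy graph the four indices agree, so the vote is unanimous; on real networks the indices rank candidates differently, which is where a voting step can change the final list.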

Ohnuki Y, Akiyama M, Sakakibara Y. Deep learning of multimodal networks with topological regularization for drug repositioning. J Cheminform 2024;16:103. [PMID: 39180095 PMCID: PMC11342530 DOI: 10.1186/s13321-024-00897-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Accepted: 08/12/2024] [Indexed: 08/26/2024] Open

Trevena W, Zhong X, Lal A, Rovati L, Cubro E, Dong Y, Schulte P, Gajic O. Model-driven engineering for digital twins: a graph model-based patient simulation application. Front Physiol 2024;15:1424931. [PMID: 39189027 PMCID: PMC11345177 DOI: 10.3389/fphys.2024.1424931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Accepted: 07/19/2024] [Indexed: 08/28/2024] Open

Li M, Wang Z, Liu L, Liu X, Zhang W. Subgraph-Aware Graph Kernel Neural Network for Link Prediction in Biological Networks. IEEE J Biomed Health Inform 2024;28:4373-4381. [PMID: 38630566 DOI: 10.1109/jbhi.2024.3390092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]

Hu Y, Liao T, Chen J, Bian J, Zheng Z, Chen C. Migrate demographic group for fair Graph Neural Networks. Neural Netw 2024;175:106264. [PMID: 38581810 DOI: 10.1016/j.neunet.2024.106264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 03/11/2024] [Accepted: 03/20/2024] [Indexed: 04/08/2024]

Yang Z, Wang L, Zhang X, Zeng B, Zhang Z, Liu X. LCASPMDA: a computational model for predicting potential microbe-drug associations based on learnable graph convolutional attention networks and self-paced iterative sampling ensemble. Front Microbiol 2024;15:1366272. [PMID: 38846568 PMCID: PMC11153849 DOI: 10.3389/fmicb.2024.1366272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 05/06/2024] [Indexed: 06/09/2024] Open

Thakur GK, Thakur A, Kulkarni S, Khan N, Khan S. Deep Learning Approaches for Medical Image Analysis and Diagnosis. Cureus 2024;16:e59507. [PMID: 38826977 PMCID: PMC11144045 DOI: 10.7759/cureus.59507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Accepted: 05/01/2024] [Indexed: 06/04/2024] Open

Liu X, Hu J, Zheng J. SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality. Bioinformatics 2024;40:btae016. [PMID: 38244572 PMCID: PMC10868331 DOI: 10.1093/bioinformatics/btae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 12/10/2023] [Accepted: 01/16/2024] [Indexed: 01/22/2024] Open

Son J, Kim D. Applying network link prediction in drug discovery: an overview of the literature. Expert Opin Drug Discov 2024;19:43-56. [PMID: 37794688 DOI: 10.1080/17460441.2023.2267020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 10/02/2023] [Indexed: 10/06/2023]

Wang Y, Li Z, Rao J, Yang Y, Dai Z. Gene based message passing for drug repurposing. iScience 2023;26:107663. [PMID: 37670781 PMCID: PMC10475505 DOI: 10.1016/j.isci.2023.107663] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/24/2023] [Revised: 08/06/2023] [Accepted: 08/14/2023] [Indexed: 09/07/2023] Open

Che L, Jin Y, Shi Y, Yu X, Sun H, Liu H, Li X. A drug molecular classification model based on graph structure generation. J Biomed Inform 2023;145:104447. [PMID: 37481052 DOI: 10.1016/j.jbi.2023.104447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 07/14/2023] [Accepted: 07/16/2023] [Indexed: 07/24/2023]

Du BX, Long Y, Li X, Wu M, Shi JY. CMMS-GCL: cross-modality metabolic stability prediction with graph contrastive learning. Bioinformatics 2023;39:btad503. [PMID: 37572298 PMCID: PMC10457661 DOI: 10.1093/bioinformatics/btad503] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/10/2023] [Revised: 07/26/2023] [Accepted: 08/11/2023] [Indexed: 08/14/2023] Open

Abstract

MOTIVATION

Metabolic stability plays a crucial role in the early stages of drug discovery and development. Accurately modeling and predicting molecular metabolic stability has great potential for the efficient screening of drug candidates as well as the optimization of lead compounds. Because wet-lab experiments are time-consuming, laborious, and expensive, in silico prediction of metabolic stability offers an alternative. However, few computational methods have been developed to address this task. In addition, it remains a significant challenge to explain the key functional groups that determine metabolic stability.

RESULTS

To address these issues, we develop a novel cross-modality graph contrastive learning model named CMMS-GCL for predicting the metabolic stability of drug candidates. In our framework, we design deep learning methods to extract features for molecules from two data modalities, i.e. SMILES sequences and molecular graphs. In particular, for the sequence data, we design a multihead attention BiGRU-based encoder to preserve the context of symbols to learn sequence representations of molecules. For the graph data, we propose a graph contrastive learning-based encoder to learn structure representations by effectively capturing the consistencies between local and global structures. We further exploit fully connected neural networks to combine the sequence and structure representations for model training. Extensive experimental results on two datasets demonstrate that our CMMS-GCL consistently outperforms seven state-of-the-art methods. Furthermore, a collection of case studies on sequence data and statistical analyses of the graph structure module strengthens the validation of the interpretability of crucial functional groups recognized by CMMS-GCL. Overall, CMMS-GCL can serve as an effective and interpretable tool for predicting metabolic stability, identifying critical functional groups, and thus facilitating the drug discovery process and lead compound optimization.

AVAILABILITY AND IMPLEMENTATION

The code and data underlying this article are freely available at https://github.com/dubingxue/CMMS-GCL.

Zhang P, Wang Z, Sun W, Xu J, Zhang W, Wu K, Wong L, Li L. RDRGSE: A Framework for Noncoding RNA-Drug Resistance Discovery by Incorporating Graph Skeleton Extraction and Attentional Feature Fusion. ACS Omega 2023;8:27386-27397. [PMID: 37546619 PMCID: PMC10398708 DOI: 10.1021/acsomega.3c02763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2023] [Accepted: 07/06/2023] [Indexed: 08/08/2023]

Abstract

Computationally identifying noncoding RNA (ncRNA)-drug resistance associations would have a marked effect on understanding ncRNA molecular function and drug target mechanisms and on reducing the screening cost of the corresponding wet-lab experiments. Although graph neural network-based methods have been developed and have facilitated the detection of ncRNAs related to drug resistance, building a highly trustworthy ncRNA-drug resistance association prediction framework remains a challenge because of inevitable noisy edges originating from batch effects and experimental errors. Herein, we proposed a framework, referred to as RDRGSE (RDR association prediction by using graph skeleton extraction and attentional feature fusion), for detecting ncRNA-drug resistance associations. Specifically, starting with the construction of the original ncRNA-drug resistance associations as a bipartite graph, RDRGSE took advantage of a bi-view skeleton extraction strategy to obtain two types of skeleton views, followed by a graph neural network-based estimator that iteratively optimizes the skeleton views to jointly learn high-quality ncRNA-drug resistance edge embeddings and the optimal graph skeleton structure. Then, RDRGSE adopted adaptive attentional feature fusion to obtain the final edge embeddings and identified potential RDRAs in an end-to-end manner. Comprehensive experiments were conducted, and the results indicated the significant advantage of a skeleton structure for ncRNA-drug resistance association discovery. Compared with state-of-the-art approaches, RDRGSE improved the prediction performance by 6.7% in terms of AUC and 6.1% in terms of AUPR. Also, ablation-like analysis and independent case studies corroborated the generalization ability and robustness of RDRGSE. Overall, RDRGSE provides a powerful computational method for ncRNA-drug resistance association prediction, which can also serve as a screening tool for drug resistance biomarkers.

Wang C, Yuan C, Wang Y, Chen R, Shi Y, Zhang T, Xue F, Patti GJ, Wei L, Hou Q. MPI-VGAE: protein-metabolite enzymatic reaction link learning by variational graph autoencoders. Brief Bioinform 2023;24:bbad189. [PMID: 37225420 PMCID: PMC10359079 DOI: 10.1093/bib/bbad189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 04/10/2023] [Accepted: 04/27/2023] [Indexed: 05/26/2023] Open

Abstract

Enzymatic reactions are crucial to explore the mechanistic function of metabolites and proteins in cellular processes and to understand the etiology of diseases. The increasing number of interconnected metabolic reactions allows the development of in silico deep learning-based methods to discover new enzymatic reaction links between metabolites and proteins to further expand the landscape of existing metabolite-protein interactome. Computational approaches to predict the enzymatic reaction link by metabolite-protein interaction (MPI) prediction are still very limited. In this study, we developed a Variational Graph Autoencoders (VGAE)-based framework to predict MPI in genome-scale heterogeneous enzymatic reaction networks across ten organisms. By incorporating molecular features of metabolites and proteins as well as neighboring information in the MPI networks, our MPI-VGAE predictor achieved the best predictive performance compared to other machine learning methods. Moreover, when applying the MPI-VGAE framework to reconstruct hundreds of metabolic pathways, functional enzymatic reaction networks and a metabolite-metabolite interaction network, our method showed the most robust performance among all scenarios. To the best of our knowledge, this is the first MPI predictor by VGAE for enzymatic reaction link prediction. Furthermore, we implemented the MPI-VGAE framework to reconstruct the disease-specific MPI network based on the disrupted metabolites and proteins in Alzheimer's disease and colorectal cancer, respectively. A substantial number of novel enzymatic reaction links were identified. We further validated and explored the interactions of these enzymatic reactions using molecular docking. These results highlight the potential of the MPI-VGAE framework for the discovery of novel disease-related enzymatic reactions and facilitate the study of the disrupted metabolisms in diseases.

Affiliation(s)

Cheng Wang: Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China; National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
Chuang Yuan: Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China; National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
Yahui Wang: Department of Chemistry, Washington University in St. Louis, St. Louis, MO, 63130, USA; Center for Metabolomics and Isotope Tracing, Washington University in St. Louis, St. Louis, MO, 63130, USA
Ranran Chen: Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China; National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
Yuying Shi: Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China; National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
Tao Zhang: Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China; National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
Fuzhong Xue: Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China; National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
Gary J Patti: Department of Chemistry, Washington University in St. Louis, St. Louis, MO, 63130, USA; Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63130, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, 63130, USA; Center for Metabolomics and Isotope Tracing, Washington University in St. Louis, St. Louis, MO, 63130, USA
Leyi Wei: School of Software, Shandong University, Jinan, 250100, China
Qingzhen Hou: Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China; National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China

Mangione W, Falls Z, Samudrala R. Effective holistic characterization of small molecule effects using heterogeneous biological networks. Front Pharmacol 2023;14:1113007. [PMID: 37180722 PMCID: PMC10169664 DOI: 10.3389/fphar.2023.1113007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 04/11/2023] [Indexed: 05/16/2023] Open

Abstract

The two most common reasons for attrition in therapeutic clinical trials are efficacy and safety. We integrated heterogeneous data to create a human interactome network to comprehensively describe drug behavior in biological systems, with the goal of accurate therapeutic candidate generation. The Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multiscale therapeutic discovery, repurposing, and design was enhanced by integrating drug side effects, protein pathways, protein-protein interactions, protein-disease associations, and the Gene Ontology, and complemented with its existing drug/compound, protein, and indication libraries. These integrated networks were reduced to a "multiscale interactomic signature" for each compound that describes its functional behavior as a vector of real values. These signatures are then used for relating compounds to each other with the hypothesis that similar signatures yield similar behavior. Our results indicated that there is significant biological information captured within our networks (particularly via side effects) which enhances the performance of our platform, as evaluated by performing all-against-all leave-one-out drug-indication association benchmarking as well as generating novel drug candidates for colon cancer and migraine disorders corroborated via literature search. Further, drug impacts on pathways derived from computed compound-protein interaction scores served as the features for a random forest machine learning model trained to predict drug-indication associations, with applications to mental disorders and cancer metastasis highlighted. This interactomic pipeline highlights the ability of Computational Analysis of Novel Drug Opportunities to accurately relate drugs in a multitarget and multiscale context, particularly for generating putative drug candidates using the information gleaned from indirect data such as side effect profiles and protein pathway information.
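
The "similar signatures yield similar behavior" hypothesis amounts to ranking compounds by how close their signature vectors are. The sketch below is illustrative only: the abstract does not state which similarity measure the platform uses, so cosine similarity is an assumption, and the compound names and three-dimensional signatures are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length real-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical signatures; real ones would have many more dimensions.
signatures = {
    "compound_1": [1.0, 0.0, 2.0],
    "compound_2": [2.0, 0.0, 4.0],
    "compound_3": [0.0, 3.0, 0.0],
}


def most_similar(query, sigs):
    """Other compounds ordered from most to least similar to the query."""
    others = [name for name in sigs if name != query]
    return sorted(others, key=lambda n: cosine(sigs[query], sigs[n]), reverse=True)


ranking = most_similar("compound_1", signatures)
print(ranking)
```

Under the stated hypothesis, the top-ranked neighbors of a query compound would be the first candidates to inherit its known indications.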

Tian Z, Yu Y, Fang H, Xie W, Guo M. Predicting microbe-drug associations with structure-enhanced contrastive learning and self-paced negative sampling strategy. Brief Bioinform 2023;24:7009077. [PMID: 36715986 DOI: 10.1093/bib/bbac634] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 11/08/2022] [Revised: 12/19/2022] [Accepted: 12/29/2022] [Indexed: 01/31/2023] Open

Abstract

MOTIVATION

Predicting the associations between human microbes and drugs (MDAs) is a critical step in drug development and precision medicine. Since discovering these associations through wet-lab experiments is time-consuming and labor-intensive, computational methods have become an effective way to tackle this problem. Recently, graph contrastive learning (GCL) approaches have shown great advantages in learning the embeddings of nodes from heterogeneous biological graphs (HBGs). However, most GCL-based approaches don't fully capture the rich structure information in HBGs. Besides, few MDA prediction methods can screen out the most informative negative samples for effectively training the classifier. Therefore, the accuracy of MDA predictions still needs to be improved.

RESULTS

In this study, we propose a novel approach that employs the Structure-enhanced Contrastive learning and Self-paced negative sampling strategy for Microbe-Drug Association predictions (SCSMDA). Firstly, SCSMDA constructs the similarity networks of microbes and drugs, as well as their different meta-path-induced networks. Then SCSMDA employs the representations of microbes and drugs learned from meta-path-induced networks to enhance their embeddings learned from the similarity networks by the contrastive learning strategy. After that, we adopt the self-paced negative sampling strategy to select the most informative negative samples to train the MLP classifier. Lastly, SCSMDA predicts the potential microbe-drug associations with the trained MLP classifier. The embeddings of microbes and drugs learned from the similarity networks are enhanced with the contrastive learning strategy, which yields discriminative representations. Extensive results on three public datasets indicate that SCSMDA significantly outperforms other baseline methods on the MDA prediction task. Case studies for two common drugs further demonstrate the effectiveness of SCSMDA in finding novel MDAs.

AVAILABILITY

The source code is publicly available on GitHub https://github.com/Yue-Yuu/SCSMDA-master.


Wang C, Yuan C, Wang Y, Chen R, Shi Y, Patti GJ, Hou Q. Genome-scale enzymatic reaction prediction by variational graph autoencoders. bioRxiv 2023:2023.03.08.531729. [PMID: 36945484 PMCID: PMC10028866 DOI: 10.1101/2023.03.08.531729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/14/2023]

Abstract

Background

Enzymatic reaction networks are crucial for exploring the mechanistic function of metabolites and proteins in biological systems and for understanding the etiology of diseases and potential targets for drug discovery. The increasing number of known metabolic reactions allows the development of deep learning-based methods to discover new enzymatic reactions, which will expand the landscape of existing enzymatic reaction networks for investigating disrupted metabolism in diseases.

Results

In this study, we propose the MPI-VGAE framework to predict metabolite-protein interactions (MPI) in a genome-scale heterogeneous enzymatic reaction network across ten organisms with thousands of enzymatic reactions. We improved the Variational Graph Autoencoders (VGAE) model to incorporate both the molecular features of metabolites and proteins and their neighboring features to achieve the best predictive performance for MPI. The MPI-VGAE framework showed robust performance in the reconstruction of hundreds of metabolic pathways and five functional enzymatic reaction networks. The MPI-VGAE framework was also applied to a homogeneous metabolic reaction network and performed as well as other state-of-the-art methods. Furthermore, the MPI-VGAE framework could be applied to reconstruct disease-specific MPI networks based on hundreds of disrupted metabolites and proteins in Alzheimer's disease and colorectal cancer. A substantial number of new potential enzymatic reactions were predicted and validated by molecular docking. These results highlight the potential of the MPI-VGAE framework for the discovery of novel disease-related enzymatic reactions and drug targets in real-world applications.
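For readers unfamiliar with the base model, the sketch below shows the three building blocks of a standard variational graph autoencoder: reparameterized sampling of node embeddings, the inner-product edge decoder, and the KL regularizer. It is only an illustration of the generic VGAE, not of the authors' modified MPI-VGAE; the encoder is omitted, and all function names and inputs are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_latent(mu, log_sigma, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_sigma)]

def edge_probability(z_i, z_j):
    """Inner-product decoder: p(edge between i and j) = sigmoid(z_i . z_j)."""
    return sigmoid(sum(a * b for a, b in zip(z_i, z_j)))

def kl_to_standard_normal(mu, log_sigma):
    """KL divergence from N(mu, sigma^2) to N(0, I) for one node embedding."""
    return -0.5 * sum(
        1.0 + 2.0 * s - m * m - math.exp(2.0 * s) for m, s in zip(mu, log_sigma)
    )

# Hypothetical encoder outputs for one metabolite node and one protein node.
rng = random.Random(0)
z_metabolite = sample_latent([0.8, -0.2], [-1.0, -1.0], rng)
z_protein = sample_latent([0.7, -0.1], [-1.0, -1.0], rng)
p_interaction = edge_probability(z_metabolite, z_protein)
```

In a full model, `mu` and `log_sigma` would come from a graph convolutional encoder over node features, and training would combine the edge reconstruction loss with the KL term.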

Data availability and implementation

The MPI-VGAE framework and datasets are publicly accessible on GitHub https://github.com/mmetalab/mpi-vgae.

Author Biographies

Cheng Wang received his Ph.D. in Chemistry from The Ohio State University, USA. He is currently an Assistant Professor in the School of Public Health at Shandong University, China. His research interests include bioinformatics and machine learning-based approaches with applications to biomedical networks. Chuang Yuan is a research assistant at Shandong University. He obtained his MS degree in Biology at the University of Science and Technology of China. His research interests include biochemistry and molecular biology, cell biology, biomedicine, bioinformatics, and computational biology. Yahui Wang is a PhD student in the Department of Chemistry at Washington University in St. Louis. Her research interests include biochemistry, mass spectrometry-based metabolomics, and cancer metabolism. Ranran Chen is a master's student in the School of Public Health at Shandong University, China. Yuying Shi is a master's student in the School of Public Health at Shandong University, China. Gary J. Patti is the Michael and Tana Powell Professor at Washington University in St. Louis, where he holds appointments in the Department of Chemistry and the Department of Medicine. He is also the Senior Director of the Center for Metabolomics and Isotope Tracing at Washington University. His research interests include metabolomics, bioinformatics, high-throughput mass spectrometry, environmental health, cancer, and aging. Leyi Wei received his Ph.D. in Computer Science from Xiamen University, China. He is currently a Professor in the School of Software at Shandong University, China. His research interests include machine learning and its applications to bioinformatics. Qingzhen Hou received his Ph.D. in the Centre for Integrative Bioinformatics VU (IBIVU) from Vrije Universiteit Amsterdam, the Netherlands. Since 2020, he has served as the head of the Bioinformatics Center in the National Institute of Health Data Science of China and as an Assistant Professor in the School of Public Health, Shandong University, China. His areas of research are bioinformatics and computational biophysics.

Key points

Genome-scale heterogeneous networks of metabolite-protein interaction (MPI) based on thousands of enzymatic reactions across ten organisms were constructed semi-automatically.

An enzymatic reaction prediction method called Metabolite-Protein Interaction Variational Graph Autoencoders (MPI-VGAE) was developed and optimized to achieve higher performance than existing machine learning methods by using the molecular features of both metabolites and proteins.

MPI-VGAE is broadly useful for applications involving the reconstruction of metabolic pathways, functional enzymatic reaction networks, and homogeneous networks (e.g., metabolic reaction networks).

By applying MPI-VGAE to Alzheimer's disease and colorectal cancer, we obtained several novel, biologically meaningful disease-related protein-metabolite reactions. Moreover, we further investigated the plausible binding details of protein-metabolite interactions using molecular docking approaches, which provided useful information for understanding disease mechanisms and for drug design.


Chandak P, Huang K, Zitnik M. Building a knowledge graph to enable precision medicine. Sci Data 2023;10:67. [PMID: 36732524 PMCID: PMC9893183 DOI: 10.1038/s41597-023-01960-3] [Citation(s) in RCA: 100] [Impact Index Per Article: 50.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2022] [Accepted: 01/11/2023] [Indexed: 02/04/2023] Open

Temiz M, Bakir-Gungor B, Güner Şahan P, Coskun M. Topological feature generation for link prediction in biological networks. PeerJ 2023;11:e15313. [PMID: 37187525 PMCID: PMC10178302 DOI: 10.7717/peerj.15313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Accepted: 04/06/2023] [Indexed: 05/17/2023] Open

Jiang C, Ngo V, Chapman R, Yu Y, Liu H, Jiang G, Zong N. Deep Denoising of Raw Biomedical Knowledge Graph from COVID-19 Literature, LitCovid and Pubtator. J Med Internet Res 2022;24:e38584. [PMID: 35658098 PMCID: PMC9301549 DOI: 10.2196/38584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2022] [Revised: 05/20/2022] [Accepted: 05/30/2022] [Indexed: 12/05/2022] Open

Abstract

Background

Multiple types of biomedical associations of knowledge graphs, including COVID-19–related ones, are constructed based on co-occurring biomedical entities retrieved from recent literature. However, the applications derived from these raw graphs (eg, association predictions among genes, drugs, and diseases) have a high probability of false-positive predictions as co-occurrences in the literature do not always mean there is a true biomedical association between two entities.

Objective

Data quality plays an important role in training deep neural network models; however, most of the current work in this area has been focused on improving a model’s performance with the assumption that the preprocessed data are clean. Here, we studied how to remove noise from raw knowledge graphs with limited labeled information.

Methods

The proposed framework used generative-based deep neural networks to generate a graph that can distinguish the unknown associations in the raw training graph. Two generative adversarial network models, NetGAN and Cross-Entropy Low-rank Logits (CELL), were adopted for the edge classification (ie, link prediction), leveraging unlabeled link information based on a real knowledge graph built from LitCovid and Pubtator.

Results

The performance of link prediction, especially in the extreme case of training data versus test data at a ratio of 1:9, demonstrated that the proposed method still achieved favorable results (area under the receiver operating characteristic curve >0.8 for the synthetic data set and 0.7 for the real data set), despite the limited amount of testing data available.
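The evaluation protocol described here, a 1:9 split of edges into training and test sets followed by scoring with the area under the ROC curve, can be sketched as follows. This is a generic illustration, not the authors' code; the function names, the seed, and the pairwise AUC computation are assumptions chosen for brevity.

```python
import random

def split_edges(edges, train_parts=1, test_parts=9, seed=0):
    """Shuffle edges and split them train:test in the given ratio (1:9 by default)."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:n_train], shuffled[n_train:]

def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a true link outscores a non-link.

    Ties count as half a win. Runs in O(len(pos) * len(neg)) time,
    which is fine for a small illustration.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical graph with 20 edges: 2 go to training, 18 to testing.
toy_edges = [(i, i + 1) for i in range(20)]
train_edges, test_edges = split_edges(toy_edges)
```

An AUC of 0.5 corresponds to random ranking of candidate links, so the reported values above 0.8 and 0.7 indicate that true associations were usually ranked above unknown ones.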

Conclusions

Our preliminary findings showed the proposed framework achieved promising results for removing noise during data preprocessing of the biomedical knowledge graph, potentially improving the performance of downstream applications by providing cleaner data.
