1
Yang Y, Yu K, Gao S, Yu S, Xiong D, Qin C, Chen H, Tang J, Tang N, Zhu H. Alzheimer's disease knowledge graph enhances knowledge discovery and disease prediction. Comput Biol Med 2025; 192:110285. [PMID: 40306017 DOI: 10.1016/j.compbiomed.2025.110285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2024] [Revised: 03/26/2025] [Accepted: 04/24/2025] [Indexed: 05/02/2025]
Abstract
OBJECTIVE To construct an Alzheimer's Disease Knowledge Graph (ADKG) by extracting and integrating relationships among Alzheimer's disease (AD), genes, variants, chemicals, drugs, and other diseases from biomedical literature, aiming to identify existing treatments, potential targets, and diagnostic methods for AD. METHODS We annotated 800 PubMed abstracts (ADERC corpus) with 20,886 entities and 4,935 relationships, augmented via GPT-4. A SpERT model (SciBERT-based) trained on this data extracted relations from PubMed abstracts, supported by biomedical databases and entity linking refined via abbreviation resolution and string matching. The resulting knowledge graph was used to train embedding models that predict novel relationships. ADKG's utility was validated by integrating it with UK Biobank data for predictive modeling. RESULTS The ADKG contained 3,199,276 entity mentions and 633,733 triplets, linking more than 5,000 unique entities and capturing complex AD-related interactions. Its graph embedding models produced evidence-supported predictions, enabling testable hypotheses. In UK Biobank predictive modeling, ADKG-enhanced models achieved a higher AUROC of 0.928, compared with 0.903 without ADKG enhancement. CONCLUSION By synthesizing literature-derived insights into a computable framework, ADKG bridges molecular mechanisms to clinical phenotypes, advancing precision medicine in Alzheimer's research. Its structured data and predictive utility underscore its potential to accelerate therapeutic discovery and risk stratification.
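The abstract above does not name the embedding models trained on the graph. As a rough illustration of how an embedding model can rank candidate (head, relation, tail) triplets, the following sketch uses TransE-style scoring, chosen here only because it is a widely known method. The entity names and 2-d vectors are hypothetical toy values, not ADKG data.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance of head + relation from tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Return (name, score) pairs sorted from most to least plausible tail."""
    scored = [(name, transe_score(head, relation, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings (assumption: hand-picked for illustration only).
drug = [1.0, 0.0]
treats = [0.0, 1.0]
candidates = {
    "disease_a": [1.0, 1.0],   # equals drug + treats, so the distance is zero
    "disease_b": [3.0, -1.0],  # far from drug + treats
}

ranking = rank_tails(drug, treats, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a trained model the vectors would be learned from the graph's triplets, and highly ranked triplets that are absent from the graph would be treated as candidate hypotheses to check against evidence.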
Affiliation(s)
- Yue Yang
  - Department of Biostatistics, University of North Carolina at Chapel Hill, USA
- Shan Gao
  - Department of Mathematics and Statistics, Yunnan University, China
- Sheng Yu
  - Center for Statistics Science, Tsinghua University, China
- Di Xiong
  - Department of Mathematics, Shanghai University, China
- Chuanyang Qin
  - Department of Mathematics and Statistics, Yunnan University, China
- Huiyuan Chen
  - Department of Mathematics and Statistics, Yunnan University, China
- Jiarui Tang
  - Department of Biostatistics, University of North Carolina at Chapel Hill, USA
- Niansheng Tang
  - Department of Mathematics and Statistics, Yunnan University, China
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, USA
2
Kumar AA, Bhandary S, Hegde SG, Chatterjee J. Knowledge graph applications and multi-relation learning for drug repurposing: A scoping review. Comput Biol Chem 2025; 115:108364. [PMID: 39914071 DOI: 10.1016/j.compbiolchem.2025.108364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2024] [Revised: 01/17/2025] [Accepted: 01/23/2025] [Indexed: 02/26/2025]
Abstract
OBJECTIVE Developing novel drugs has always been expensive; hence drug repurposing has gained popularity in recent years. In this review we examine one of the most distinctive computational methods for drug repurposing: knowledge graphs. METHOD Through a literature review, we examined the application of knowledge graphs in medicine, specifically their use in drug repurposing. We also examined literature embedding methods, the integration of machine learning models, and approaches to knowledge graph completion. RESULT After filtering, 43 papers were used for analysis. The timeline, country distribution, and application areas of knowledge graphs were highlighted. General trends in the use of knowledge graphs for drug repurposing and the shortcomings of the approach were discussed. CONCLUSION This approach has gained popularity only very recently; hence it is in a nascent phase.
Affiliation(s)
- A Arun Kumar
  - Department of Biotechnology, PES University, Bangalore 560085, India
- Samarth Bhandary
  - Department of Biotechnology, PES University, Bangalore 560085, India
- Jhinuk Chatterjee
  - Department of Biotechnology, PES University, Bangalore 560085, India
3
Wang Q, Yang F, Quan L, Fu M, Yang Z, Wang J. Knowledge graph and its application in the study of neurological and mental disorders. Front Psychiatry 2025; 16:1452557. [PMID: 40171303 PMCID: PMC11958944 DOI: 10.3389/fpsyt.2025.1452557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Accepted: 02/28/2025] [Indexed: 04/03/2025] Open
Abstract
Neurological disorders (e.g., Alzheimer's disease and Parkinson's disease) and mental disorders (e.g., depression and anxiety) pose huge challenges to global public health. The pathogenesis of these diseases can usually be attributed to many factors, such as genetic and environmental factors and socioeconomic status, which makes their diagnosis and treatment difficult. As research on the diseases advances, so does the body of medical data. The accumulation of such data provides unique opportunities for the basic and clinical study of these diseases, but the vast and diverse nature of the data also makes it difficult for physicians and researchers to precisely extract the information and use it in their work. A powerful tool for extracting the necessary knowledge from large amounts of data is the knowledge graph (KG). The KG, as an organized form of information, has great potential for the study of neurological and mental disorders when it is paired with big data and deep learning technologies. In this study, we reviewed the application of KGs in common neurological and mental disorders in recent years. We also discussed the current state of medical knowledge graphs, highlighting the obstacles and constraints that still need to be overcome.
Affiliation(s)
- Qizheng Wang
  - School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
- Fan Yang
  - School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
- Lijie Quan
  - School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
- Mengjie Fu
  - School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
- Zhongli Yang
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Ju Wang
  - School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
4
Gema AP, Grabarczyk D, De Wulf W, Borole P, Alfaro JA, Minervini P, Vergari A, Rajan A. Knowledge graph embeddings in the biomedical domain: are they useful? A look at link prediction, rule learning, and downstream polypharmacy tasks. Bioinformatics Advances 2024; 4:vbae097. [PMID: 39506988 PMCID: PMC11538020 DOI: 10.1093/bioadv/vbae097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 06/05/2024] [Accepted: 07/16/2024] [Indexed: 11/08/2024]
Abstract
Summary Knowledge graphs (KGs) are powerful tools for representing and organizing complex biomedical data. They empower researchers, physicians, and scientists by facilitating rapid access to biomedical information, enabling the discernment of patterns or insights, and fostering the formulation of decisions and the generation of novel knowledge. To automate these activities, several KG embedding algorithms have been proposed to learn from and complete KGs. However, the efficacy of these embedding algorithms appears limited when applied to biomedical KGs, prompting questions about whether they can be useful in this field. To that end, we explore several widely used KG embedding models and evaluate their performance and applications using a recent biomedical KG, BioKG. We also demonstrate that by using recent best practices for training KG embeddings, it is possible to improve performance over BioKG. Additionally, we address interpretability concerns that naturally arise with such machine learning methods. In particular, we examine rule-based methods that aim to address these concerns by making interpretable predictions using learned rules, achieving comparable performance. Finally, we discuss a realistic use case where a pretrained BioKG embedding is further trained for a specific task: four polypharmacy scenarios in which the goal is to predict missing links or entities in other downstream KGs. We conclude that in the right scenarios, biomedical KG embeddings can be effective and useful. Availability and implementation Our code and data are available at https://github.com/aryopg/biokge.
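The abstract above evaluates embedding models on link prediction but does not state which metrics were reported. Mean reciprocal rank (MRR) and Hits@k are common choices for this task, so the sketch below assumes them for illustration. The candidate lists are hypothetical toy data, not BioKG results.

```python
def reciprocal_rank(ranked_candidates, true_entity):
    """Return 1/rank of the true entity, or 0.0 if it is not in the ranking."""
    for position, candidate in enumerate(ranked_candidates, start=1):
        if candidate == true_entity:
            return 1.0 / position
    return 0.0


def mrr_and_hits(queries, k):
    """Average reciprocal rank and the fraction of queries with the truth in the top k."""
    reciprocal_ranks = [reciprocal_rank(ranked, truth) for ranked, truth in queries]
    hits = [1.0 if truth in ranked[:k] else 0.0 for ranked, truth in queries]
    return sum(reciprocal_ranks) / len(queries), sum(hits) / len(queries)


# Each query: (candidates sorted by model score, the held-out true entity).
queries = [
    (["b", "a", "c"], "a"),       # true entity at rank 2
    (["a", "b", "c"], "a"),       # true entity at rank 1
    (["b", "c", "d", "a"], "a"),  # true entity at rank 4
]

mrr, hits_at_2 = mrr_and_hits(queries, k=2)
print(f"MRR={mrr:.3f} Hits@2={hits_at_2:.3f}")
```

Published evaluations often use a "filtered" variant that removes other known-true entities from each ranking before computing the rank; this sketch omits that step for brevity.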
Affiliation(s)
- Aryo Pradipta Gema
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
- Dominik Grabarczyk
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
- Wolf De Wulf
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
- Piyush Borole
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
- Javier Antonio Alfaro
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
  - International Centre for Cancer Vaccine Science, University of Gdańsk, Gdańsk 80-822, Poland
  - Department of Biochemistry and Microbiology, University of Victoria, British Columbia V8W 2Y2, Canada
- Pasquale Minervini
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
- Antonio Vergari
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
- Ajitha Rajan
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
5
Hu X, Sun Z, Nian Y, Wang Y, Dang Y, Li F, Feng J, Yu E, Tao C. Self-Explainable Graph Neural Network for Alzheimer Disease and Related Dementias Risk Prediction: Algorithm Development and Validation Study. JMIR Aging 2024; 7:e54748. [PMID: 38976869 PMCID: PMC11263893 DOI: 10.2196/54748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 03/31/2024] [Accepted: 06/02/2024] [Indexed: 07/10/2024] Open
Abstract
BACKGROUND Alzheimer disease and related dementias (ADRD) rank as the sixth leading cause of death in the United States, underlining the importance of accurate ADRD risk prediction. While recent advancements in ADRD risk prediction have primarily relied on imaging analysis, not all patients undergo medical imaging before an ADRD diagnosis. Merging machine learning with claims data can reveal additional risk factors and uncover interconnections among diverse medical codes. OBJECTIVE The study aims to use graph neural networks (GNNs) with claims data for ADRD risk prediction. Addressing the lack of human-interpretable reasons behind these predictions, we introduce an innovative, self-explainable method to evaluate relationship importance and its influence on ADRD risk prediction. METHODS We used a variationally regularized encoder-decoder GNN (variational GNN [VGNN]) integrated with our proposed relation importance method for estimating ADRD likelihood. This self-explainable method can provide a feature-importance explanation in the context of ADRD risk prediction, leveraging relational information within a graph. Three scenarios with 1-year, 2-year, and 3-year prediction windows were created to assess the model's efficiency. Random forest (RF) and light gradient boosting machine (LGBM) were used as baselines. By using this method, we further clarify the key relationships for ADRD risk prediction. RESULTS In scenario 1, the VGNN model showed area under the receiver operating characteristic curve (AUROC) scores of 0.7272 and 0.7480 for the small subset and the matched cohort data set, respectively. It outperformed RF and LGBM by 10.6% and 9.1%, respectively, on average. In scenario 2, it achieved AUROC scores of 0.7125 and 0.7281, surpassing the other models by 10.5% and 8.9%, respectively. Similarly, in scenario 3, AUROC scores of 0.7001 and 0.7187 were obtained, exceeding the baseline models by 10.1% and 8.5%, respectively. These results clearly demonstrate the significant superiority of the graph-based approach over the tree-based models (RF and LGBM) in predicting ADRD. Furthermore, the integration of the VGNN model and our relation importance interpretation could provide valuable insight into paired factors that may contribute to or delay ADRD progression. CONCLUSIONS Using our innovative self-explainable method with claims data enhances ADRD risk prediction and provides insights into the impact of interconnected medical code relationships. This methodology not only enables ADRD risk modeling but also shows potential for other image analysis predictions using claims data.
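The results above are reported as AUROC scores and percentage gains over baselines. The abstract does not say whether those gains are relative or absolute, so the sketch below shows a relative gain only as one possible reading. AUROC is computed here directly from its pairwise definition (the probability that a random positive is scored above a random negative, with ties counting one half). The labels and scores are hypothetical toy values, not data from the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(positives) * len(negatives))


def relative_gain(model_auc, baseline_auc):
    """Relative improvement of one AUROC over another (assumed interpretation)."""
    return (model_auc - baseline_auc) / baseline_auc


# Hypothetical toy predictions: 1 = case, 0 = control.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.7, 0.1]

toy_auc = auroc(labels, scores)
print(f"AUROC={toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of samples; for large cohorts a rank-based formulation or a library routine would be the usual choice.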
Affiliation(s)
- Xinyue Hu
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Zenan Sun
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Yi Nian
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Yichen Wang
  - Division of Hospital Medicine at Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, United States
- Yifang Dang
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Fang Li
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Jingna Feng
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Evan Yu
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Cui Tao
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, United States
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
6
Yang Y, Yu K, Gao S, Yu S, Xiong D, Qin C, Chen H, Tang J, Tang N, Zhu H. Alzheimer's Disease Knowledge Graph Enhances Knowledge Discovery and Disease Prediction. bioRxiv 2024:2024.07.03.601339. [PMID: 39005357 PMCID: PMC11245034 DOI: 10.1101/2024.07.03.601339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Background Alzheimer's disease (AD), a progressive neurodegenerative disorder, continues to increase in prevalence without any effective treatments to date. In this context, knowledge graphs (KGs) have emerged as a pivotal tool in biomedical research, offering new perspectives on drug repurposing and biomarker discovery by analyzing intricate network structures. Our study seeks to build an AD-specific knowledge graph, highlighting interactions among AD, genes, variants, chemicals, drugs, and other diseases. The goal is to shed light on existing treatments, potential targets, and diagnostic methods for AD, thereby aiding in drug repurposing and the identification of biomarkers. Results We annotated 800 PubMed abstracts and leveraged GPT-4 for text augmentation to enrich our training data for named entity recognition (NER) and relation classification. A comprehensive data mining model, integrating NER and relation classification, was trained on the annotated corpus. This model was subsequently applied to extract relation triplets from unannotated abstracts. To enhance entity linking, we used a suite of reference biomedical databases and refined the linking accuracy through abbreviation resolution. As a result, we identified 3,199,276 entity mentions and 633,733 triplets, elucidating connections between 5,000 unique entities. These connections were pivotal in constructing a comprehensive Alzheimer's Disease Knowledge Graph (ADKG). We also integrated the ADKG constructed after entity linking with other biomedical databases. The ADKG served as training data for knowledge graph embedding models, and the high-ranking predicted triplets were supported by evidence, underscoring the utility of ADKG in generating testable scientific hypotheses. Further application of ADKG in predictive modeling using the UK Biobank data revealed models based on ADKG outperforming others, as evidenced by higher values in the areas under the receiver operating characteristic (ROC) curves. Conclusion The ADKG is a valuable resource for generating hypotheses and enhancing predictive models, highlighting its potential to advance AD research and treatment strategies.
Affiliation(s)
- Yue Yang
  - Department of Biostatistics, University of North Carolina at Chapel Hill
- Kaixian Yu
  - Independent Researcher, Shanghai, P.R. China
- Shan Gao
  - Department of Mathematics and Statistics, Yunnan University
- Sheng Yu
  - Center for Statistics Science, Tsinghua University
- Di Xiong
  - Department of Statistics, Shanghai University
- Chuanyang Qin
  - Department of Mathematics and Statistics, Yunnan University
- Huiyuan Chen
  - Department of Mathematics and Statistics, Yunnan University
- Jiarui Tang
  - Department of Biostatistics, University of North Carolina at Chapel Hill
- Niansheng Tang
  - Department of Mathematics and Statistics, Yunnan University
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina at Chapel Hill
7
Romano JD, Truong V, Kumar R, Venkatesan M, Graham BE, Hao Y, Matsumoto N, Li X, Wang Z, Ritchie MD, Shen L, Moore JH. The Alzheimer's Knowledge Base: A Knowledge Graph for Alzheimer Disease Research. J Med Internet Res 2024; 26:e46777. [PMID: 38635981 PMCID: PMC11066745 DOI: 10.2196/46777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 06/23/2023] [Accepted: 11/07/2023] [Indexed: 04/20/2024] Open
Abstract
BACKGROUND As global populations age and become susceptible to neurodegenerative illnesses, new therapies for Alzheimer disease (AD) are urgently needed. Existing data resources for drug discovery and repurposing fail to capture relationships central to the disease's etiology and response to drugs. OBJECTIVE We designed the Alzheimer's Knowledge Base (AlzKB) to alleviate this need by providing a comprehensive knowledge representation of AD etiology and candidate therapeutics. METHODS We designed the AlzKB as a large, heterogeneous graph knowledge base assembled using 22 diverse external data sources describing biological and pharmaceutical entities at different levels of organization (eg, chemicals, genes, anatomy, and diseases). AlzKB uses a Web Ontology Language 2 ontology to enforce semantic consistency and allow for ontological inference. We provide a public version of AlzKB and allow users to run and modify local versions of the knowledge base. RESULTS AlzKB is freely available on the web and currently contains 118,902 entities with 1,309,527 relationships between those entities. To demonstrate its value, we used graph data science and machine learning to (1) propose new therapeutic targets based on similarities of AD to Parkinson disease and (2) repurpose existing drugs that may treat AD. For each use case, AlzKB recovers known therapeutic associations while proposing biologically plausible new ones. CONCLUSIONS AlzKB is a new, publicly available knowledge resource that enables researchers to discover complex translational associations for AD drug discovery. Through 2 use cases, we show that it is a valuable tool for proposing novel therapeutic hypotheses based on public biomedical knowledge.
Affiliation(s)
- Joseph D Romano
  - Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Center of Excellence in Environmental Toxicology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Van Truong
  - Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Rachit Kumar
  - Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Medical Scientist Training Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Mythreye Venkatesan
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Britney E Graham
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Yun Hao
  - Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Nick Matsumoto
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Xi Li
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Zhiping Wang
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Marylyn D Ritchie
  - Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Li Shen
  - Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Jason H Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
8
Ghorbanali Z, Zare-Mirakabad F, Salehi N, Akbari M, Masoudi-Nejad A. DrugRep-HeSiaGraph: when heterogenous siamese neural network meets knowledge graphs for drug repurposing. BMC Bioinformatics 2023; 24:374. [PMID: 37789314 PMCID: PMC10548718 DOI: 10.1186/s12859-023-05479-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/04/2023] [Accepted: 09/12/2023] [Indexed: 10/05/2023] Open
Abstract
BACKGROUND Drug repurposing is an approach that holds promise for identifying new therapeutic uses for existing drugs. Recently, knowledge graphs have emerged as significant tools for addressing the challenges of drug repurposing. However, there are still major issues with constructing and embedding knowledge graphs. RESULTS This study proposes a two-step method called DrugRep-HeSiaGraph to address these challenges. The method integrates the drug-disease knowledge graph with the application of a heterogeneous siamese neural network. In the first step, a drug-disease knowledge graph named DDKG-V1 is constructed by defining new relationship types, and then numerical vector representations for the nodes are created using the distributional learning method. In the second step, a heterogeneous siamese neural network called HeSiaNet is applied to enrich the embedding of drugs and diseases by bringing them closer in a new unified latent space. Then, it predicts potential drug candidates for diseases. DrugRep-HeSiaGraph achieves an AUC-ROC of 91.16%, an AUC-PR of 90.32%, an accuracy of 84.63%, a BS of 0.119, and an MCC of 69.31%. CONCLUSION We demonstrate the effectiveness of the proposed method in identifying potential drugs for COVID-19 as a case study. In addition, this study shows the role of dipeptidyl peptidase 4 (DPP-4) as a potential receptor for SARS-CoV-2 and the effectiveness of DPP-4 inhibitors against COVID-19. This highlights the practical application of the model in addressing real-world challenges in the field of drug repurposing. The code and data for DrugRep-HeSiaGraph are publicly available at https://github.com/CBRC-lab/DrugRep-HeSiaGraph.
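Two of the metrics reported above are less familiar than accuracy or AUC. Assuming that "BS" denotes the Brier score (the abstract does not expand the abbreviation) and that "MCC" is the Matthews correlation coefficient, the sketch below shows how both are computed for a binary classifier. The labels, probabilities, and 0.5 threshold are hypothetical toy choices, not values from the study.

```python
import math


def brier_score(labels, probabilities):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)


def matthews_corrcoef(labels, predictions):
    """MCC from the binary confusion matrix; returns 0.0 when it is undefined."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


# Hypothetical toy data: 1 = known drug-disease association, 0 = no association.
labels = [1, 1, 1, 0, 0, 0]
probabilities = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]  # assumed 0.5 threshold

toy_brier = brier_score(labels, probabilities)
toy_mcc = matthews_corrcoef(labels, predictions)
print(f"Brier={toy_brier:.3f} MCC={toy_mcc:.3f}")
```

A lower Brier score indicates better-calibrated probabilities, while MCC ranges from -1 to 1 and accounts for all four confusion-matrix cells, which makes it more informative than accuracy on imbalanced data.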
Affiliation(s)
- Zahra Ghorbanali
  - Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Fatemeh Zare-Mirakabad
  - Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Najmeh Salehi
  - School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Mohammad Akbari
  - Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
9
Li F, Nian Y, Sun Z, Tao C. Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions. Yearb Med Inform 2023; 32:215-224. [PMID: 38147863 PMCID: PMC10751115 DOI: 10.1055/s-0043-1768735] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/28/2023] Open
Abstract
OBJECTIVES Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research. METHODS We conducted a comprehensive search of multiple databases, including PubMed, Web of Science, IEEE Xplore, and Google Scholar, to collect relevant publications from the past two years (2021-2022). Studies were selected for review based on their relevance to the topic and their publication quality. RESULTS A total of 78 articles were included in our analysis. We identified three main categories of GRL methods and summarized their methodological foundations and notable models. In terms of GRL applications, we focused on two main topics: drug and disease. We analyzed the study frameworks and achievements of the prominent research. Based on the current state of the art, we discussed the challenges and future directions. CONCLUSIONS GRL methods applied in the biomedical field demonstrated several key characteristics, including the use of attention mechanisms to prioritize relevant features, a growing emphasis on model interpretability, and the combination of various techniques to improve model performance. Challenges that still need to be addressed include mitigating model bias, accommodating the heterogeneity of large-scale knowledge graphs, and improving the availability of high-quality graph data. To fully leverage the potential of GRL, future efforts should prioritize these areas of research.
Affiliation(s)
- Fang Li
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Yi Nian
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Zenan Sun
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA
- Cui Tao
  - McWilliams School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX, USA