1
Wang C, Yang Y, Song J, Nan X. Research Progresses and Applications of Knowledge Graph Embedding Technique in Chemistry. J Chem Inf Model 2024. [PMID: 39302256 DOI: 10.1021/acs.jcim.4c00791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
A knowledge graph (KG) is a technique for modeling entities and their interrelations. Knowledge graph embedding (KGE) translates these entities and relationships into a continuous vector space to facilitate dense and efficient representations. In the domain of chemistry, applying KG and KGE techniques integrates heterogeneous chemical information into a coherent and user-friendly framework, enhances the representation of chemical data features, and is beneficial for downstream tasks, such as chemical property prediction. This paper begins with a comprehensive review of classical and contemporary KGE methodologies, including distance-based models, semantic matching models, and neural network-based approaches. We then catalogue the primary databases employed in chemistry and biochemistry that furnish the KGs with essential chemical data. Subsequently, we explore the latest applications of KG and KGE in chemistry, focusing on risk assessment, property prediction, and drug discovery. Finally, we discuss the current challenges to KG and KGE techniques and provide a perspective on their potential future developments.
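To make the idea of a distance-based KGE model concrete, the following is a minimal sketch of TransE-style scoring, in which a triple (head, relation, tail) is considered plausible when head + relation lies close to tail in the vector space. It is an illustration only and is not taken from the review: the entity names, relation name, and 2-dimensional vectors are hypothetical toy values, and real models learn the embeddings from data.

```python
import math

# Toy 2-d embeddings; all names and values are hypothetical.
ENTITIES = {
    "aspirin": [0.1, 0.2],
    "COX1": [0.5, 0.1],
    "caffeine": [0.9, 0.9],
}
RELATIONS = {"inhibits": [0.4, -0.1]}


def transe_score(head, relation, tail):
    """Return -||h + r - t||_2; values nearer 0 mean a more plausible triple."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head_name, relation_name):
    """Rank every entity as a candidate tail for (head, relation, ?)."""
    h, r = ENTITIES[head_name], RELATIONS[relation_name]
    scored = [(name, transe_score(h, r, t)) for name, t in ENTITIES.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_tails("aspirin", "inhibits"):
        print(f"{name}: {score:.3f}")
```

Ranking candidate tails in this way is the usual basis for link prediction with a trained embedding; semantic matching and neural models replace the distance with a different scoring function.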
Affiliation(s)
- Chuanghui Wang
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
- Yunqing Yang
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
- Jinshuai Song
  - Green Catalysis Center, College of Chemistry, Zhengzhou University, Zhengzhou 450001, China
- Xiaofei Nan
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
2
Hauben M, Rafi M, Abdelaziz I, Hassanzadeh O. Knowledge Graphs in Pharmacovigilance: A Scoping Review. Clin Ther 2024; 46:544-554. [PMID: 38981792 DOI: 10.1016/j.clinthera.2024.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 05/08/2024] [Accepted: 06/05/2024] [Indexed: 07/11/2024]
Abstract
PURPOSE To critically assess the role and added value of knowledge graphs in pharmacovigilance, focusing on their ability to predict adverse drug reactions. METHODS A systematic scoping review was conducted in which detailed information, including objectives, technology, data sources, methodology, and performance metrics, was extracted from a set of peer-reviewed publications reporting the use of knowledge graphs to support pharmacovigilance signal detection. FINDINGS The review, which included 47 peer-reviewed articles, found that knowledge graphs were utilized for detecting/predicting single-drug adverse reactions and drug-drug interactions, with variable reported performance and sparse comparisons to legacy methods. IMPLICATIONS Research to date suggests that knowledge graphs have the potential to augment predictive signal detection in pharmacovigilance, but further research using more reliable reference sets of adverse drug reactions and comparison with legacy pharmacovigilance methods are needed to more clearly define best practices and to establish their place in holistic pharmacovigilance systems.
Affiliation(s)
- Manfred Hauben
  - Department of Family and Community Medicine, New York Medical College, Valhalla, New York; Truliant Consulting, Baltimore, Maryland
- Mazin Rafi
  - Department of Statistics, Rutgers University, Piscataway, New Jersey
3
Sawada T, Narukawa M. A Systematic Review of Treatment-Related Adverse Events for Combination Therapy of Multiple Tyrosine Kinase Inhibitor and Immune Checkpoint Inhibitor. Cancer Control 2024; 31:10732748241244586. [PMID: 38581169 PMCID: PMC10998490 DOI: 10.1177/10732748241244586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 02/06/2024] [Accepted: 03/06/2024] [Indexed: 04/08/2024] Open
Abstract
BACKGROUND Combination therapy with multiple tyrosine kinase inhibitors (multi-TKIs) and immune checkpoint inhibitors (ICIs) has been increasingly tested in clinical studies. This study aimed to investigate the effect of the addition of ICI to multi-TKIs on the profile of treatment-related adverse events. METHODS An electronic database search was performed using PubMed and Web of Science to identify published clinical studies on multi-TKI monotherapy and multi-TKI plus ICI combination therapy from July 20, 2005 to July 1, 2023. The incidence rate of common adverse events caused by multi-TKI monotherapy and multi-TKI plus ICI combination therapy was obtained and compared from the viewpoints of (1) relative risk for the combination therapy vs sunitinib, (2) adverse event incidence rate by clinical trial, and (3) pooled incidence rate. The quality of the evidence was assessed with the Cochrane risk of bias tool. Meta-analysis used random effects models. RESULTS This systematic review identified 83 clinical studies involving 7951 patients. The combination therapy of multi-TKI and ICI was associated with an increased risk of diarrhea (relative risk [RR]: 1.24, 95% confidence interval [CI]: 1.15-1.33, P < .001), hypothyroidism (RR: 1.44, 95% CI: 1.11-1.87, P = .0064) and rash (RR: 1.71, 95% CI: 1.18-2.47, P = .0045) compared with multi-TKI monotherapy. The addition of ICI was suggested to decrease the risk of adverse events related to performance status. CONCLUSION Our study identified an increased risk of treatment-related adverse events associated with multi-TKI plus ICI combination therapy. This would help optimize the management of toxicities caused by multi-TKI plus ICI combination therapy.
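For readers unfamiliar with the statistics quoted above, the sketch below shows how a relative risk and its 95% confidence interval can be computed for a single two-arm comparison, using the standard normal approximation on the log scale. It is not the study's analysis: the study pooled many trials with random effects models, which this sketch does not do, and the event counts in the usage example are hypothetical.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of arm A vs arm B with a Wald interval on log(RR).

    Assumes both event counts are non-zero; z=1.96 gives a 95% interval.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial arms.
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 30/100 events on combination, 20/100 on monotherapy.
    rr, lower, upper = relative_risk(30, 100, 20, 100)
    print(f"RR = {rr:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as for the diarrhea, hypothyroidism, and rash estimates reported in the abstract, indicates a difference in risk that is statistically significant at the 5% level.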
Affiliation(s)
- Takashi Sawada
  - Department of Clinical Medicine (Pharmaceutical Medicine), Graduate School of Pharmaceutical Sciences, Kitasato University, Minato-ku, Japan
- Mamoru Narukawa
  - Department of Clinical Medicine (Pharmaceutical Medicine), Graduate School of Pharmaceutical Sciences, Kitasato University, Minato-ku, Japan
4
Xuan P, Xu K, Cui H, Nakaguchi T, Zhang T. Graph generative and adversarial strategy-enhanced node feature learning and self-calibrated pairwise attribute encoding for prediction of drug-related side effects. Front Pharmacol 2023; 14:1257842. [PMID: 37731739 PMCID: PMC10507253 DOI: 10.3389/fphar.2023.1257842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 08/17/2023] [Indexed: 09/22/2023] Open
Abstract
Background: Inferring drug-related side effects is beneficial for reducing drug development cost and time. Current computational prediction methods have concentrated on graph reasoning over heterogeneous graphs comprising the drug and side effect nodes. However, the various topologies and node attributes within multiple drug-side effect heterogeneous graphs have not been completely exploited. Methods: We proposed a new drug-side effect association prediction method, GGSC, to deeply integrate the diverse topologies and attributes from multiple heterogeneous graphs and the self-calibration attributes of each drug-side effect node pair. First, we created two heterogeneous graphs comprising the drug and side effect nodes and their related similarity and association connections. Since each heterogeneous graph has its specific topology and node attributes, a node feature learning strategy was designed and the learning for each graph was enhanced from a graph generative and adversarial perspective. We constructed a generator based on a graph convolutional autoencoder to encode the topological structure and node attributes from the whole heterogeneous graph and then generate the node features embedding the graph topology. A discriminator based on a multilayer perceptron was designed to distinguish the generated topological features from the original ones. We also designed representation-level attention to discriminate the contributions of topological representations from multiple heterogeneous graphs and adaptively fused them. Finally, we constructed a self-calibration module based on convolutional neural networks to guide pairwise attribute learning through the features of the small latent space. Results: The comparison experiment results showed that GGSC had higher prediction performance than several state-of-the-art prediction methods. The ablation experiments demonstrated the effectiveness of topological enhancement learning, representation-level attention, and self-calibrated pairwise attribute learning. In addition, case studies over five drugs demonstrated GGSC's ability to discover potential drug-related side effect candidates. Conclusion: We proposed a drug-side effect association prediction method, and the method is beneficial for screening reliable association candidates so that biologists can discover actual associations.
Affiliation(s)
- Ping Xuan
  - Department of Computer Science, School of Engineering, Shantou University, Shantou, China
- Kai Xu
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, VI, Australia
- Toshiya Nakaguchi
  - Center for Frontier Medical Engineering, Chiba University, Chiba, Japan
- Tiangang Zhang
  - School of Computer Science and Technology, Heilongjiang University, Harbin, China
  - School of Mathematical Science, Heilongjiang University, Harbin, China
5
Krix S, DeLong LN, Madan S, Domingo-Fernández D, Ahmad A, Gul S, Zaliani A, Fröhlich H. MultiGML: Multimodal graph machine learning for prediction of adverse drug events. Heliyon 2023; 9:e19441. [PMID: 37681175 PMCID: PMC10481305 DOI: 10.1016/j.heliyon.2023.e19441] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Revised: 08/22/2023] [Accepted: 08/23/2023] [Indexed: 09/09/2023] Open
Abstract
Adverse drug events constitute a major challenge for the success of clinical trials. Several computational strategies have been suggested to estimate the risk of adverse drug events in preclinical drug development. While these approaches have demonstrated high utility in practice, they are at the same time limited to specific information sources. Thus, many current computational approaches neglect a wealth of information which results from the integration of different data sources, such as biological protein function, gene expression, chemical compound structure, cell-based imaging and others. In this work we propose an integrative and explainable multi-modal Graph Machine Learning approach (MultiGML), which fuses knowledge graphs with multiple further data modalities to predict drug related adverse events and general drug target-phenotype associations. MultiGML demonstrates excellent prediction performance compared to alternative algorithms, including various traditional knowledge graph embedding techniques. MultiGML distinguishes itself from alternative techniques by providing in-depth explanations of model predictions, which point towards biological mechanisms associated with predictions of an adverse drug event. Hence, MultiGML could be a versatile tool to support decision making in preclinical drug development.
Affiliation(s)
- Sophia Krix
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
  - Fraunhofer Center for Machine Learning, Germany
- Lauren Nicole DeLong
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Artificial Intelligence and its Applications Institute, School of Informatics, University of Edinburgh, 10 Crichton Street, EH8 9AB, UK
- Sumit Madan
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Department of Computer Science, University of Bonn, 53115, Bonn, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Fraunhofer Center for Machine Learning, Germany
  - Enveda Biosciences, Boulder, CO, 80301, USA
- Ashar Ahmad
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
  - Grunenthal GmbH, 52099, Aachen, Germany
- Sheraz Gul
  - Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
  - Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
- Andrea Zaliani
  - Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
  - Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
- Holger Fröhlich
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
6
Evangelista JE, Clarke DJB, Xie Z, Marino GB, Utti V, Jenkins SL, Ahooyi TM, Bologa CG, Yang JJ, Binder JL, Kumar P, Lambert CG, Grethe JS, Wenger E, Taylor D, Oprea TI, de Bono B, Ma'ayan A. Toxicology knowledge graph for structural birth defects. COMMUNICATIONS MEDICINE 2023; 3:98. [PMID: 37460679 PMCID: PMC10352311 DOI: 10.1038/s43856-023-00329-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/20/2022] [Accepted: 06/29/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND Birth defects are functional and structural abnormalities that impact about 1 in 33 births in the United States. They have been attributed to genetic and other factors such as drugs, cosmetics, food, and environmental pollutants during pregnancy, but for most birth defects there are no known causes. METHODS To further characterize associations between small molecule compounds and their potential to induce specific birth abnormalities, we gathered knowledge from multiple sources to construct a reproductive toxicity Knowledge Graph (ReproTox-KG) with a focus on associations between birth defects, drugs, and genes. Specifically, we gathered data from drug/birth-defect associations from co-mentions in published abstracts, gene/birth-defect associations from genetic studies, drug- and preclinical-compound-induced gene expression changes in cell lines, known drug targets, genetic burden scores for human genes, and placental crossing scores for small molecules. RESULTS Using ReproTox-KG and semi-supervised learning (SSL), we scored >30,000 preclinical small molecules for their potential to cross the placenta and induce birth defects, and identified >500 birth-defect/gene/drug cliques that can be used to explain molecular mechanisms for drug-induced birth defects. The ReproTox-KG can be accessed via a web-based user interface available at https://maayanlab.cloud/reprotox-kg . This site enables users to explore the associations between birth defects, approved and preclinical drugs, and all human genes. CONCLUSIONS ReproTox-KG provides a resource for exploring knowledge about the molecular mechanisms of birth defects with the potential of predicting the likelihood of genes and preclinical small molecules to induce birth defects.
Affiliation(s)
- John Erol Evangelista
  - Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Daniel J B Clarke
  - Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Zhuorui Xie
  - Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Giacomo B Marino
  - Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Vivian Utti
  - Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Sherry L Jenkins
  - Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Taha Mohseni Ahooyi
  - The Children's Hospital of Philadelphia, Department of Biomedical and Health Informatics; Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Cristian G Bologa
  - Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Jeremy J Yang
  - Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Jessica L Binder
  - Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Praveen Kumar
  - Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Christophe G Lambert
  - Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Jeffrey S Grethe
  - Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Eric Wenger
  - The Children's Hospital of Philadelphia, Department of Biomedical and Health Informatics; Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Deanne Taylor
  - The Children's Hospital of Philadelphia, Department of Biomedical and Health Informatics; Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Tudor I Oprea
  - Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Bernard de Bono
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Avi Ma'ayan
  - Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
7
Aldughayfiq B, Ashfaq F, Jhanjhi NZ, Humayun M. Capturing Semantic Relationships in Electronic Health Records Using Knowledge Graphs: An Implementation Using MIMIC III Dataset and GraphDB. Healthcare (Basel) 2023; 11:1762. [PMID: 37372880 DOI: 10.3390/healthcare11121762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Revised: 06/03/2023] [Accepted: 06/12/2023] [Indexed: 06/29/2023] Open
Abstract
Electronic health records (EHRs) are an increasingly important source of information for healthcare professionals and researchers. However, EHRs are often fragmented, unstructured, and difficult to analyze due to the heterogeneity of the data sources and the sheer volume of information. Knowledge graphs have emerged as a powerful tool for capturing and representing complex relationships within large datasets. In this study, we explore the use of knowledge graphs to capture and represent complex relationships within EHRs. Specifically, we address the following research question: Can a knowledge graph created using the MIMIC III dataset and GraphDB effectively capture semantic relationships within EHRs and enable more efficient and accurate data analysis? We map the MIMIC III dataset to an ontology using text refinement and Protege; then, we create a knowledge graph using GraphDB and use SPARQL queries to retrieve and analyze information from the graph. Our results demonstrate that knowledge graphs can effectively capture semantic relationships within EHRs, enabling more efficient and accurate data analysis. We provide examples of how our implementation can be used to analyze patient outcomes and identify potential risk factors. Our results demonstrate that knowledge graphs are an effective tool for capturing semantic relationships within EHRs, enabling a more efficient and accurate data analysis. Our implementation provides valuable insights into patient outcomes and potential risk factors, contributing to the growing body of literature on the use of knowledge graphs in healthcare. In particular, our study highlights the potential of knowledge graphs to support decision-making and improve patient outcomes by enabling a more comprehensive and holistic analysis of EHR data. Overall, our research contributes to a better understanding of the value of knowledge graphs in healthcare and lays the foundation for further research in this area.
Affiliation(s)
- Bader Aldughayfiq
  - Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
- Farzeen Ashfaq
  - School of Computer Science-SCS, Taylor's University, Subang Jaya 47500, Malaysia
- N Z Jhanjhi
  - School of Computer Science-SCS, Taylor's University, Subang Jaya 47500, Malaysia
- Mamoona Humayun
  - Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
8
Jackson DB, Racz R, Kim S, Brock S, Burkhart K. Rewiring Drug Research and Development through Human Data-Driven Discovery (HD3). Pharmaceutics 2023; 15:1673. [PMID: 37376121 PMCID: PMC10303279 DOI: 10.3390/pharmaceutics15061673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 05/29/2023] [Accepted: 06/02/2023] [Indexed: 06/29/2023] Open
Abstract
In an era of unparalleled technical advancement, the pharmaceutical industry is struggling to transform data into increased research and development efficiency, and, as a corollary, new drugs for patients. Here, we briefly review some of the commonly discussed issues around this counterintuitive innovation crisis. Looking at both industry- and science-related factors, we posit that traditional preclinical research is front-loading the development pipeline with data and drug candidates that are unlikely to succeed in patients. Applying a first principles analysis, we highlight the critical culprits and provide suggestions as to how these issues can be rectified through the pursuit of a Human Data-driven Discovery (HD3) paradigm. Consistent with other examples of disruptive innovation, we propose that new levels of success are not dependent on new inventions, but rather on the strategic integration of existing data and technology assets. In support of these suggestions, we highlight the power of HD3, through recently published proof-of-concept applications in the areas of drug safety analysis and prediction, drug repositioning, the rational design of combination therapies and the global response to the COVID-19 pandemic. We conclude that innovators must play a key role in expediting the path to a largely human-focused, systems-based approach to drug discovery and research.
Affiliation(s)
- Rebecca Racz
  - Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20993, USA
- Sarah Kim
  - Department of Pharmaceutics, Center for Pharmacometrics and Systems Pharmacology, College of Pharmacy, University of Florida, Orlando, FL 32827, USA
- Keith Burkhart
  - Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20993, USA
9
Murali L, Gopakumar G, Viswanathan DM, Nedungadi P. Towards electronic health record-based medical knowledge graph construction, completion, and applications: A literature study. J Biomed Inform 2023:104403. [PMID: 37230406 DOI: 10.1016/j.jbi.2023.104403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 05/16/2023] [Accepted: 05/19/2023] [Indexed: 05/27/2023]
Abstract
With the growth of data and intelligent technologies, the healthcare sector has opened up numerous technology-enabled services for patients, clinicians, and researchers. One major hurdle in achieving state-of-the-art results in health informatics is domain-specific terminologies and their semantic complexities. A knowledge graph crafted from medical concepts, events, and relationships acts as a medical semantic network to extract new links and hidden patterns from health data sources. Current medical knowledge graph construction studies are limited to generic techniques and opportunities and focus less on exploiting real-world data sources in knowledge graph construction. A knowledge graph constructed from Electronic Health Records (EHR) data obtains real-world data from healthcare records. It ensures better results in subsequent tasks like knowledge extraction and inference, knowledge graph completion, and medical knowledge graph applications such as diagnosis predictions, clinical recommendations, and clinical decision support. This review critically analyses existing works on medical knowledge graphs that used EHR data as the data source at the (i) representation level, (ii) extraction level, and (iii) completion level. In this investigation, we found that EHR-based knowledge graph construction involves challenges such as high complexity and dimensionality of data, lack of knowledge fusion, and dynamic update of the knowledge graph. In addition, the study presents possible ways to tackle the challenges identified. Our findings conclude that future research should focus on knowledge graph integration and knowledge graph completion challenges.
Affiliation(s)
- Lino Murali
  - Center for Research in Analytics and Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India; Division of Information technology, School of Engineering, Cochin University of Science and Technology, Kochi, 682022, Kerala, India
- G Gopakumar
  - Department of Computer Science and Engineering, School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India
- Daleesha M Viswanathan
  - Division of Information technology, School of Engineering, Cochin University of Science and Technology, Kochi, 682022, Kerala, India
- Prema Nedungadi
  - Center for Research in Analytics and Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India; Department of Computer Science and Engineering, School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India
10
Fu M, Yan Y, Olde Loohuis LM, Chang TS. Defining the distance between diseases using SNOMED CT embeddings. J Biomed Inform 2023; 139:104307. [PMID: 36738869 DOI: 10.1016/j.jbi.2023.104307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 12/10/2022] [Accepted: 01/29/2023] [Indexed: 02/05/2023]
Abstract
Characterizing disease relationships is essential to biomedical research to understand disease etiology and improve clinical decision-making. Measurements of distance between disease pairs enable valuable research tasks, such as subgrouping patients and identifying common time courses of disease onset. Distance metrics developed in prior work focused on smaller, targeted disease sets. Distance metrics covering all diseases have not yet been defined, which limits the applications to a broader disease spectrum. Our current study defines disease distances for all disease pairs within the International Classification of Diseases, version 10 (ICD-10), the diagnostic classification system universally used in electronic health records. Our proposed distance is computed based on a biomedical ontology, SNOMED CT (Systematized Nomenclature of Medicine, Clinical Terms), which can also be viewed as a structured knowledge graph. We compared the knowledge graph-based metric to three other distance metrics based on the hierarchical structure of ICD, clinical comorbidity, and genetic correlation, to evaluate how each may capture similar or unique aspects of disease relationships. We show that our knowledge graph-based distance metric captures known phenotypic, clinical, and molecular characteristics at a finer granularity than the other three. With the continued growth of using electronic health records data for research, we believe that our distance metric will play an important role in subgrouping patients for precision health, and enabling individualized disease prevention and treatments.
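As a rough illustration of how a distance between diseases can be derived from embeddings, the sketch below computes pairwise cosine distance between vectors keyed by ICD-10 codes. This is an assumption made for illustration: the abstract does not state which distance function the authors used, and the 3-dimensional vectors here are hypothetical toy values rather than learned SNOMED CT embeddings.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings keyed by ICD-10 code (values are made up).
CODE_VECTORS = {
    "E10": [0.9, 0.1, 0.2],  # type 1 diabetes mellitus
    "E11": [0.8, 0.2, 0.1],  # type 2 diabetes mellitus
    "I10": [0.1, 0.9, 0.3],  # essential hypertension
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0 means same direction, 1 orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_distances(vectors):
    """Map each unordered pair of codes to its cosine distance."""
    return {
        (a, b): cosine_distance(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


if __name__ == "__main__":
    for pair, dist in pairwise_distances(CODE_VECTORS).items():
        print(pair, round(dist, 3))
```

A distance table of this kind is the input that downstream tasks such as patient subgrouping would typically consume, for example through hierarchical clustering.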
Affiliation(s)
- Mingzhou Fu
  - Movement Disorders Program, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA; Medical Informatics Home Area, Department of Bioinformatics, University of California, Los Angeles, CA, USA
- Yu Yan
  - Medical Informatics Home Area, Department of Bioinformatics, University of California, Los Angeles, CA, USA
- Loes M Olde Loohuis
  - Center for Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Timothy S Chang
  - Movement Disorders Program, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA