1
|
Zhang Q, He Y, Lu YP, Wei QH, Zhang HY, Quan Y. GETdb: A comprehensive database for genetic and evolutionary features of drug targets. Comput Struct Biotechnol J 2024; 23:1429-1438. [PMID: 38616961 PMCID: PMC11015738 DOI: 10.1016/j.csbj.2024.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 03/25/2024] [Accepted: 04/01/2024] [Indexed: 04/16/2024] Open
Abstract
The development of an innovative drug is complex and time-consuming, and drug target identification is one of the critical steps in the drug discovery process. Effective and accurate identification of drug targets can accelerate the drug development process. According to previous research, evolutionary and genetic information of genes has been found to facilitate the identification of approved drug targets. In addition, allosteric proteins have great potential as targets due to their structural diversity. However, this information that could facilitate target identification has not been collated in existing drug target databases. Here, we construct a comprehensive drug target database named Genetic and Evolutionary features of drug Targets database (GETdb, http://zhanglab.hzau.edu.cn/GETdb/page/index.jsp). This database not only integrates and standardizes data from dozens of commonly used drug and target databases, but also innovatively includes the genetic and evolutionary information of targets. Moreover, this database features an effective allosteric protein prediction model. GETdb contains approximately 4000 targets and over 29,000 drugs, and is a user-friendly database for searching, browsing and downloading data to facilitate the development of novel targets.
Affiliation(s)
- Qi Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Yang He
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Ya-Ping Lu
- Sinopharm Genomics Technology Co., Ltd., Wuhan 430030, PR China
- Sinopharm Medical Laboratory (Wuhan) Co., Ltd., Wuhan 430030, PR China
| | - Qi-Hao Wei
- Sinopharm (Wuhan) Precision Medical Technology Co., Ltd., Wuhan 430030, PR China
| | - Hong-Yu Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Yuan Quan
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, PR China
| |
|
2
|
Du X, Sun X, Li M. Knowledge Graph Convolutional Network with Heuristic Search for Drug Repositioning. J Chem Inf Model 2024. [PMID: 38837744 DOI: 10.1021/acs.jcim.4c00737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2024]
Abstract
Drug repositioning is a strategy of repurposing approved drugs for treating new indications, which can accelerate the drug discovery process, reduce development costs, and lower the safety risk. The advancement of biotechnology has significantly accelerated the speed and scale of biological data generation, offering significant potential for drug repositioning through biomedical knowledge graphs that integrate diverse entities and relations from various biomedical sources. To fully learn the semantic information and topological structure information from the biological knowledge graph, we propose a knowledge graph convolutional network with a heuristic search, named KGCNH, which can effectively utilize the diversity of entities and relationships in biological knowledge graphs, as well as topological structure information, to predict the associations between drugs and diseases. Specifically, we design a relation-aware attention mechanism to compute the attention scores for each neighboring entity of a given entity under different relations. To address the challenge of randomness of the initial attention scores potentially impacting model performance and to expand the search scope of the model, we designed a heuristic search module based on Gumbel-Softmax, which uses attention scores as heuristic information and introduces randomness to assist the model in exploring more optimal embeddings of drugs and diseases. Following this module, we derive the relation weights, obtain the embeddings of drugs and diseases through neighborhood aggregation, and then predict drug-disease associations. Additionally, we employ feature-based augmented views to enhance model robustness and mitigate overfitting issues. We have implemented our method and conducted experiments on two data sets. The results demonstrate that KGCNH outperforms competing methods. 
In particular, case studies on lithium and quetiapine confirm that KGCNH can retrieve more actual drug-disease associations in the top prediction results.
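The Gumbel-Softmax step mentioned in this abstract can be illustrated with a minimal sketch. It is not the KGCNH implementation: the function name `gumbel_softmax`, the `temperature` parameter, and the treatment of attention scores as logits are assumptions made only for illustration.

```python
import math
import random


def gumbel_softmax(scores, temperature=1.0, rng=random):
    """Return a relaxed (soft) one-hot sample over the indices of `scores`.

    `scores` are treated as unnormalized logits (an assumption; the abstract
    only says attention scores serve as heuristic information).
    """
    noisy = []
    for s in scores:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((s + g) / temperature)
    # Numerically stable softmax over the perturbed scores.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical attention scores for three neighboring entities.
    print(gumbel_softmax([2.0, 1.0, 0.1], temperature=0.5))
```

Lower temperatures push the output toward a one-hot choice, while the added noise lets lower-scored neighbors be selected occasionally, which is the exploratory behavior the abstract describes.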
Affiliation(s)
- Xiang Du
- School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
| | - Xinliang Sun
- School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| |
|
3
|
Nandi S, Bhaduri S, Das D, Ghosh P, Mandal M, Mitra P. Deciphering the Lexicon of Protein Targets: A Review on Multifaceted Drug Discovery in the Era of Artificial Intelligence. Mol Pharm 2024; 21:1563-1590. [PMID: 38466810 DOI: 10.1021/acs.molpharmaceut.3c01161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Understanding protein sequence and structure is essential for understanding protein-protein interactions (PPIs), which are essential for many biological processes and diseases. Targeting protein binding hot spots, which regulate signaling and growth, with rational drug design is promising. Rational drug design uses structural data and computational tools to study protein binding sites and protein interfaces to design inhibitors that can change these interactions, thereby potentially leading to therapeutic approaches. Artificial intelligence (AI), such as machine learning (ML) and deep learning (DL), has advanced drug discovery and design by providing computational resources and methods. Quantum chemistry is essential for drug reactivity, toxicology, drug screening, and quantitative structure-activity relationship (QSAR) properties. This review discusses the methodologies and challenges of identifying and characterizing hot spots and binding sites. It also explores the strategies and applications of artificial-intelligence-based rational drug design technologies that target proteins and protein-protein interaction (PPI) binding hot spots. It provides valuable insights for drug design with therapeutic implications. We have also demonstrated the pathological conditions of heat shock protein 27 (HSP27) and matrix metalloproteinases (MMP2 and MMP9) and designed inhibitors of these proteins using the drug discovery paradigm in a case study on the discovery of drug molecules for cancer treatment. Additionally, the implications of benzothiazole derivatives for anticancer drug design and discovery are deliberated.
Affiliation(s)
- Suvendu Nandi
- School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Soumyadeep Bhaduri
- Centre for Computational and Data Sciences, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Debraj Das
- Centre for Computational and Data Sciences, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Priya Ghosh
- School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Mahitosh Mandal
- School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Pralay Mitra
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| |
|
4
|
Ryan DK, Maclean RH, Balston A, Scourfield A, Shah AD, Ross J. Artificial intelligence and machine learning for clinical pharmacology. Br J Clin Pharmacol 2024; 90:629-639. [PMID: 37845024 DOI: 10.1111/bcp.15930] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/07/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 10/18/2023] Open
Abstract
Artificial intelligence (AI) will impact many aspects of clinical pharmacology, including drug discovery and development, clinical trials, personalized medicine, pharmacogenomics, pharmacovigilance and clinical toxicology. The rapid progress of AI in healthcare means clinical pharmacologists should have an understanding of AI and its implementation in clinical practice. As with any new therapy or health technology, it is imperative that AI tools are subject to robust and stringent evaluation to ensure that they enhance clinical practice in a safe and equitable manner. This review serves as an introduction to AI for the clinical pharmacologist, highlighting current applications, aspects of model development and issues surrounding evaluation and deployment. The aim of this article is to empower clinical pharmacologists to embrace and lead on the safe and effective use of AI within healthcare.
Affiliation(s)
- David K Ryan
- Department of Clinical Pharmacology, University College London Hospitals NHS Foundation Trust, London, UK
| | - Rory H Maclean
- Department of Clinical Pharmacology, University College London Hospitals NHS Foundation Trust, London, UK
- Institute of Health Informatics, University College London, London, UK
| | - Alfred Balston
- Department of Clinical Pharmacology, Guy's and St Thomas' NHS Foundation Trust, London, UK
| | - Andrew Scourfield
- Department of Clinical Pharmacology, University College London Hospitals NHS Foundation Trust, London, UK
| | - Anoop D Shah
- Department of Clinical Pharmacology, University College London Hospitals NHS Foundation Trust, London, UK
- Institute of Health Informatics, University College London, London, UK
- National Institute for Health Research, University College London Hospitals Biomedical Research Centre, London, UK
| | - Jack Ross
- Department of Clinical Pharmacology, University College London Hospitals NHS Foundation Trust, London, UK
| |
|
5
|
Lin CX, Guan Y, Li HD. Artificial intelligence approaches for molecular representation in drug response prediction. Curr Opin Struct Biol 2024; 84:102747. [PMID: 38091924 DOI: 10.1016/j.sbi.2023.102747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/26/2023] [Accepted: 11/26/2023] [Indexed: 02/09/2024]
Abstract
Drug response prediction is essential for drug development and disease treatment. One key question in predicting drug response is the representation of molecules, which has been greatly advanced by artificial intelligence (AI) techniques in recent years. In this review, we first describe different types of representation methods, pinpointing their key principles and discussing their limitations. Thereafter, we discuss potential ways in which these methods could be further developed. We expect that this review will provide useful guidance for researchers in the community.
Affiliation(s)
- Cui-Xiang Lin
- School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, Hunan Province, PR China
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
| | - Hong-Dong Li
- School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, PR China.
| |
|
6
|
Han S, Lee JE, Kang S, So M, Jin H, Lee JH, Baek S, Jun H, Kim TY, Lee YS. Standigm ASK™: knowledge graph and artificial intelligence platform applied to target discovery in idiopathic pulmonary fibrosis. Brief Bioinform 2024; 25:bbae035. [PMID: 38349059 PMCID: PMC10862655 DOI: 10.1093/bib/bbae035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 12/28/2023] [Indexed: 02/15/2024] Open
Abstract
Standigm ASK™ revolutionizes healthcare by addressing the critical challenge of identifying pivotal target genes in disease mechanisms, a fundamental aspect of drug development success. Standigm ASK™ integrates a unique combination of a heterogeneous knowledge graph (KG) database and an attention-based neural network model, providing interpretable subgraph evidence. Empowering users through an interactive interface, Standigm ASK™ facilitates the exploration of predicted results. Applying Standigm ASK™ to idiopathic pulmonary fibrosis (IPF), a complex lung disease, we focused on genes (AMFR, MDFIC and NR5A2) identified through KG evidence. In vitro experiments demonstrated their relevance, as TGFβ treatment induced gene expression changes associated with epithelial-mesenchymal transition characteristics. Gene knockdown reversed these changes, identifying AMFR, MDFIC and NR5A2 as potential therapeutic targets for IPF. In summary, Standigm ASK™ emerges as an innovative KG and artificial intelligence platform driving insights in drug target discovery, exemplified by the identification and validation of therapeutic targets for IPF.
Affiliation(s)
- Seokjin Han
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Ji Eun Lee
- College of Pharmacy, Ewha Womans University, Ewhayeodae-gil, 03760, Seoul, Republic of Korea
| | - Seolhee Kang
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Minyoung So
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Hee Jin
- College of Pharmacy, Ewha Womans University, Ewhayeodae-gil, 03760, Seoul, Republic of Korea
| | - Jang Ho Lee
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Sunghyeob Baek
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Hyungjin Jun
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Tae Yong Kim
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Yun-Sil Lee
- College of Pharmacy, Ewha Womans University, Ewhayeodae-gil, 03760, Seoul, Republic of Korea
| |
|
7
|
Yang X, Huang K, Yang D, Zhao W, Zhou X. Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review. GLOBAL CHALLENGES (HOBOKEN, NJ) 2024; 8:2300163. [PMID: 38223896 PMCID: PMC10784210 DOI: 10.1002/gch2.202300163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 09/20/2023] [Indexed: 01/16/2024]
Abstract
The explosive growth of biomedical Big Data presents both significant opportunities and challenges in the realm of knowledge discovery and translational applications within precision medicine. Efficient management, analysis, and interpretation of big data can pave the way for groundbreaking advancements in precision medicine. However, the unprecedented strides in the automated collection of large-scale molecular and clinical data have also introduced formidable challenges in terms of data analysis and interpretation, necessitating the development of novel computational approaches. Some potential challenges include the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues. This overview article focuses on the recent progress and breakthroughs in the application of big data within precision medicine. Key aspects are summarized, including content, data sources, technologies, tools, challenges, and existing gaps. Nine fields are discussed: data warehouse and data management, electronic medical record, biomedical imaging informatics, artificial intelligence-aided surgical design and surgery optimization, omics data, health monitoring data, knowledge graph, public health informatics, and security and privacy.
Affiliation(s)
- Xue Yang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
| | - Kexin Huang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
| | - Dewei Yang
- College of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing, Chongqing 400000, China
| | - Weiling Zhao
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
| | - Xiaobo Zhou
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
| |
|
8
|
Boudin M, Diallo G, Drancé M, Mougin F. The OREGANO knowledge graph for computational drug repurposing. Sci Data 2023; 10:871. [PMID: 38057380 PMCID: PMC10700660 DOI: 10.1038/s41597-023-02757-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2023] [Accepted: 11/16/2023] [Indexed: 12/08/2023] Open
Abstract
Drug repositioning is a faster and more affordable solution than traditional drug discovery approaches. From this perspective, computational drug repositioning using knowledge graphs is a very promising direction. Knowledge graphs constructed from drug data and information can be used to generate hypotheses (molecule/drug-target links) through link prediction using machine learning algorithms. However, it remains rare to have a holistically constructed knowledge graph using the broadest possible features and drug characteristics, which is freely available to the community. The OREGANO knowledge graph aims at filling this gap. The purpose of this paper is to present the OREGANO knowledge graph, which includes natural compound-related data. The graph was developed from scratch by retrieving data directly from the knowledge sources to be integrated. We therefore designed the expected graph model and proposed a method for merging nodes between the different knowledge sources, and finally, the data were cleaned. The knowledge graph, as well as the source code for the ETL process, is openly available on the GitHub of the OREGANO project (https://gitub.u-bordeaux.fr/erias/oregano).
Affiliation(s)
- Marina Boudin
- AHeaD team, Bordeaux Population Health Inserm U1219, Univ. Bordeaux, F-33000, Bordeaux, France.
| | - Gayo Diallo
- AHeaD team, Bordeaux Population Health Inserm U1219, Univ. Bordeaux, F-33000, Bordeaux, France
| | - Martin Drancé
- AHeaD team, Bordeaux Population Health Inserm U1219, Univ. Bordeaux, F-33000, Bordeaux, France
| | - Fleur Mougin
- AHeaD team, Bordeaux Population Health Inserm U1219, Univ. Bordeaux, F-33000, Bordeaux, France
| |
|
9
|
Rushing BR, Thessen AE, Soliman GA, Ramesh A, Sumner SCJ. The Exposome and Nutritional Pharmacology and Toxicology: A New Application for Metabolomics. EXPOSOME 2023; 3:osad008. [PMID: 38766521 PMCID: PMC11101153 DOI: 10.1093/exposome/osad008] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/22/2024]
Abstract
The exposome refers to all of the internal and external life-long exposures that an individual experiences. These exposures, either acute or chronic, are associated with changes in metabolism that will positively or negatively influence the health and well-being of individuals. Nutrients and other dietary compounds modulate similar biochemical processes and have the potential in some cases to counteract the negative effects of exposures or enhance their beneficial effects. We present herein the concept of Nutritional Pharmacology/Toxicology which uses high-information metabolomics workflows to identify metabolic targets associated with exposures. Using this information, nutritional interventions can be designed toward those targets to mitigate adverse effects or enhance positive effects. We also discuss the potential for this approach in precision nutrition where nutrients/diet can be used to target gene-environment interactions and other subpopulation characteristics. Deriving these "nutrient cocktails" presents an opportunity to modify the effects of exposures for more beneficial outcomes in public health.
Affiliation(s)
- Blake R. Rushing
- Department of Nutrition, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Anne E Thessen
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Ghada A. Soliman
- Department of Environmental, Occupational and Geospatial Health Sciences, City University of New York-Graduate School of Public Health and Health Policy, New York, NY, USA
| | - Aramandla Ramesh
- Department of Biochemistry, Cancer Biology, Neuroscience & Pharmacology, Meharry Medical College, Nashville, TN, USA
| | - Susan CJ Sumner
- Department of Nutrition, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| |
|
10
|
Tan H, Wang Z, Hu G. GAABind: a geometry-aware attention-based network for accurate protein-ligand binding pose and binding affinity prediction. Brief Bioinform 2023; 25:bbad462. [PMID: 38102069 PMCID: PMC10724026 DOI: 10.1093/bib/bbad462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Revised: 11/19/2023] [Accepted: 11/22/2023] [Indexed: 12/17/2023] Open
Abstract
Protein-ligand interactions are increasingly profiled at high-throughput, playing a vital role in lead compound discovery and drug optimization. Accurate prediction of binding pose and binding affinity constitutes a pivotal challenge in advancing our computational understanding of protein-ligand interactions. However, inherent limitations still exist, including high computational cost for conformational search sampling in traditional molecular docking tools, and the unsatisfactory molecular representation learning and intermolecular interaction modeling in deep learning-based methods. Here we propose a geometry-aware attention-based deep learning model, GAABind, which effectively predicts the pocket-ligand binding pose and binding affinity within a multi-task learning framework. Specifically, GAABind comprehensively captures the geometric and topological properties of both binding pockets and ligands, and employs expressive molecular representation learning to model intramolecular interactions. Moreover, GAABind proficiently learns the intermolecular many-body interactions and simulates the dynamic conformational adaptations of the ligand during its interaction with the protein through meticulously designed networks. We trained GAABind on the PDBbindv2020 and evaluated it on the CASF2016 dataset; the results indicate that GAABind achieves state-of-the-art performance in binding pose prediction and shows comparable binding affinity prediction performance. Notably, GAABind achieves a success rate of 82.8% in binding pose prediction, and the Pearson correlation between predicted and experimental binding affinities reaches up to 0.803. Additionally, we assessed GAABind's performance on the severe acute respiratory syndrome coronavirus 2 main protease cross-docking dataset. 
In this evaluation, GAABind demonstrates a notable success rate of 76.5% in binding pose prediction and achieves the highest Pearson correlation coefficient in binding affinity prediction compared with all baseline methods.
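The two headline metrics in this abstract, the pose-prediction success rate and the Pearson correlation between predicted and experimental affinities, can be computed as in the sketch below. The function names are hypothetical, and the 2.0 Å RMSD cutoff for a "successful" pose is an assumption (a common docking convention); the abstract does not state the criterion used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def success_rate(rmsds, threshold=2.0):
    """Fraction of predicted poses whose RMSD is at or below `threshold`.

    The 2.0 angstrom default is an assumed convention, not taken from the paper.
    """
    return sum(1 for r in rmsds if r <= threshold) / len(rmsds)


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the paper.
    predicted = [6.1, 7.4, 5.0, 8.2]
    experimental = [6.0, 7.0, 5.5, 8.0]
    print(pearson_r(predicted, experimental))
    print(success_rate([1.0, 2.5, 1.5, 3.0]))
```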
Affiliation(s)
- Huishuang Tan
- Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Zhixin Wang
- Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute of Molecular Enzymology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215123, China
| | - Guang Hu
- MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Key Laboratory of Pathogen Bioscience and Anti-infective Medicine, Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, Suzhou 215123, China
| |
|
11
|
Fu C, Huang Z, van Harmelen F, He T, Jiang X. Food4healthKG: Knowledge graphs for food recommendations based on gut microbiota and mental health. Artif Intell Med 2023; 145:102677. [PMID: 37925207 DOI: 10.1016/j.artmed.2023.102677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 08/05/2023] [Accepted: 10/03/2023] [Indexed: 11/06/2023]
Abstract
Food is increasingly acknowledged as a powerful means to promote and maintain mental health. The introduction of the gut-brain axis has been instrumental in understanding the impact of food on mental health. It is widely reported that food can significantly influence gut microbiota metabolism, thereby playing a pivotal role in maintaining mental health. However, the vast amount of heterogeneous data published in recent research lacks systematic integration and application development. To remedy this, we construct a comprehensive knowledge graph, named Food4healthKG, focusing on food, gut microbiota, and mental diseases. The constructed workflow includes the integration of numerous heterogeneous data, entity linking to a normalized format, and the well-designed representation of the acquired knowledge. To illustrate the availability of Food4healthKG, we design two case studies: the knowledge query and the food recommendation based on Food4healthKG. Furthermore, we propose two evaluation methods to validate the quality of the results obtained from Food4healthKG. The results demonstrate the system's effectiveness in practical applications, particularly in providing convincing food recommendations based on gut microbiota and mental health. Food4healthKG is accessible at https://github.com/ccszbd/Food4healthKG.
Affiliation(s)
- Chengcheng Fu
- National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China; School of Computer Science, Central China Normal University, Wuhan, China; Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; National Language Resources Monitor Research Center for Network Media, Central China Normal University, Wuhan, China
| | - Zhisheng Huang
- Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Clinical Research Center for Mental Disorders, Shanghai Pudong New Area Mental Health Center, Tongji University School of Medicine, Shanghai, China; Deep Blue Technology Group, Shanghai, China
| | - Frank van Harmelen
- Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Tingting He
- School of Computer Science, Central China Normal University, Wuhan, China; National Language Resources Monitor Research Center for Network Media, Central China Normal University, Wuhan, China
| | - Xingpeng Jiang
- School of Computer Science, Central China Normal University, Wuhan, China; National Language Resources Monitor Research Center for Network Media, Central China Normal University, Wuhan, China.
| |
|
12
|
Lou P, Fang A, Zhao W, Yao K, Yang Y, Hu J. Potential Target Discovery and Drug Repurposing for Coronaviruses: Study Involving a Knowledge Graph-Based Approach. J Med Internet Res 2023; 25:e45225. [PMID: 37862061 PMCID: PMC10592722 DOI: 10.2196/45225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 08/30/2023] [Accepted: 09/22/2023] [Indexed: 10/21/2023] Open
Abstract
BACKGROUND The global pandemics of severe acute respiratory syndrome, Middle East respiratory syndrome, and COVID-19 have caused unprecedented crises for public health. Coronaviruses are constantly evolving, and it is unknown which new coronavirus will emerge and when the next coronavirus will sweep across the world. Knowledge graphs are expected to help discover the pathogenicity and transmission mechanism of viruses. OBJECTIVE The aim of this study was to discover potential targets and candidate drugs to repurpose for coronaviruses through a knowledge graph-based approach. METHODS We propose a computational and evidence-based knowledge discovery approach to identify potential targets and candidate drugs for coronaviruses from biomedical literature and well-known knowledge bases. To organize the semantic triples extracted automatically from biomedical literature, a semantic conversion model was designed. The literature knowledge was associated and integrated with existing drug and gene knowledge through semantic mapping, and the coronavirus knowledge graph (CovKG) was constructed. We adopted both the knowledge graph embedding model and the semantic reasoning mechanism to discover unrecorded mechanisms of drug action as well as potential targets and drug candidates. Furthermore, we have provided evidence-based support with a scoring and backtracking mechanism. RESULTS The constructed CovKG contains 17,369,620 triples, of which 641,195 were extracted from biomedical literature, covering 13,065 concept unique identifiers, 209 semantic types, and 97 semantic relations of the Unified Medical Language System. Through multi-source knowledge integration, 475 drugs and 262 targets were mapped to existing knowledge, and 41 new drug mechanisms of action were found by semantic reasoning, which were not recorded in the existing knowledge base. Among the knowledge graph embedding models, TransR outperformed others (mean reciprocal rank=0.2510, Hits@10=0.3505). 
A total of 33 potential targets and 18 drug candidates were identified for coronaviruses. Among them, 7 novel drugs (ie, quinine, nelfinavir, ivermectin, asunaprevir, tylophorine, Artemisia annua extract, and resveratrol) and 3 highly ranked targets (ie, angiotensin converting enzyme 2, transmembrane serine protease 2, and M protein) were further discussed. CONCLUSIONS We showed the effectiveness of a knowledge graph-based approach in potential target discovery and drug repurposing for coronaviruses. Our approach can be extended to other viruses or diseases for biomedical knowledge discovery and relevant applications.
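The embedding-model comparison in this abstract is reported as mean reciprocal rank (MRR) and Hits@10. A minimal sketch of both metrics follows; it assumes each test triple has already been reduced to the 1-based rank of its true entity among all candidates, and the function names are hypothetical.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of test triples whose true entity is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    # Toy ranks for four test triples; not data from the study.
    ranks = [1, 2, 4, 20]
    print(mean_reciprocal_rank(ranks))
    print(hits_at_k(ranks, k=10))
```

Both metrics lie in the interval (0, 1], and higher is better, so the reported TransR values (MRR 0.2510, Hits@10 0.3505) mean that roughly 35% of held-out triples had the correct entity in the top ten.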
Affiliation(s)
- Pei Lou
- Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- An Fang
- Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Wanqing Zhao
- Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Kuanda Yao
- Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Yusheng Yang
- Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Jiahui Hu
- Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
|
13
|
Zhu C, Xia X, Li N, Zhong F, Yang Z, Liu L. RDKG-115: Assisting drug repurposing and discovery for rare diseases by trimodal knowledge graph embedding. Comput Biol Med 2023; 164:107262. [PMID: 37481946 DOI: 10.1016/j.compbiomed.2023.107262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 07/07/2023] [Accepted: 07/16/2023] [Indexed: 07/25/2023]
Abstract
Rare diseases (RDs) may affect individuals in small numbers, but they have a significant impact on a global scale. Accurate diagnosis of RDs is challenging, and there is a severe lack of drugs available for treatment. Because RD drug development involves high investment, high risk, and a long cycle, pharmaceutical companies have shown a preference for repurposing existing drugs developed for other diseases. Compared to traditional approaches, knowledge graph embedding (KGE) based methods are more efficient and convenient, as they treat drug repurposing as a link prediction task. KGE models allow for the enrichment of existing knowledge by incorporating multimodal information from various sources. In this study, we constructed RDKG-115, a rare disease knowledge graph involving 115 RDs, composed of 35,643 entities, 25 relations, and 5,539,839 refined triplets, based on 372,384 high-quality publications and 4 biomedical datasets: DRKG, Pathway Commons, PharmKG, and PMapp. Subsequently, we developed a trimodal KGE model containing structure, category, and description embeddings using reverse-hyperplane projection. We utilized this model to infer 4199 reliable new triplets from RDKG-115. Finally, we calculated potential drugs and small molecules for each of the 115 RDs, taking multiple sclerosis as a case study. This study provides a paradigm for large-scale screening of drug repurposing and discovery for RDs, which will speed up the drug development process and ultimately benefit patients with RDs. The source code and data are available at https://github.com/ZhuChaoY/RDKG-115.
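To make the idea of drug repurposing as a link prediction task concrete, here is a minimal sketch that ranks candidate tail entities for a (head, relation, ?) query. It swaps in a plain TransE-style translation score rather than the trimodal reverse-hyperplane model described in the abstract, and all entity names and two-dimensional embeddings are hypothetical toy values.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def transe_score(head: Vector, relation: Vector, tail: Vector) -> float:
    """TransE-style plausibility: negative L2 distance of (head + relation) to tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head: Vector, relation: Vector,
               candidates: Dict[str, Vector]) -> List[Tuple[str, float]]:
    """Score every candidate tail and sort from most to least plausible."""
    scored = [(name, transe_score(head, relation, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy embeddings (assumed values, not learned from any real knowledge graph).
drug = (1.0, 0.0)
treats = (0.0, 1.0)
diseases = {
    "disease_A": (1.0, 1.0),
    "disease_B": (3.0, 3.0),
    "disease_C": (1.0, 2.0),
}

ranking = rank_tails(drug, treats, diseases)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a real pipeline the embeddings would be learned from the graph's triplets, and the top-ranked unseen pairs would be the repurposing candidates to examine further.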
Affiliation(s)
- Chaoyu Zhu
- Intelligent Medicine Institute, Shanghai Medical College, Fudan University, Shanghai, 200032, China
- Xiaoqiong Xia
- Intelligent Medicine Institute, Shanghai Medical College, Fudan University, Shanghai, 200032, China
- Nan Li
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Fan Zhong
- Intelligent Medicine Institute, Shanghai Medical College, Fudan University, Shanghai, 200032, China
- Zhihao Yang
- College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Lei Liu
- Intelligent Medicine Institute, Shanghai Medical College, Fudan University, Shanghai, 200032, China; Shanghai Institute of Stem Cell Research and Clinical Translation, Shanghai, 200120, China
|
14
|
Loguercio S, Calverley BC, Wang C, Shak D, Zhao P, Sun S, Budinger GS, Balch WE. Understanding the host-pathogen evolutionary balance through Gaussian process modeling of SARS-CoV-2. Patterns (New York, N.Y.) 2023; 4:100800. [PMID: 37602209 PMCID: PMC10436005 DOI: 10.1016/j.patter.2023.100800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 02/22/2023] [Accepted: 06/22/2023] [Indexed: 08/22/2023]
Abstract
We have developed a machine learning (ML) approach using Gaussian process (GP)-based spatial covariance (SCV) to track the impact of spatial-temporal mutational events driving host-pathogen balance in biology. We show how SCV can be applied to understanding the response of evolving covariant relationships linking the variant pattern of virus spread to pathology for the entire SARS-CoV-2 genome on a daily basis. We show that GP-based SCV relationships in conjunction with genome-wide co-occurrence analysis provide an early warning anomaly detection (EWAD) system for the emergence of variants of concern (VOCs). EWAD can anticipate changes in the pattern of performance of spread and pathology weeks in advance, identifying signatures destined to become VOCs. GP-based analyses of variation across entire viral genomes can be used to monitor micro and macro features responsible for host-pathogen balance. The versatility of GP-based SCV defines a starting point for understanding nature's evolutionary path to complexity through natural selection.
Affiliation(s)
- Ben C. Calverley
- Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Chao Wang
- Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Daniel Shak
- Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Pei Zhao
- Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Shuhong Sun
- Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- G.R. Scott Budinger
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Chicago, IL, USA
- William E. Balch
- Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
|
15
|
Evangelista JE, Clarke DJB, Xie Z, Marino GB, Utti V, Jenkins SL, Ahooyi TM, Bologa CG, Yang JJ, Binder JL, Kumar P, Lambert CG, Grethe JS, Wenger E, Taylor D, Oprea TI, de Bono B, Ma'ayan A. Toxicology knowledge graph for structural birth defects. Communications Medicine 2023; 3:98. [PMID: 37460679 DOI: 10.1038/s43856-023-00329-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/20/2022] [Accepted: 06/29/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND Birth defects are functional and structural abnormalities that impact about 1 in 33 births in the United States. They have been attributed to genetic and other factors such as drugs, cosmetics, food, and environmental pollutants during pregnancy, but for most birth defects there are no known causes. METHODS To further characterize associations between small molecule compounds and their potential to induce specific birth abnormalities, we gathered knowledge from multiple sources to construct a reproductive toxicity Knowledge Graph (ReproTox-KG) with a focus on associations between birth defects, drugs, and genes. Specifically, we gathered drug/birth-defect associations from co-mentions in published abstracts, gene/birth-defect associations from genetic studies, drug- and preclinical-compound-induced gene expression changes in cell lines, known drug targets, genetic burden scores for human genes, and placental crossing scores for small molecules. RESULTS Using ReproTox-KG and semi-supervised learning (SSL), we scored >30,000 preclinical small molecules for their potential to cross the placenta and induce birth defects, and identified >500 birth-defect/gene/drug cliques that can be used to explain molecular mechanisms for drug-induced birth defects. The ReproTox-KG can be accessed via a web-based user interface available at https://maayanlab.cloud/reprotox-kg. This site enables users to explore the associations between birth defects, approved and preclinical drugs, and all human genes. CONCLUSIONS ReproTox-KG provides a resource for exploring knowledge about the molecular mechanisms of birth defects with the potential of predicting the likelihood of genes and preclinical small molecules to induce birth defects.
Affiliation(s)
- John Erol Evangelista
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Daniel J B Clarke
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Zhuorui Xie
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Giacomo B Marino
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Vivian Utti
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Sherry L Jenkins
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Taha Mohseni Ahooyi
- The Children's Hospital of Philadelphia, Department of Biomedical and Health Informatics; Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Cristian G Bologa
- Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Jeremy J Yang
- Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Jessica L Binder
- Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Praveen Kumar
- Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Christophe G Lambert
- Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Jeffrey S Grethe
- Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Eric Wenger
- The Children's Hospital of Philadelphia, Department of Biomedical and Health Informatics; Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Deanne Taylor
- The Children's Hospital of Philadelphia, Department of Biomedical and Health Informatics; Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Tudor I Oprea
- Department of Internal Medicine, Division of Translational Informatics, University of New Mexico, Albuquerque, NM, 87131, USA
- Bernard de Bono
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Avi Ma'ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
|
16
|
Aldughayfiq B, Ashfaq F, Jhanjhi NZ, Humayun M. Capturing Semantic Relationships in Electronic Health Records Using Knowledge Graphs: An Implementation Using MIMIC III Dataset and GraphDB. Healthcare (Basel) 2023; 11:1762. [PMID: 37372880 DOI: 10.3390/healthcare11121762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Revised: 06/03/2023] [Accepted: 06/12/2023] [Indexed: 06/29/2023] Open
Abstract
Electronic health records (EHRs) are an increasingly important source of information for healthcare professionals and researchers. However, EHRs are often fragmented, unstructured, and difficult to analyze due to the heterogeneity of the data sources and the sheer volume of information. Knowledge graphs have emerged as a powerful tool for capturing and representing complex relationships within large datasets. In this study, we explore the use of knowledge graphs to capture and represent complex relationships within EHRs. Specifically, we address the following research question: Can a knowledge graph created using the MIMIC III dataset and GraphDB effectively capture semantic relationships within EHRs and enable more efficient and accurate data analysis? We map the MIMIC III dataset to an ontology using text refinement and Protégé; then, we create a knowledge graph using GraphDB and use SPARQL queries to retrieve and analyze information from the graph. Our results demonstrate that knowledge graphs can effectively capture semantic relationships within EHRs, enabling more efficient and accurate data analysis. We provide examples of how our implementation can be used to analyze patient outcomes and identify potential risk factors, contributing to the growing body of literature on the use of knowledge graphs in healthcare. In particular, our study highlights the potential of knowledge graphs to support decision-making and improve patient outcomes by enabling a more comprehensive and holistic analysis of EHR data. Overall, our research contributes to a better understanding of the value of knowledge graphs in healthcare and lays the foundation for further research in this area.
Affiliation(s)
- Bader Aldughayfiq
- Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
- Farzeen Ashfaq
- School of Computer Science-SCS, Taylor's University, Subang Jaya 47500, Malaysia
- N Z Jhanjhi
- School of Computer Science-SCS, Taylor's University, Subang Jaya 47500, Malaysia
- Mamoona Humayun
- Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
|
17
|
Quan Y, Xiong ZK, Zhang KX, Zhang QY, Zhang W, Zhang HY. Evolution-strengthened knowledge graph enables predicting the targetability and druggability of genes. PNAS Nexus 2023; 2:pgad147. [PMID: 37188275 PMCID: PMC10178923 DOI: 10.1093/pnasnexus/pgad147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 04/21/2023] [Indexed: 05/17/2023]
Abstract
Identifying promising targets is a critical step in modern drug discovery, and causative genes of diseases are an important source of successful targets. Previous studies have found that the pathogeneses of various diseases are closely related to the evolutionary events of organisms. Accordingly, evolutionary knowledge can facilitate the prediction of causative genes and further accelerate target identification. With the development of modern biotechnology, massive biomedical data have been accumulated, and knowledge graphs (KGs) have emerged as a powerful approach for integrating and utilizing vast amounts of data. In this study, we constructed an evolution-strengthened knowledge graph (ESKG) and validated applications of ESKG in the identification of causative genes. More importantly, we developed an ESKG-based machine learning model named GraphEvo, which can effectively predict the targetability and the druggability of genes. We further investigated the explainability of the ESKG in druggability prediction by dissecting the evolutionary hallmarks of successful targets. Our study highlights the importance of evolutionary knowledge in biomedical research and demonstrates the potential power of ESKG in promising target identification. The data set of ESKG and the code of GraphEvo can be downloaded from https://github.com/Zhankun-Xiong/GraphEvo.
Affiliation(s)
- Ke-Xin Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, P. R. China
- Qing-Ye Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, P. R. China
- Wen Zhang
- To whom correspondence should be addressed
|
18
|
Dong W, Yang Q, Wang J, Xu L, Li X, Luo G, Gao X. Multi-modality attribute learning-based method for drug-protein interaction prediction based on deep neural network. Brief Bioinform 2023; 24:7145903. [PMID: 37114624 DOI: 10.1093/bib/bbad161] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/05/2022] [Revised: 03/19/2023] [Accepted: 04/02/2023] [Indexed: 04/29/2023] Open
Abstract
Identification of active candidate compounds for target proteins, also called drug-protein interaction (DPI) prediction, is an essential but time-consuming and expensive step in drug discovery. In recent years, deep network-based learning methods have frequently been proposed for DPI prediction due to their powerful capability of feature representation. However, the performance of existing DPI methods is still limited by insufficiently labeled pharmacological data and neglected intermolecular information. Therefore, overcoming these difficulties to improve the performance of DPI prediction is an urgent challenge for researchers. In this article, we designed an innovative 'multi-modality attributes' learning-based framework for DPI prediction with molecular transformer and graph convolutional networks, termed multi-modality attributes (MMA)-DPI. Specifically, intermolecular sub-structural information and chemical semantic representations were extracted through an augmented transformer module from biomedical data. A tri-layer graph convolutional neural network module was applied to associate the neighbor topology information and learn the condensed dimensional features by aggregating a heterogeneous network that contains multiple biological representations of drugs, proteins, diseases and side effects. Then, the learned representations were taken as the input of a fully connected neural network module to further integrate them in molecular and topological space. Finally, the attribute representations were fused with adaptive learning weights to calculate the interaction score for the DPI tasks. MMA-DPI was evaluated in different experimental conditions and the results demonstrate that the proposed method achieved higher performance than existing state-of-the-art frameworks.
Affiliation(s)
- Weihe Dong
- College of Information and Computer Engineering, Northeast Forestry University, Hexing Road, 150040, Harbin, China
- Qiang Yang
- School of Computer Science and Technology, Heilongjiang University, Xuefu Road, 150080, Harbin, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Xuefu Road, 150080, Harbin, China
- Jian Wang
- College of Information and Computer Engineering, Northeast Forestry University, Hexing Road, 150040, Harbin, China
- Long Xu
- School of Computer Science and Technology, Heilongjiang University, Xuefu Road, 150080, Harbin, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Xuefu Road, 150080, Harbin, China
- Xiaokun Li
- School of Computer Science and Technology, Heilongjiang University, Xuefu Road, 150080, Harbin, China
- Postdoctoral Program of Heilongjiang Hengxun Technology Co., Ltd., Xuefu Road, 150080, Harbin, China
- Gongning Luo
- Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
- School of Computer Science and Technology, Harbin Institute of Technology, West Dazhi Street, 150001, Harbin, China
- Xin Gao
- Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal 23955, Saudi Arabia
|
19
|
Peng C, Xia F, Naseriparsa M, Osborne F. Knowledge Graphs: Opportunities and Challenges. Artif Intell Rev 2023; 56:1-32. [PMID: 37362886 PMCID: PMC10068207 DOI: 10.1007/s10462-023-10465-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/09/2023] [Indexed: 04/05/2023]
Abstract
With the explosive growth of artificial intelligence (AI) and big data, it has become vitally important to organize and represent the enormous volume of knowledge appropriately. As graph data, knowledge graphs accumulate and convey knowledge of the real world. It has been well recognized that knowledge graphs effectively represent complex information; hence, they have rapidly gained the attention of academia and industry in recent years. Thus, to develop a deeper understanding of knowledge graphs, this paper presents a systematic overview of this field. Specifically, we focus on the opportunities and challenges of knowledge graphs. We first review the opportunities of knowledge graphs in terms of two aspects: (1) AI systems built upon knowledge graphs; (2) potential application fields of knowledge graphs. Then, we thoroughly discuss severe technical challenges in this field, such as knowledge graph embeddings, knowledge acquisition, knowledge graph completion, knowledge fusion, and knowledge reasoning. We expect that this survey will shed new light on future research and the development of knowledge graphs.
Affiliation(s)
- Ciyuan Peng
- Institute of Innovation, Science and Sustainability, Federation University Australia, Ballarat, 3353 VIC Australia
- Feng Xia
- School of Computing Technologies, RMIT University, Melbourne, 3000 VIC Australia
- Mehdi Naseriparsa
- Global Professional School, Federation University Australia, Ballarat, 3353 VIC Australia
- Francesco Osborne
- Knowledge Media Institute, The Open University, Milton Keynes, MK7 6AA UK
|
20
|
Li MM, Huang K, Zitnik M. Graph representation learning in biomedicine and healthcare. Nat Biomed Eng 2022; 6:1353-1369. [PMID: 36316368 PMCID: PMC10699434 DOI: 10.1038/s41551-022-00942-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 10/30/2021] [Accepted: 08/09/2022] [Indexed: 11/11/2022]
Abstract
Networks, or graphs, are universal descriptors of systems of interacting elements. In biomedicine and healthcare, they can represent, for example, molecular interactions, signalling pathways, disease co-morbidities or healthcare systems. In this Perspective, we posit that representation learning can realize principles of network medicine, discuss successes and current limitations of the use of representation learning on graphs in biomedicine and healthcare, and outline algorithmic strategies that leverage the topology of graphs to embed them into compact vectorial spaces. We argue that graph representation learning will keep pushing forward machine learning for biomedicine and healthcare applications, including the identification of genetic variants underlying complex traits, the disentanglement of single-cell behaviours and their effects on health, the assistance of patients in diagnosis and treatment, and the development of safe and effective medicines.
Affiliation(s)
- Michelle M Li
- Bioinformatics and Integrative Genomics Program, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Kexin Huang
- Health Data Science Program, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
|
21
|
Khobragade A, Mahajan R, Langi H, Mundhe R, Ghumbre S. Effective negative triplet sampling for knowledge graph embedding. Journal of Information & Optimization Sciences 2022. [DOI: 10.1080/02522667.2022.2133215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Affiliation(s)
- Anish Khobragade
- Department of Computer Engineering and Information Technology, College of Engineering Pune, Savitribai Phule Pune University, Pune, Maharashtra 411005, India
- Rushikesh Mahajan
- Department of Computer Engineering and Information Technology, College of Engineering Pune, Savitribai Phule Pune University, Pune, Maharashtra 411005, India
- Hrithik Langi
- Department of Computer Engineering and Information Technology, College of Engineering Pune, Savitribai Phule Pune University, Pune, Maharashtra 411005, India
- Rohit Mundhe
- Department of Computer Engineering and Information Technology, College of Engineering Pune, Savitribai Phule Pune University, Pune, Maharashtra 411005, India
- Shashikant Ghumbre
- Department of Computer Engineering, Government College of Engineering and Research, Avasari Khurd, Pune, Maharashtra 412405, India
|
22
|
Feng Z, Shen Z, Li H, Li S. e-TSN: an interactive visual exploration platform for target-disease knowledge mapping from literature. Brief Bioinform 2022; 23:6809962. [PMID: 36347537 PMCID: PMC9677481 DOI: 10.1093/bib/bbac465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 09/20/2022] [Accepted: 09/27/2022] [Indexed: 11/10/2022] Open
Abstract
Target discovery and identification processes are driven by the increasing amount of biomedical data. The vast numbers of unstructured texts of biomedical publications provide a rich source of knowledge for drug target discovery research and demand the development of specific algorithms or tools to facilitate finding disease genes and proteins. Text mining is a method that can automatically mine helpful information related to drug target discovery from massive biomedical literature. However, there is a substantial lag between biomedical publications and the subsequent abstraction of information extracted by text mining to databases. The knowledge graph is introduced to integrate heterogeneous biomedical data. Here, we describe e-TSN (Target significance and novelty explorer, http://www.lilab-ecust.cn/etsn/), a knowledge visualization web server integrating the largest database of associations between targets and diseases from the full scientific literature by constructing significance and novelty scoring methods based on bibliometric statistics. The platform aims to visualize target-disease knowledge graphs to assist in prioritizing candidate disease-related proteins. Approved drugs and associated bioactivities for each target of interest are also provided to facilitate the visualization of drug-target relationships. In summary, e-TSN is a fast and customizable visualization resource for investigating and analyzing the intricate target-disease networks, which could help researchers understand the mechanisms underlying complex disease phenotypes and improve the efficiency of drug discovery and development, especially for the unexpected outbreak of infectious disease pandemics like COVID-19.
Affiliation(s)
- Ziyan Feng
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Zihao Shen
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Honglin Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China; Innovation Center for AI and Drug Discovery, East China Normal University, Shanghai 200062, China; Lingang Laboratory, Shanghai 200031, China
- Shiliang Li
- Corresponding author: Shiliang Li, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China; Innovation Center for AI and Drug Discovery, East China Normal University, Shanghai 200062, China
|
23
|
Wang H, Zu Q, Lu M, Chen R, Yang Z, Gao Y, Ding J. Application of Medical Knowledge Graphs in Cardiology and Cardiovascular Medicine: A Brief Literature Review. Adv Ther 2022; 39:4052-4060. [PMID: 35908002 PMCID: PMC9402764 DOI: 10.1007/s12325-022-02254-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/03/2022] [Accepted: 06/29/2022] [Indexed: 11/28/2022]
Abstract
A knowledge graph is defined as a collection of interlinked descriptions of concepts, relationships, entities and events. Medical knowledge graphs are among the most recent advances in technology, therapy and medicine. Nowadays, a number of specific uses and applications rely on knowledge graphs. The application of knowledge graphs, another form of artificial intelligence (AI), in cardiology and cardiovascular medicine is a new concept, and only a few studies have been carried out on this particular aspect. In this brief literature review, the use and importance of disease-specific knowledge graphs in exploring various aspects of Kawasaki disease were described. A vision of individualized knowledge graphs (iKGs) in cardiovascular medicine was also discussed. Such iKGs would be based on a modern informatics platform of exchange and inquiry that could comprehensively integrate biologic knowledge with medical histories and health outcomes of individual patients. This could transform how clinicians and scientists discover, communicate and apply new knowledge. In addition, we described how a study based on the comprehensive longitudinal evaluation of dietary factors associated with acute myocardial infarction and fatal coronary heart disease used a knowledge graph to show the dietary factors associated with cardiovascular diseases in Nurses’ Health Study data. To conclude, in this fast-developing world, medical knowledge graphs have emerged as attractive methods of data storage and hypothesis generation. They could be a major and effective tool in cardiology and cardiovascular medicine and play an important role in reaching effective clinical decisions during treatment and management of patients in the cardiology department.
Collapse
Affiliation(s)
- Hong Wang
- Department of Cardiology, The People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, 530021, Guangxi, People's Republic of China; Jinan University, Guangzhou, 510632, Guangdong, People's Republic of China
- Quannan Zu
- College of Management and Economics, Tianjin University, Tianjin, 300072, People's Republic of China
- Ming Lu
- College of Management and Economics, Tianjin University, Tianjin, 300072, People's Republic of China
- Rongfa Chen
- The State Key Laboratory Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People's Republic of China
- Zhiren Yang
- The State Key Laboratory Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People's Republic of China
- Yongqiang Gao
- Department of Cardiology, The People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, 530021, Guangxi, People's Republic of China
- Jiawang Ding
- Department of Internal Medicine, Beijing Chaoyang Hospital, Chaoyang, Beijing, 100020, People's Republic of China
|
24
|
Das S, Taylor K, Beaulah S, Gardner S. Systematic indication extension for drugs using patient stratification insights generated by combinatorial analytics. Patterns (New York, N.Y.) 2022; 3:100496. [PMID: 35755863 PMCID: PMC9214305 DOI: 10.1016/j.patter.2022.100496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Indication extension or repositioning of drugs can, if done well, provide a faster, cheaper, and derisked route to the approval of new therapies, creating new options to address pockets of unmet medical need for patients and offering the potential for significant commercial and clinical benefits. We look at the promises and challenges of different repositioning strategies and the disease insights and scalability that new high-resolution patient stratification methodologies can bring. This is exemplified by a systematic analysis of all development candidates and on-market drugs, which identified 477 indication extension opportunities across 30 chronic disease areas, each supported by patient stratification biomarkers. This illustrates the potential that new artificial intelligence (AI) and combinatorial analytics methods have to enhance the rate and cost of innovation across the drug discovery industry.
Affiliation(s)
- Sayoni Das
- PrecisionLife, Unit 8b Bankside, Hanborough Business Park, Long Hanborough OX29 8LJ, UK
- Krystyna Taylor
- PrecisionLife, Unit 8b Bankside, Hanborough Business Park, Long Hanborough OX29 8LJ, UK
- Simon Beaulah
- PrecisionLife, Unit 8b Bankside, Hanborough Business Park, Long Hanborough OX29 8LJ, UK
- Steve Gardner
- PrecisionLife, Unit 8b Bankside, Hanborough Business Park, Long Hanborough OX29 8LJ, UK
|
25
|
Domingo-Fernández D, Gadiya Y, Patel A, Mubeen S, Rivas-Barragan D, Diana CW, Misra BB, Healey D, Rokicki J, Colluru V. Causal reasoning over knowledge graphs leveraging drug-perturbed and disease-specific transcriptomic signatures for drug discovery. PLoS Comput Biol 2022; 18:e1009909. [PMID: 35213534 PMCID: PMC8906585 DOI: 10.1371/journal.pcbi.1009909] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/21/2021] [Revised: 03/09/2022] [Accepted: 02/09/2022] [Indexed: 12/29/2022] Open
Abstract
Network-based approaches are becoming increasingly popular for drug discovery as they provide a systems-level overview of the mechanisms underlying disease pathophysiology. They have demonstrated significant early promise over other methods of biological data representation, such as in target discovery, side effect prediction and drug repurposing. In parallel, an explosion of -omics data for the deep characterization of biological systems routinely uncovers molecular signatures of disease for similar applications. Here, we present RPath, a novel algorithm that prioritizes drugs for a given disease by reasoning over causal paths in a knowledge graph (KG), guided by both drug-perturbed as well as disease-specific transcriptomic signatures. First, our approach identifies the causal paths that connect a drug to a particular disease. Next, it reasons over these paths to identify those that correlate with the transcriptional signatures observed in a drug-perturbation experiment, and anti-correlate to signatures observed in the disease of interest. The paths which match this signature profile are then proposed to represent the mechanism of action of the drug. We demonstrate how RPath consistently prioritizes clinically investigated drug-disease pairs on multiple datasets and KGs, achieving better performance over other similar methodologies. Furthermore, we present two case studies showing how one can deconvolute the predictions made by RPath as well as predict novel targets.
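The path-reasoning idea summarized in this abstract can be illustrated with a minimal sketch. This is not the authors' RPath implementation: the toy graph, the node names (`drugA`, `G1`, `diseaseX`), the +1/-1 encoding of signatures, and the functions `simple_paths`, `path_is_concordant`, and `score_drug` are all hypothetical, chosen only to show how causal paths might be filtered by agreement with a drug signature and disagreement with a disease signature.

```python
from typing import Dict, List, Tuple

# Signed, directed toy graph (hypothetical): +1 = activates, -1 = inhibits.
Graph = Dict[str, List[Tuple[str, int]]]
Path = List[Tuple[str, int]]


def simple_paths(graph: Graph, source: str, target: str, max_len: int = 5) -> List[Path]:
    """Enumerate acyclic paths with at most max_len edges.

    Each path is a list of (node, predicted_direction), where the predicted
    direction is the product of the edge signs from the source to that node.
    """
    results: List[Path] = []
    stack: List[Path] = [[(source, 1)]]
    while stack:
        path = stack.pop()
        node, sign = path[-1]
        if node == target:
            results.append(path)
            continue
        if len(path) > max_len:  # already max_len edges long, stop extending
            continue
        visited = {n for n, _ in path}
        for nxt, edge_sign in graph.get(node, []):
            if nxt not in visited:
                stack.append(path + [(nxt, sign * edge_sign)])
    return results


def path_is_concordant(path: Path, drug_sig: Dict[str, int], disease_sig: Dict[str, int]) -> bool:
    """Keep a path only if every measured intermediate node matches the drug
    signature and is opposite to the disease signature."""
    for node, predicted in path[1:-1]:
        if node in drug_sig and drug_sig[node] != predicted:
            return False
        if node in disease_sig and disease_sig[node] != -predicted:
            return False
    return True


def score_drug(graph: Graph, drug: str, disease: str,
               drug_sig: Dict[str, int], disease_sig: Dict[str, int]) -> float:
    """Fraction of drug-to-disease paths that pass the signature check
    (an assumed scoring rule, used here for illustration only)."""
    paths = simple_paths(graph, drug, disease)
    if not paths:
        return 0.0
    kept = [p for p in paths if path_is_concordant(p, drug_sig, disease_sig)]
    return len(kept) / len(paths)


# Hypothetical toy data: +1 = up-regulated, -1 = down-regulated.
toy_graph: Graph = {
    "drugA": [("G1", -1), ("G2", 1)],
    "G1": [("G3", 1)],
    "G2": [("G3", 1)],
    "G3": [("diseaseX", 1)],
}
toy_drug_sig = {"G1": -1, "G2": 1, "G3": -1}
toy_disease_sig = {"G1": 1, "G3": 1}

print(score_drug(toy_graph, "drugA", "diseaseX", toy_drug_sig, toy_disease_sig))
```

In this toy example, only the path through `G1` survives, because the path through `G2` predicts `G3` going up while the assumed drug signature shows it going down.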
Affiliation(s)
- Yojana Gadiya
- Enveda Biosciences, Boulder, Colorado, United States of America
- Abhishek Patel
- Enveda Biosciences, Boulder, Colorado, United States of America
- Sarah Mubeen
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Chris W. Diana
- Enveda Biosciences, Boulder, Colorado, United States of America
- David Healey
- Enveda Biosciences, Boulder, Colorado, United States of America
- Joe Rokicki
- Enveda Biosciences, Boulder, Colorado, United States of America
- Viswa Colluru
- Enveda Biosciences, Boulder, Colorado, United States of America
|