Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Belleau F, Nolin MA, Tourigny N, Rigault P, Morissette J. Bio2RDF: Towards a mashup to build bioinformatics knowledge systems. J Biomed Inform 2008;41:706-16. [PMID: 18472304 DOI: 10.1016/j.jbi.2008.03.004] [Citation(s) in RCA: 281] [Impact Index Per Article: 17.6] [Received: 09/01/2007] [Revised: 03/02/2008] [Accepted: 03/14/2008] [Indexed: 11/15/2022]

Cited by Other Article(s)

Devarakonda MV, Mohanty S, Sunkishala RR, Mallampalli N, Liu X. Clinical trial recommendations using Semantics-Based inductive inference and knowledge graph embeddings. J Biomed Inform 2024;154:104627. [PMID: 38561170 DOI: 10.1016/j.jbi.2024.104627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 02/06/2024] [Accepted: 03/20/2024] [Indexed: 04/04/2024]

Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA, Hunter LE. An open source knowledge graph ecosystem for the life sciences. Sci Data 2024;11:363. [PMID: 38605048 PMCID: PMC11009265 DOI: 10.1038/s41597-024-03171-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/21/2024] [Indexed: 04/13/2024] Open

Affiliation(s)

Tiffany J Callahan Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
Ignacio J Tripodi Computer Science Department, Interdisciplinary Quantitative Biology, University of Colorado Boulder, Boulder, CO, 80301, USA
Adrianne L Stefanski Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
Luca Cappelletti AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Sanya B Taneja Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
Jordan M Wyrwa Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
Elena Casiraghi AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy. Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Nicolas A Matentzoglu Semanticly, Athens, Greece
Justin Reese Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Jonathan C Silverstein Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
Charles Tapley Hoyt Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, 02115, USA
Richard D Boyce Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
Scott A Malec Division of Translational Informatics, University of New Mexico School of Medicine, Albuquerque, NM, 87131, USA
Deepak R Unni SIB Swiss Institute of Bioinformatics, Basel, Switzerland
Marcin P Joachimiak Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Peter N Robinson Berlin Institute of Health at Charité-Universitätsmedizin, 10117, Berlin, Germany
Christopher J Mungall Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Emanuele Cavalleri AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Tommaso Fontana AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Giorgio Valentini AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy. ELLIS, European Laboratory for Learning and Intelligent Systems, Milan Unit, Italy
Marco Mesiti AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Lucas A Gillenwater Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Brook Santangelo Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Nicole A Vasilevsky Data Collaboration Center, Critical Path Institute, 1840 E River Rd. Suite 100, Tucson, AZ, 85718, USA
Robert Hoehndorf Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
Tellen D Bennett Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA. Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Patrick B Ryan Janssen Research and Development, Raritan, NJ, 08869, USA
George Hripcsak Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
Michael G Kahn Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Michael Bada Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
William A Baumgartner Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
Lawrence E Hunter Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.

Verma G, Rebholz-Schuhmann D, Madden MG. Enabling personalised disease diagnosis by combining a patient's time-specific gene expression profile with a biomedical knowledge base. BMC Bioinformatics 2024;25:62. [PMID: 38326757 PMCID: PMC10848462 DOI: 10.1186/s12859-024-05674-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 01/25/2024] [Indexed: 02/09/2024] Open

Abstract

BACKGROUND

Recent developments in the domain of biomedical knowledge bases (KBs) open up new ways to exploit biomedical knowledge that is available in the form of KBs. Significant work has been done in the direction of biomedical KB creation and KB completion, specifically, those having gene-disease associations and other related entities. However, the use of such biomedical KBs in combination with patients' temporal clinical data still largely remains unexplored, but has the potential to immensely benefit medical diagnostic decision support systems.

RESULTS

We propose two new algorithms, LOADDx and SCADDx, to combine a patient's gene expression data with gene-disease association and other related information available in the form of a KB, to assist personalized disease diagnosis. We have tested both of the algorithms on two KBs and on four real-world gene expression datasets of respiratory viral infection caused by Influenza-like viruses of 19 subtypes. We also compare the performance of proposed algorithms with that of five existing state-of-the-art machine learning algorithms (k-NN, Random Forest, XGBoost, Linear SVM, and SVM with RBF Kernel) using two validation approaches: LOOCV and a single internal validation set. Both SCADDx and LOADDx outperform the existing algorithms when evaluated with both validation approaches. SCADDx is able to detect infections with up to 100% accuracy in the cases of Datasets 2 and 3. Overall, SCADDx and LOADDx are able to detect an infection within 72 h of infection with 91.38% and 92.66% average accuracy respectively considering all four datasets, whereas XGBoost, which performed best among the existing machine learning algorithms, can detect the infection with only 86.43% accuracy on average.

CONCLUSIONS

We demonstrate how our novel idea of using the most and least differentially expressed genes in combination with a KB can enable identification of the diseases that a patient is most likely to have at a particular time, from a KB with thousands of diseases. Moreover, the proposed algorithms can provide a short ranked list of the most likely diseases for each patient along with their most affected genes, and other entities linked with them in the KB, which can support health care professionals in their decision-making.

Daza D, Alivanistos D, Mitra P, Pijnenburg T, Cochez M, Groth P. BioBLP: a modular framework for learning on multimodal biomedical knowledge graphs. J Biomed Semantics 2023;14:20. [PMID: 38066573 PMCID: PMC10709903 DOI: 10.1186/s13326-023-00301-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 11/29/2023] [Indexed: 12/18/2023] Open

Abstract

BACKGROUND

Knowledge graphs (KGs) are an important tool for representing complex relationships between entities in the biomedical domain. Several methods have been proposed for learning embeddings that can be used to predict new links in such graphs. Some methods ignore valuable attribute data associated with entities in biomedical KGs, such as protein sequences, or molecular graphs. Other works incorporate such data, but assume that entities can be represented with the same data modality. This is not always the case for biomedical KGs, where entities exhibit heterogeneous modalities that are central to their representation in the subject domain.

OBJECTIVE

We aim to understand how to incorporate multimodal data into biomedical KG embeddings, and analyze the resulting performance in comparison with traditional methods. We propose a modular framework for learning embeddings in KGs with entity attributes, that allows encoding attribute data of different modalities while also supporting entities with missing attributes. We additionally propose an efficient pretraining strategy for reducing the required training runtime. We train models using a biomedical KG containing approximately 2 million triples, and evaluate the performance of the resulting entity embeddings on the tasks of link prediction, and drug-protein interaction prediction, comparing against methods that do not take attribute data into account.

RESULTS

In the standard link prediction evaluation, the proposed method results in competitive, yet lower performance than baselines that do not use attribute data. When evaluated in the task of drug-protein interaction prediction, the method compares favorably with the baselines. Further analyses show that incorporating attribute data does outperform baselines over entities below a certain node degree, comprising approximately 75% of the diseases in the graph. We also observe that optimizing attribute encoders is a challenging task that increases optimization costs. Our proposed pretraining strategy yields significantly higher performance while reducing the required training runtime.

CONCLUSION

BioBLP makes it possible to investigate different ways of incorporating multimodal biomedical data for learning representations in KGs. With a particular implementation, we find that incorporating attribute data does not consistently outperform baselines, but improvements are obtained on a comparatively large subset of entities below a specific node-degree. Our results indicate a potential for improved performance in scientific discovery tasks where understudied areas of the KG would benefit from link prediction methods.

Boudin M, Diallo G, Drancé M, Mougin F. The OREGANO knowledge graph for computational drug repurposing. Sci Data 2023;10:871. [PMID: 38057380 PMCID: PMC10700660 DOI: 10.1038/s41597-023-02757-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2023] [Accepted: 11/16/2023] [Indexed: 12/08/2023] Open

Pascazio L, Rihm S, Naseri A, Mosbach S, Akroyd J, Kraft M. Chemical Species Ontology for Data Integration and Knowledge Discovery. J Chem Inf Model 2023;63:6569-6586. [PMID: 37883649 PMCID: PMC10647085 DOI: 10.1021/acs.jcim.3c00820] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/06/2023] [Revised: 10/13/2023] [Accepted: 10/13/2023] [Indexed: 10/28/2023]

Diaz Benavides S, Cardoso SD, Da Silveira M, Pruski C. Analysis and implementation of the DynDiff tool when comparing versions of ontology. J Biomed Semantics 2023;14:15. [PMID: 37770956 PMCID: PMC10537977 DOI: 10.1186/s13326-023-00295-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Accepted: 09/09/2023] [Indexed: 09/30/2023] Open

Abstract

BACKGROUND

Ontologies play a key role in the management of medical knowledge because they have the properties to support a wide range of knowledge-intensive tasks. The dynamic nature of knowledge requires frequent changes to the ontologies to keep them up-to-date. The challenge is to understand and manage these changes and their impact on dependent systems well in order to handle the growing volume of data annotated with ontologies and the limited documentation describing the changes.

METHODS

We present a method to detect and characterize the changes occurring between different versions of an ontology together with an ontology of changes entitled DynDiffOnto, designed according to Semantic Web best practices and FAIR principles. We further describe the implementation of the method and the evaluation of the tool with different ontologies from the biomedical domain (i.e. ICD9-CM, MeSH, NCIt, SNOMEDCT, GO, IOBC and CIDO), showing its performance in terms of time execution and capacity to classify ontological changes, compared with other state-of-the-art approaches.

RESULTS

The experiments show a top-level performance of DynDiff for large ontologies and a good performance for smaller ones, with respect to execution time and capability to identify complex changes. In this paper, we further highlight the impact of ontology matchers on the diff computation and the possibility to parameterize the matcher in DynDiff, enabling the possibility of benefits from state-of-the-art matchers.

CONCLUSION

DynDiff is an efficient tool to compute differences between ontology versions and classify these differences according to DynDiffOnto concepts. This work also contributes to a better understanding of ontological changes through DynDiffOnto, which was designed to express the semantics of the changes between versions of an ontology and can be used to document the evolution of an ontology.

Rongen S, Nikolova N, van der Pas M. Modelling with AAS and RDF in Industry 4.0. Comput Ind 2023. [DOI: 10.1016/j.compind.2023.103910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]

Van Woensel W, Tu SW, Michalowski W, Sibte Raza Abidi S, Abidi S, Alonso JR, Bottrighi A, Carrier M, Edry R, Hochberg I, Rao M, Kingwell S, Kogan A, Marcos M, Martínez Salvador B, Michalowski M, Piovesan L, Riaño D, Terenziani P, Wilk S, Peleg M. A Community-of-Practice-based Evaluation Methodology for Knowledge Intensive Computational Methods and its Application to Multimorbidity Decision Support. J Biomed Inform 2023;142:104395. [PMID: 37201618 DOI: 10.1016/j.jbi.2023.104395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2022] [Revised: 04/25/2023] [Accepted: 05/15/2023] [Indexed: 05/20/2023]

Abstract

OBJECTIVE

The study has dual objectives. Our first objective (1) is to develop a community-of-practice-based evaluation methodology for knowledge-intensive computational methods. We target a whitebox analysis of the computational methods to gain insight on their functional features and inner workings. In more detail, we aim to answer evaluation questions on (i) support offered by computational methods for functional features within the application domain; and (ii) in-depth characterizations of the underlying computational processes, models, data and knowledge of the computational methods. Our second objective (2) involves applying the evaluation methodology to answer questions (i) and (ii) for knowledge-intensive clinical decision support (CDS) methods, which operationalize clinical knowledge as computer interpretable guidelines (CIG); we focus on multimorbidity CIG-based clinical decision support (MGCDS) methods that target multimorbidity treatment plans.

MATERIALS AND METHODS

Our methodology directly involves the research community of practice in (a) identifying functional features within the application domain; (b) defining exemplar case studies covering these features; and (c) solving the case studies using their developed computational methods, with research groups detailing their solutions and functional feature support in solution reports. Next, the study authors (d) perform a qualitative analysis of the solution reports, identifying and characterizing common themes (or dimensions) among the computational methods. This methodology is well suited to perform whitebox analysis, as it directly involves the respective developers in studying inner workings and feature support of computational methods. Moreover, the established evaluation parameters (e.g., features, case studies, themes) constitute a re-usable benchmark framework, which can be used to evaluate new computational methods as they are developed. We applied our community-of-practice-based evaluation methodology on MGCDS methods.

RESULTS

Six research groups submitted comprehensive solution reports for the exemplar case studies. Solutions for two of these case studies were reported by all groups. We identified four evaluation dimensions: detection of adverse interactions, management strategy representation, implementation paradigms, and human-in-the-loop support. Based on our whitebox analysis, we present answers to the evaluation questions (i) and (ii) for MGCDS methods.

DISCUSSION

The proposed evaluation methodology includes features of illuminative and comparison-based approaches; focusing on understanding rather than judging/scoring or identifying gaps in current methods. It involves answering evaluation questions with direct involvement of the research community of practice, who participate in setting up evaluation parameters and solving exemplar case studies. Our methodology was successfully applied to evaluate six MGCDS knowledge-intensive computational methods. We established that, while the evaluated methods provide a multifaceted set of solutions with different benefits and drawbacks, no single MGCDS method currently provides a comprehensive solution for MGCDS.

CONCLUSION

We posit that our evaluation methodology, applied here to gain new insights into MGCDS, can be used to assess other types of knowledge-intensive computational methods and answer other types of evaluation questions. Our case studies can be accessed at our GitHub repository (https://github.com/william-vw/MGCDS).

Touré V, Krauss P, Gnodtke K, Buchhorn J, Unni D, Horki P, Raisaro JL, Kalt K, Teixeira D, Crameri K, Österle S. FAIRification of health-related data using semantic web technologies in the Swiss Personalized Health Network. Sci Data 2023;10:127. [PMID: 36899064 PMCID: PMC10006404 DOI: 10.1038/s41597-023-02028-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Accepted: 02/17/2023] [Indexed: 03/12/2023] Open

Zhou F, Uddin S. Interpretable Drug-to-Drug Network Features for Predicting Adverse Drug Reactions. Healthcare (Basel) 2023;11:healthcare11040610. [PMID: 36833144 PMCID: PMC9957267 DOI: 10.3390/healthcare11040610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/29/2023] [Accepted: 02/06/2023] [Indexed: 02/22/2023] Open

Das P, Mazumder DH. An extensive survey on the use of supervised machine learning techniques in the past two decades for prediction of drug side effects. Artif Intell Rev 2023;56:1-28. [PMID: 36819660 PMCID: PMC9930028 DOI: 10.1007/s10462-023-10413-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2023] [Indexed: 02/19/2023]

Artificial Intelligence and Data Mining for the Pharmacovigilance of Drug-Drug Interactions. Clin Ther 2023;45:117-133. [PMID: 36732152 DOI: 10.1016/j.clinthera.2023.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Revised: 12/15/2022] [Accepted: 01/09/2023] [Indexed: 02/01/2023]

Systematic Construction of Knowledge Graphs for Research-Performing Organizations. Information 2022. [DOI: 10.3390/info13120562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022] Open

Bonner S, Barrett IP, Ye C, Swiers R, Engkvist O, Bender A, Hoyt CT, Hamilton WL. A review of biomedical datasets relating to drug discovery: a knowledge graph perspective. Brief Bioinform 2022;23:6712301. [PMID: 36151740 DOI: 10.1093/bib/bbac404] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 05/12/2022] [Revised: 07/14/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022] Open

Ikeda S, Ono H, Ohta T, Chiba H, Naito Y, Moriya Y, Kawashima S, Yamamoto Y, Okamoto S, Goto S, Katayama T. TogoID: an exploratory ID converter to bridge biological datasets. Bioinformatics 2022;38:4194-4199. [PMID: 35801937 PMCID: PMC9438948 DOI: 10.1093/bioinformatics/btac491] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/07/2022] [Revised: 06/08/2022] [Accepted: 07/07/2022] [Indexed: 12/24/2022] Open

Affiliation(s)

Shuya Ikeda
Hiromasa Ono
Tazro Ohta Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Hirokazu Chiba Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Yuki Naito Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Yuki Moriya Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Shuichi Kawashima Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Yasunori Yamamoto Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Shinobu Okamoto Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Susumu Goto Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, University of Tokyo Kashiwanoha-campus Station Satellite 6F, Kashiwa, Chiba 277-0871, Japan
Toshiaki Katayama To whom correspondence should be addressed.

Zong N, Li N, Wen A, Ngo V, Yu Y, Huang M, Chowdhury S, Jiang C, Fu S, Weinshilboum R, Jiang G, Hunter L, Liu H. BETA: a comprehensive benchmark for computational drug-target prediction. Brief Bioinform 2022;23:6596989. [PMID: 35649342 PMCID: PMC9294420 DOI: 10.1093/bib/bbac199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Revised: 04/10/2022] [Accepted: 04/29/2022] [Indexed: 11/14/2022] Open

Gurupur VP. Key observations in terms of management of electronic health records from a mHealth perspective. Mhealth 2022;8:18. [PMID: 35449505 PMCID: PMC9014234 DOI: 10.21037/mhealth-21-39] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Accepted: 01/11/2022] [Indexed: 11/06/2022] Open

Alshahrani M, Almansour A, Alkhaldi A, Thafar MA, Uludag M, Essack M, Hoehndorf R. Combining biomedical knowledge graphs and text to improve predictions for drug-target interactions and drug-indications. PeerJ 2022;10:e13061. [PMID: 35402106 PMCID: PMC8988936 DOI: 10.7717/peerj.13061] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/28/2020] [Accepted: 02/13/2022] [Indexed: 01/11/2023] Open

An Entity Recognition Model Based on Deep Learning Fusion of Text Feature. Inf Process Manag 2022. [DOI: 10.1016/j.ipm.2021.102841] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]

Han X, Xie R, Li X, Li J. SmileGNN: Drug–Drug Interaction Prediction Based on the SMILES and Graph Neural Network. Life (Basel) 2022;12:life12020319. [PMID: 35207606 PMCID: PMC8879716 DOI: 10.3390/life12020319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Revised: 12/27/2021] [Accepted: 01/05/2022] [Indexed: 11/16/2022] Open

Angioni S, Salatino A, Osborne F, Recupero DR, Motta E. AIDA: A knowledge graph about research dynamics in academia and industry. Quantitative Science Studies 2022. [DOI: 10.1162/qss_a_00162] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022] Open

Larmande P, Tagny Ngompe G, Venkatesan A, Ruiz M. AgroLD: A Knowledge Graph Database for Plant Functional Genomics. Methods Mol Biol 2022;2443:527-540. [PMID: 35037225 DOI: 10.1007/978-1-0716-2067-0_28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]

A semantic framework supporting multilayer networks analysis for rare diseases. Int J Semant Web Inf 2022. [DOI: 10.4018/ijswis.297141] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]

Nayyeri M, Cil GM, Vahdati S, Osborne F, Rahman M, Angioni S, Salatino A, Recupero DR, Vassilyeva N, Motta E, Lehmann J. Trans4E: Link prediction on scholarly knowledge graphs. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.02.100] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 10/21/2022]

Wang M, Wang H, Liu X, Ma X, Wang B. Drug-Drug Interaction Predictions via Knowledge Graph and Text Embedding: Instrument Validation Study. JMIR Med Inform 2021;9:e28277. [PMID: 34185011 PMCID: PMC8277366 DOI: 10.2196/28277] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/28/2021] [Revised: 04/29/2021] [Accepted: 05/05/2021] [Indexed: 11/23/2022] Open

Abstract

Background

Minimizing adverse reactions caused by drug-drug interactions (DDIs) has always been a prominent research topic in clinical pharmacology. Detecting all possible interactions through clinical studies before a drug is released to the market is a demanding task. The power of big data is opening up new approaches to discovering various DDIs. However, these data contain a huge amount of noise and provide knowledge bases that are far from being complete or used with reliability. Most existing studies focus on predicting binary DDIs between drug pairs and ignore other interactions.

Objective

Leveraging both drug knowledge graphs and biomedical text is a promising pathway for rich and comprehensive DDI prediction, but it is not without issues. Our proposed model seeks to address the following challenges: data noise and incompleteness, data sparsity, and computational complexity.

Methods

We propose a novel framework, Predicting Rich DDI, to predict DDIs. The framework uses graph embedding to overcome data incompleteness and sparsity issues to make multiple DDI label predictions. First, a large-scale drug knowledge graph is generated from different sources. The knowledge graph is then embedded with comprehensive biomedical text into a common low-dimensional space. Finally, the learned embeddings are used to efficiently compute rich DDI information through a link prediction process.

Results

To validate the effectiveness of the proposed framework, extensive experiments were conducted on real-world data sets. The results demonstrate that our model outperforms several state-of-the-art baseline methods in terms of capability and accuracy.

Conclusions

We propose a novel framework, Predicting Rich DDI, to predict DDIs. Using rich DDI information, it can competently predict multiple labels for a pair of drugs across numerous domains, ranging from pharmacological mechanisms to side effects. To the best of our knowledge, this framework is the first to provide a joint translation-based embedding model that learns DDIs by integrating drug knowledge graphs and biomedical text simultaneously in a common low-dimensional space. The model also predicts DDIs using multiple labels rather than single or binary labels. Extensive experiments were conducted on real-world data sets to demonstrate the effectiveness and efficiency of the model. The results show our proposed framework outperforms several state-of-the-art baselines.
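
As a rough illustration of what "translation-based embedding" and "link prediction" mean in general, the sketch below scores a (head, relation, tail) triple by how closely head + relation lands on tail, in the style of the generic TransE idea. It is not the authors' Predicting Rich DDI model: the entity names, label names, and 2-dimensional vectors are hypothetical placeholders, and a real system would learn the embeddings from a knowledge graph and text.

```python
import math

# Hypothetical, hand-picked 2-dimensional embeddings (assumption for illustration only;
# real models learn these vectors from data).
entity_vecs = {
    "drug_a": [0.0, 0.0],
    "drug_b": [1.0, 0.0],
    "drug_c": [0.0, 2.0],
}
relation_vecs = {
    "label_x": [1.0, 0.0],
    "label_y": [0.0, 1.0],
}


def translation_score(head, relation, tail):
    """Return the negative Euclidean distance between (head + relation) and tail.

    Scores closer to zero mean the triple is considered more plausible.
    """
    h = entity_vecs[head]
    r = relation_vecs[relation]
    t = entity_vecs[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_labels(head, tail):
    """Rank all relation labels for a drug pair, most plausible first."""
    return sorted(
        relation_vecs,
        key=lambda rel: translation_score(head, rel, tail),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank_labels("drug_a", "drug_b"))
    print(rank_labels("drug_a", "drug_c"))
```

Ranking every candidate label for a pair, rather than thresholding a single score, is one simple way to obtain multiple labels per drug pair instead of a binary answer.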

Galgonek J, Vondrášek J. IDSM ChemWebRDF: SPARQLing small-molecule datasets. J Cheminform 2021;13:38. [PMID: 33980298 PMCID: PMC8117646 DOI: 10.1186/s13321-021-00515-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2021] [Accepted: 04/23/2021] [Indexed: 11/12/2022] Open

Abstract

The Resource Description Framework (RDF), together with well-defined ontologies, significantly increases data interoperability and usability. The SPARQL query language was introduced to retrieve requested RDF data and to explore links between them. Among other useful features, SPARQL supports federated queries that combine multiple independent data source endpoints. This allows users to obtain insights that are not possible using only a single data source. Owing to all of these useful features, many biological and chemical databases present their data in RDF, and support SPARQL querying. In our project, we primarily focused on PubChem, ChEMBL and ChEBI small-molecule datasets. These datasets are already being exported to RDF by their creators. However, none of them has an official and currently supported SPARQL endpoint. This omission makes it difficult to construct complex or federated queries that could access all of the datasets, thus underutilising the main advantage of the availability of RDF data. Our goal is to address this gap by integrating the datasets into one database called the Integrated Database of Small Molecules (IDSM) that will be accessible through a SPARQL endpoint. Beyond that, we will also focus on increasing mutual interoperability of the datasets. To realise the endpoint, we decided to implement an in-house developed SPARQL engine based on the PostgreSQL relational database for data storage. In our approach, data are stored in the traditional relational form, and the SPARQL engine translates incoming SPARQL queries into equivalent SQL queries. An important feature of the engine is that it optimises the resulting SQL queries. Together with optimisations performed by PostgreSQL, this allows efficient evaluations of SPARQL queries. The endpoint provides not only querying in the dataset, but also the compound substructure and similarity search supported by our Sachem project. Although the endpoint is accessible from an internet browser, it is mainly intended to be used for programmatic access by other services, for example as a part of federated queries. For regular users, we offer a rich web application called ChemWebRDF using the endpoint. The application is publicly available at https://idsm.elixir-czech.cz/chemweb/.

Recent trends in knowledge graphs: theory and practice. Soft comput 2021. [DOI: 10.1007/s00500-021-05756-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

MacLean F. Knowledge graphs and their applications in drug discovery. Expert Opin Drug Discov 2021;16:1057-1069. [PMID: 33843398 DOI: 10.1080/17460441.2021.1910673] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

Chen Y, Ma T, Yang X, Wang J, Song B, Zeng X. MUFFIN: Multi-Scale Feature Fusion for Drug-Drug Interaction Prediction. Bioinformatics 2021;37:2651-2658. [PMID: 33720331 DOI: 10.1093/bioinformatics/btab169] [Citation(s) in RCA: 73] [Impact Index Per Article: 24.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2020] [Revised: 02/05/2021] [Accepted: 03/11/2021] [Indexed: 01/08/2023] Open

Abstract

MOTIVATION

Adverse drug-drug interactions (DDIs) are crucial for drug research and mainly cause morbidity and mortality. Thus, the identification of potential DDIs is essential for doctors, patients, and the society. Existing traditional machine learning models rely heavily on handcraft features and lack generalization. Recently, the deep learning approaches that can automatically learn drug features from the molecular graph or drug-related network have improved the ability of computational models to predict unknown DDIs. However, previous works utilized large labeled data and merely considered the structure or sequence information of drugs without considering the relations or topological information between drug and other biomedical objects (e.g., gene, disease, and pathway), or considered knowledge graph (KG) without considering the information from the drug molecular structure.

RESULTS

Accordingly, to effectively explore the joint effect of drug molecular structure and semantic information of drugs in knowledge graph for DDI prediction, we propose a multi-scale feature fusion deep learning model named MUFFIN. MUFFIN can jointly learn the drug representation based on both the drug-self structure information and the KG with rich bio-medical information. In MUFFIN, we designed a bi-level cross strategy that includes cross- and scalar-level components to fuse multi-modal features well. MUFFIN can alleviate the restriction of limited labeled data on deep learning models by crossing the features learned from large-scale KG and drug molecular graph. We evaluated our approach on three datasets and three different tasks including binary-class, multi-class, and multi-label DDI prediction tasks. The results showed that MUFFIN outperformed other state-of-the-art baselines.

AVAILABILITY

The source code and data are available at https://github.com/xzenglab/MUFFIN.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Biswas S, Mitra P, Rao KS. Relation Prediction of Co-Morbid Diseases Using Knowledge Graph Completion. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:708-717. [PMID: 31295118 DOI: 10.1109/tcbb.2019.2927310] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Irshad O, Ghani Khan MU. Formalization and Semantic Integration of Heterogeneous Omics Annotations for Exploratory Searches. Curr Bioinform 2021. [DOI: 10.2174/1574893615666200127122818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Abstract

Aim

To facilitate researchers and practitioners for unveiling the mysterious functional aspects of human cellular system through performing exploratory searching on semantically integrated heterogeneous and geographically dispersed omics annotations.

Background

Improving health standards of life is one of the motives which continuously instigates researchers and practitioners to strive for uncovering the mysterious aspects of human cellular system. Inferring new knowledge from known facts always requires reasonably large amount of data in well-structured, integrated and unified form. Due to the advent of especially high throughput and sensor technologies, biological data is growing heterogeneously and geographically at astronomical rate. Several data integration systems have been deployed to cope with the issues of data heterogeneity and global dispersion. Systems based on semantic data integration models are more flexible and expandable than syntax-based ones but still lack aspect-based data integration, persistence and querying. Furthermore, these systems do not fully support to warehouse biological entities in the form of semantic associations as naturally possessed by the human cell.

Objective

To develop aspect-oriented formal data integration model for semantically integrating heterogeneous and geographically dispersed omics annotations for providing exploratory querying on integrated data.

Method

We propose an aspect-oriented formal data integration model which uses web semantics standards to formally specify its each construct. Proposed model supports aspect-oriented representation of biological entities while addressing the issues of data heterogeneity and global dispersion. It associates and warehouses biological entities in the way they relate with

Result

To show the significance of proposed model, we developed a data warehouse and information retrieval system based on proposed model compliant multi-layered and multi-modular software architecture. Results show that our model supports well for gathering, associating, integrating, persisting and querying each entity with respect to its all possible aspects within or across the various associated omics layers.

Conclusion

Formal specifications better facilitate for addressing data integration issues by providing formal means for understanding omics data based on meaning instead of syntax.

Rickett CD, Maschhoff KJ, Sukumar SR. Does tetanus vaccination contribute to reduced severity of the COVID-19 infection? Med Hypotheses 2021;146:110395. [PMID: 33341328 PMCID: PMC7695568 DOI: 10.1016/j.mehy.2020.110395] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2020] [Accepted: 11/06/2020] [Indexed: 02/09/2023]

Marín-Llaó J, Mubeen S, Perera-Lluna A, Hofmann-Apitius M, Picart-Armada S, Domingo-Fernández D. MultiPaths: a python framework for analyzing multi-layer biological networks using diffusion algorithms. Bioinformatics 2020;37:137-139. [PMID: 33367476 PMCID: PMC8034528 DOI: 10.1093/bioinformatics/btaa1069] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2020] [Revised: 11/23/2020] [Accepted: 12/14/2020] [Indexed: 11/13/2022] Open

Zheng S, Rao J, Song Y, Zhang J, Xiao X, Fang EF, Yang Y, Niu Z. PharmKG: a dedicated knowledge graph benchmark for bomedical data mining. Brief Bioinform 2020;22:6042240. [PMID: 33341877 DOI: 10.1093/bib/bbaa344] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2020] [Revised: 10/12/2020] [Accepted: 10/28/2020] [Indexed: 12/11/2022] Open

Rossanez A, Dos Reis JC, Torres RDS, de Ribaupierre H. KGen: a knowledge graph generator from biomedical scientific literature. BMC Med Inform Decis Mak 2020;20:314. [PMID: 33317512 PMCID: PMC7734730 DOI: 10.1186/s12911-020-01341-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2020] [Accepted: 11/17/2020] [Indexed: 11/26/2022] Open

Abstract

Background

Knowledge is often produced from data generated in scientific investigations. An ever-growing number of scientific studies in several domains result into a massive amount of data, from which obtaining new knowledge requires computational help. For example, Alzheimer’s Disease, a life-threatening degenerative disease that is not yet curable. As the scientific community strives to better understand it and find a cure, great amounts of data have been generated, and new knowledge can be produced. A proper representation of such knowledge brings great benefits to researchers, to the scientific community, and consequently, to society.

Methods

In this article, we study and evaluate a semi-automatic method that generates knowledge graphs (KGs) from biomedical texts in the scientific literature. Our solution explores natural language processing techniques with the aim of extracting and representing scientific literature knowledge encoded in KGs. Our method links entities and relations represented in KGs to concepts from existing biomedical ontologies available on the Web. We demonstrate the effectiveness of our method by generating KGs from unstructured texts obtained from a set of abstracts taken from scientific papers on the Alzheimer’s Disease. We involve physicians to compare our extracted triples from their manual extraction via their analysis of the abstracts. The evaluation further concerned a qualitative analysis by the physicians of the generated KGs with our software tool.

Results

The experimental results indicate the quality of the generated KGs. The proposed method extracts a great amount of triples, showing the effectiveness of our rule-based method employed in the identification of relations in texts. In addition, ontology links are successfully obtained, which demonstrates the effectiveness of the ontology linking method proposed in this investigation.

Conclusions

We demonstrate that our proposal is effective on building ontology-linked KGs representing the knowledge obtained from biomedical scientific texts. Such representation can add value to the research in various domains, enabling researchers to compare the occurrence of concepts from different studies. The KGs generated may pave the way to potential proposal of new theories based on data analysis to advance the state of the art in their research domains.

Sung HY, Chi YL. A knowledge-based system to find over-the-counter medicines for self-medication. J Biomed Inform 2020;108:103504. [PMID: 32673790 DOI: 10.1016/j.jbi.2020.103504] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2020] [Revised: 07/05/2020] [Accepted: 07/06/2020] [Indexed: 11/30/2022]

Pellison FC, Rijo RPCL, Lima VC, Crepaldi NY, Bernardi FA, Galliez RM, Kritski A, Abhishek K, Alves D. Data Integration in the Brazilian Public Health System for Tuberculosis: Use of the Semantic Web to Establish Interoperability. JMIR Med Inform 2020;8:e17176. [PMID: 32628611 PMCID: PMC7381074 DOI: 10.2196/17176] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2019] [Revised: 02/17/2020] [Accepted: 03/22/2020] [Indexed: 11/13/2022] Open

Callahan TJ, Tripodi IJ, Pielke-Lombardo H, Hunter LE. Knowledge-Based Biomedical Data Science. Annu Rev Biomed Data Sci 2020;3:23-41. [PMID: 33954284 PMCID: PMC8095730 DOI: 10.1146/annurev-biodatasci-010820-091627] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

ABHD11, a new diacylglycerol lipase involved in weight gain regulation. PLoS One 2020;15:e0234780. [PMID: 32579589 PMCID: PMC7313976 DOI: 10.1371/journal.pone.0234780] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2020] [Accepted: 06/02/2020] [Indexed: 01/26/2023] Open

Nicholson DN, Greene CS. Constructing knowledge graphs and their biomedical applications. Comput Struct Biotechnol J 2020;18:1414-1428. [PMID: 32637040 PMCID: PMC7327409 DOI: 10.1016/j.csbj.2020.05.017] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2020] [Revised: 05/22/2020] [Accepted: 05/23/2020] [Indexed: 12/31/2022] Open

Fan T, Yan L, Ma Z. Storing and querying fuzzy RDF(S) in HBase databases. INT J INTELL SYST 2020. [DOI: 10.1002/int.22224] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Sima AC, Mendes de Farias T, Zbinden E, Anisimova M, Gil M, Stockinger H, Stockinger K, Robinson-Rechavi M, Dessimoz C. Enabling semantic queries across federated bioinformatics databases. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2020;2019:5614223. [PMID: 31697362 PMCID: PMC6836710 DOI: 10.1093/database/baz106] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/28/2019] [Revised: 08/01/2019] [Accepted: 08/02/2019] [Indexed: 11/23/2022]

Affiliation(s)

Ana Claudia Sima: ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland; Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Tarcisio Mendes de Farias: Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
Erich Zbinden: ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Maria Anisimova: ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Manuel Gil: ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Heinz Stockinger: SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
Kurt Stockinger: ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland
Marc Robinson-Rechavi: SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
Christophe Dessimoz: Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Department of Genetics, Evolution, and Environment, University College London, Gower St, London WC1E 6BT, UK; Department of Computer Science, University College London, Gower St, London WC1E 6BT, UK

Mohamed SK, Nounu A, Nováček V. Biological applications of knowledge graph embedding models. Brief Bioinform 2020;22:1679-1693. [PMID: 32065227 DOI: 10.1093/bib/bbaa012] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2019] [Revised: 01/10/2020] [Accepted: 01/21/2020] [Indexed: 01/04/2023] Open

Irshad O, Khan MUG. Integration and Querying of Heterogeneous Omics Semantic Annotations for Biomedical and Biomolecular Knowledge Discovery. Curr Bioinform 2020. [DOI: 10.2174/1574893614666190409112025] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]

Struck A, Walsh B, Buchanan A, Lee JA, Spangler R, Stuart JM, Ellrott K. Exploring Integrative Analysis Using the BioMedical Evidence Graph. JCO Clin Cancer Inform 2020;4:147-159. [PMID: 32097025 PMCID: PMC7049249 DOI: 10.1200/cci.19.00110] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/16/2020] [Indexed: 12/22/2022] Open

Abstract

PURPOSE

The analysis of cancer biology data involves extremely heterogeneous data sets, including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenetic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrated data set analysis.

METHODS

We introduce the BioMedical Evidence Graph (BMEG), a graph database and query engine for discovery and analysis of cancer biology. The BMEG is unique from other biologic data graphs in that sample-level molecular and clinical information is connected to reference knowledge bases. It combines gene expression and mutation data with drug-response experiments, pathway information databases, and literature-derived associations.

RESULTS

The construction of the BMEG has resulted in a graph containing > 41 million vertices and 57 million edges. The BMEG system provides a graph query-based application programming interface to enable analysis, with client code available for Python, Javascript, and R, and a server online at bmeg.io. Using this system, we have demonstrated several forms of cross-data set analysis to show the utility of the system.

CONCLUSION

The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug-response machine learning, patient-level knowledge-base queries, and pathway level analysis. We have compared the resulting graph to other available integrated graph systems and demonstrated the former is unique in the scale of the graph and the type of data it makes available.

Sima AC, Stockinger K, de Farias TM, Gil M. Semantic Integration and Enrichment of Heterogeneous Biological Databases. Methods Mol Biol 2020;1910:655-690. [PMID: 31278681 DOI: 10.1007/978-1-4939-9074-0_22] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/25/2023]

Liu J, Yang M, Zhang L, Zhou W. An effective biomedical data migration tool from resource description framework to JSON. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2020;2019:5538640. [PMID: 31343683 PMCID: PMC6657663 DOI: 10.1093/database/baz088] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/21/2019] [Revised: 06/06/2019] [Accepted: 06/09/2019] [Indexed: 12/28/2022]

Zong N, Wong RSN, Yu Y, Wen A, Huang M, Li N. Drug-target prediction utilizing heterogeneous bio-linked network embeddings. Brief Bioinform 2019;22:568-580. [PMID: 31885036 DOI: 10.1093/bib/bbz147] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/27/2019] [Revised: 10/11/2019] [Accepted: 10/29/2019] [Indexed: 11/12/2022] Open

Abstract

To enable modularization for network-based prediction, we conducted a review of known methods conducting the various subtasks corresponding to the creation of a drug-target prediction framework and associated benchmarking to determine the highest-performing approaches. Accordingly, our contributions are as follows: (i) from a network perspective, we benchmarked the association-mining performance of 32 distinct subnetwork permutations, arranging based on a comprehensive heterogeneous biomedical network derived from 12 repositories; (ii) from a methodological perspective, we identified the best prediction strategy based on a review of combinations of the components with off-the-shelf classification, inference methods and graph embedding methods. Our benchmarking strategy consisted of two series of experiments, totaling six distinct tasks from the two perspectives, to determine the best prediction. We demonstrated that the proposed method outperformed the existing network-based methods as well as how combinatorial networks and methodologies can influence the prediction. In addition, we conducted disease-specific prediction tasks for 20 distinct diseases and showed the reliability of the strategy in predicting 75 novel drug-target associations as shown by a validation utilizing DrugBank 5.1.0. In particular, we revealed a connection of the network topology with the biological explanations for predicting the diseases, 'Asthma' 'Hypertension', and 'Dementia'. The results of our benchmarking produced knowledge on a network-based prediction framework with the modularization of the feature selection and association prediction, which can be easily adapted and extended to other feature sources or machine learning algorithms as well as a performed baseline to comprehensively evaluate the utility of incorporating varying data sources.

Nguyen DA, Nguyen CH, Mamitsuka H. A survey on adverse drug reaction studies: data, tasks and machine learning methods. Brief Bioinform 2019;22:164-177. [PMID: 31838499 DOI: 10.1093/bib/bbz140] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open