1
Furxhi I, Willighagen E, Evelo C, Costa A, Gardini D, Ammar A. A data reusability assessment in the nanosafety domain based on the NSDRA framework followed by an exploratory quantitative structure activity relationships (QSAR) modeling targeting cellular viability. NanoImpact 2023; 31:100475. [PMID: 37423508 DOI: 10.1016/j.impact.2023.100475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 07/03/2023] [Accepted: 07/04/2023] [Indexed: 07/11/2023]
Abstract
INTRODUCTION The current effort towards digital transformation across multiple scientific domains requires data that is Findable, Accessible, Interoperable and Reusable (FAIR). In addition to FAIR data, the application of computational tools such as Quantitative Structure Activity Relationships (QSARs) requires a sufficient data volume and the ability to merge sources into homogeneous digital assets. In the nanosafety domain there is a lack of available FAIR metadata.
METHODOLOGY To address this challenge, we utilized 34 datasets from the nanosafety domain by exploiting the NanoSafety Data Reusability Assessment (NSDRA) framework, which allowed the annotation and assessment of the datasets' reusability. From the results of applying the framework, eight datasets targeting the same endpoint (i.e. numerical cellular viability) were selected, processed and merged to test several hypotheses, including universal versus nanogroup-specific QSAR models (metal oxide and nanotubes), and regression versus classification Machine Learning (ML) algorithms.
RESULTS Universal regression and classification QSARs reached an R² of 0.86 and an accuracy of 0.92, respectively, for the test set. Nanogroup-specific regression models reached an R² of 0.88 for the nanotubes test set, followed by metal oxide (0.78). Nanogroup-specific classification models reached an accuracy of 0.99 for the nanotubes test set, followed by metal oxide (0.91). Feature importance revealed different patterns depending on the dataset, with common influential features including core size, exposure conditions and toxicological assay. Even when the available experimental knowledge was merged, the models still failed to correctly predict the outputs of an unseen dataset, revealing the cumbersome conundrum of scientific reproducibility in realistic applications of QSAR for nanosafety. To harness the full potential of computational tools and ensure their long-term applications, embracing FAIR data practices is imperative in driving the development of responsible QSAR models.
CONCLUSIONS This study reveals that the digitalization of nanosafety knowledge in a reproducible manner has a long way to go towards its successful pragmatic implementation. The workflow carried out in the study shows a promising approach to increase FAIRness across all the elements of computational studies, from dataset annotation, selection and merging to FAIR model reporting. This has significant implications for future research as it provides an example of how to utilize and report different tools available in the nanosafety knowledge system, while increasing the transparency of the results. One of the main benefits of this workflow is that it promotes data sharing and reuse, which is essential for advancing scientific knowledge by making data and metadata FAIR compliant. In addition, the increased transparency and reproducibility of the results can enhance the trustworthiness of the computational findings.
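As an illustrative aside that is not taken from the paper, the two test-set metrics quoted in the abstract (R² for regression models, accuracy for classification models) can be computed as in the minimal sketch below. The viability values, the 50% cut-off used to turn viability into classes, and all function names are assumptions made up for this example; they are not data or settings from the study.

```python
from typing import List, Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true: Sequence[str], labels_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted class equals the observed class."""
    correct = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return correct / len(labels_true)


def to_class(viability_percent: float, threshold: float = 50.0) -> str:
    """Binarise viability; the 50% threshold is an assumption for illustration."""
    return "toxic" if viability_percent < threshold else "non-toxic"


# Hypothetical observed and predicted cell viability (%), not study data.
observed: List[float] = [90.0, 75.0, 40.0, 20.0]
predicted: List[float] = [85.0, 80.0, 45.0, 30.0]

r2 = r_squared(observed, predicted)
acc = accuracy([to_class(v) for v in observed],
               [to_class(v) for v in predicted])
print(f"R2 = {r2:.3f}, accuracy = {acc:.2f}")
```

Both metrics are meaningful only on data the model has not seen during training, which is why the abstract reports them for the test set and separately notes the failure on an unseen dataset.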
Affiliation(s)
- Irini Furxhi
  - Transgero Limited, Cullinagh, Newcastle West, Co. Limerick, Ireland; Dept. of Accounting and Finance, Kemmy Business School, University of Limerick, V94PH93, Ireland.
- Egon Willighagen
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, the Netherlands.
- Chris Evelo
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, the Netherlands.
- Anna Costa
  - National Research Council, Institute of Science, Technology and Sustainability for Ceramics, Faenza, Italy.
- Davide Gardini
  - National Research Council, Institute of Science, Technology and Sustainability for Ceramics, Faenza, Italy.
- Ammar Ammar
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, the Netherlands.
2
Rocca-Serra P, Gu W, Ioannidis V, Abbassi-Daloii T, Capella-Gutierrez S, Chandramouliswaran I, Splendiani A, Burdett T, Giessmann RT, Henderson D, Batista D, Emam I, Gadiya Y, Giovanni L, Willighagen E, Evelo C, Gray AJG, Gribbon P, Juty N, Welter D, Quast K, Peeters P, Plasterer T, Wood C, van der Horst E, Reilly D, van Vlijmen H, Scollen S, Lister A, Thurston M, Granell R, Sansone SA. The FAIR Cookbook - the essential resource for and by FAIR doers. Sci Data 2023; 10:292. [PMID: 37208467 PMCID: PMC10198982 DOI: 10.1038/s41597-023-02166-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/11/2022] [Accepted: 04/19/2023] [Indexed: 05/21/2023] Open
Abstract
The notion that data should be Findable, Accessible, Interoperable and Reusable, according to the FAIR Principles, has become a global norm for good data stewardship and a prerequisite for reproducibility. Nowadays, FAIR guides data policy actions and professional practices in the public and private sectors. Despite such global endorsements, however, the FAIR Principles are aspirational, remaining elusive at best, and intimidating at worst. To address the lack of practical guidance, and help with capability gaps, we developed the FAIR Cookbook, an open, online resource of hands-on recipes for "FAIR doers" in the Life Sciences. Created by researchers and data management professionals in academia, (bio)pharmaceutical companies and information service industries, the FAIR Cookbook covers the key steps in a FAIRification journey, the levels and indicators of FAIRness, the maturity model, the technologies, the tools and the standards available, as well as the skills required, and the challenges to achieve and improve data FAIRness. Part of the ELIXIR ecosystem, and recommended by funders, the FAIR Cookbook is open to contributions of new recipes.
Affiliation(s)
- Philippe Rocca-Serra
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK.
  - AstraZeneca, Data Office, Data Science & AI unit R&D, 136 Hills Rd, Cambridge, UK.
- Wei Gu
  - Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
  - Luxembourg National Data Service, 6 Avenue des Hauts-Fourneaux, Esch-sur-Alzette, Luxembourg, L-4362, Esch-sur-Alzette, Luxembourg
- Vassilios Ioannidis
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Tooba Abbassi-Daloii
  - Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
- Ishwar Chandramouliswaran
  - Office of Data Science Strategy, National Institutes of Health, 9000 Rockville Pike, Bethesda, Maryland, 20892, USA
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, CB10 1SD, UK
- Robert T Giessmann
  - Bayer AG, Business Development & Licensing & OI, Pharmaceuticals, 13342, Berlin, Germany
  - Institute for Globally Distributed Open Research and Education (IGDORE), Berlin, Germany
- David Henderson
  - Bayer AG, Business Development & Licensing & OI, Pharmaceuticals, 13342, Berlin, Germany
- Dominique Batista
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
- Ibrahim Emam
  - Data Science Institute, Imperial College London, William Penney Laboratory, South Kensington Campus, London, SW7 2AZ, UK
- Yojana Gadiya
  - Fraunhofer Institute for Translational Medicine and Pharmacology and Fraunhofer Cluster of Excellence for Immune Mediated Diseases, Schnackenburgallee 114, 22525 Hamburg, and Theodor Stern Kai 7, 60590, Frankfurt, Germany
- Lucas Giovanni
  - Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
- Egon Willighagen
  - Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
- Chris Evelo
  - Department of Bioinformatics (BiGCaT), NUTRIM, FHML, Maastricht University, Maastricht, the Netherlands
- Alasdair J G Gray
  - Department of Computer Science, Heriot-Watt University, Edinburgh, EH14 4AS, Scotland, UK
- Philip Gribbon
  - Fraunhofer Institute for Translational Medicine and Pharmacology and Fraunhofer Cluster of Excellence for Immune Mediated Diseases, Schnackenburgallee 114, 22525 Hamburg, and Theodor Stern Kai 7, 60590, Frankfurt, Germany
- Nick Juty
  - The University of Manchester, Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
- Danielle Welter
  - Luxembourg Centre for Systems Biomedicine, ELIXIR Luxembourg, University of Luxembourg, L-4367, Belval, Luxembourg
  - Luxembourg National Data Service, 6 Avenue des Hauts-Fourneaux, Esch-sur-Alzette, Luxembourg, L-4362, Esch-sur-Alzette, Luxembourg
- Karsten Quast
  - Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Straße 65, 88397, Biberach an der Riss, Germany
- Paul Peeters
  - Janssen, Turnhoutseweg 30, B-2340, Beerse, Belgium
- Tom Plasterer
  - AstraZeneca Pharmaceuticals, 36 Gatehouse Drive, Waltham, MA, 02451, USA
- Colin Wood
  - AstraZeneca, da Vinci Building, Melbourn Science Park, Cambridge Road, Royston, SG8 6HM, UK
- Eelke van der Horst
  - The Hyve BV, Arthur van Schendelstraat 650, 3511 MJ, Utrecht, The Netherlands
- Dorothy Reilly
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Basel, Switzerland
- Serena Scollen
  - ELIXIR Hub, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Allyson Lister
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
- Milo Thurston
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
- Ramon Granell
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK
- Susanna-Assunta Sansone
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX13QG, Oxford, UK.
3
van Rijn J, Afantitis A, Culha M, Dusinska M, Exner TE, Jeliazkova N, Longhin EM, Lynch I, Melagraki G, Nymark P, Papadiamantis AG, Winkler DA, Yilmaz H, Willighagen E. European Registry of Materials: global, unique identifiers for (undisclosed) nanomaterials. J Cheminform 2022; 14:57. [PMID: 36002868 PMCID: PMC9400299 DOI: 10.1186/s13321-022-00614-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Accepted: 05/21/2022] [Indexed: 11/25/2022] Open
Abstract
Management of nanomaterials and nanosafety data needs to operate under the FAIR (findability, accessibility, interoperability, and reusability) principles and this requires a unique, global identifier for each nanomaterial. Existing identifiers may not always be applicable or sufficient to definitively identify the specific nanomaterial used in a particular study, resulting in the use of textual descriptions in research project communications and reporting. To ensure that internal project documentation can later be linked to publicly released data and knowledge for the specific nanomaterials, or even to specific batches and variants of nanomaterials utilised in that project, a new identifier is proposed: the European Registry of Materials Identifier. We here describe the background to this new identifier, including FAIR interoperability as defined by FAIRSharing, identifiers.org, Bioregistry, and the CHEMINF ontology, and show how it complements other identifiers such as CAS numbers and the ongoing efforts to extend the InChI identifier to cover nanomaterials. We provide examples of its use in various H2020-funded nanosafety projects.
Affiliation(s)
- Jeaphianne van Rijn
  - Department of Bioinformatics-BiGCaT, NUTRIM, FHML, Maastricht University, Maastricht, The Netherlands.
- Mustafa Culha
  - Sabanci University Nanotechnology Research and Application Center (SUNUM), Tuzla, 34956, Istanbul, Turkey
- Maria Dusinska
  - Health Effects Laboratory, Department of Environmental Chemistry, Norwegian Institute for Air Research, 2007, Kjeller, Norway
- Eleonora Marta Longhin
  - Health Effects Laboratory, Department of Environmental Chemistry, Norwegian Institute for Air Research, 2007, Kjeller, Norway
- Iseult Lynch
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, B15 2TT, UK
- Penny Nymark
  - Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
- Anastasios G Papadiamantis
  - NovaMechanics Ltd., 1070, Nicosia, Cyprus.
  - School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, B15 2TT, UK
- David A Winkler
  - School of Biochemistry and Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Bundoora, Australia.
  - Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Australia.
  - School of Pharmacy, University of Nottingham, Nottingham, UK
- Hulya Yilmaz
  - Sabanci University Nanotechnology Research and Application Center (SUNUM), Tuzla, 34956, Istanbul, Turkey
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, NUTRIM, FHML, Maastricht University, Maastricht, The Netherlands
4
Rutz A, Sorokina M, Galgonek J, Mietchen D, Willighagen E, Gaudry A, Graham JG, Stephan R, Page R, Vondrášek J, Steinbeck C, Pauli GF, Wolfender JL, Bisson J, Allard PM. The LOTUS initiative for open knowledge management in natural products research. eLife 2022; 11:e70780. [PMID: 35616633 PMCID: PMC9135406 DOI: 10.7554/elife.70780] [Citation(s) in RCA: 62] [Impact Index Per Article: 31.0] [Received: 05/28/2021] [Accepted: 03/22/2022] [Indexed: 12/17/2022] Open
Abstract
Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges for data access, both within the discipline and for integration and interoperability between related fields. The fundamental elements of exchange are referenced structure-organism pairs that establish relationships between distinct molecular structures and the living organisms from which they were identified. Consolidating and sharing such information via an open platform has strong transformative potential for natural products research and beyond. This is the ultimate goal of the newly established LOTUS initiative, which has now completed the first steps toward the harmonization, curation, validation and open dissemination of 750,000+ referenced structure-organism pairs. LOTUS data is hosted on Wikidata and regularly mirrored on https://lotus.naturalproducts.net. Data sharing within the Wikidata framework broadens data access and interoperability, opening new possibilities for community curation and evolving publication models. Furthermore, embedding LOTUS data into the vast Wikidata knowledge graph will facilitate new biological and chemical insights. The LOTUS initiative represents an important advancement in the design and deployment of a comprehensive and collaborative natural products knowledge base.
Affiliation(s)
- Adriano Rutz
  - School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
  - Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
- Maria Sorokina
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
- Jakub Galgonek
  - Institute of Organic Chemistry and Biochemistry of the CAS, Prague, Czech Republic
- Daniel Mietchen
  - Ronin Institute, Montclair, United States
  - Leibniz Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
  - School of Data Science, University of Virginia, Charlottesville, United States
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, Maastricht University, Maastricht, Netherlands
- Arnaud Gaudry
  - School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
  - Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
- James G Graham
  - Center for Natural Product Technologies and WHO Collaborating Centre for Traditional Medicine (WHO CC/TRM), Pharmacognosy Institute; College of Pharmacy, University of Illinois at Chicago, Chicago, United States
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, United States
- Ralf Stephan
  - Ontario Institute for Cancer Research (OICR), University Ave Suite, Toronto, Canada
- Jiří Vondrášek
  - Institute of Organic Chemistry and Biochemistry of the CAS, Prague, Czech Republic
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
- Guido F Pauli
  - Center for Natural Product Technologies and WHO Collaborating Centre for Traditional Medicine (WHO CC/TRM), Pharmacognosy Institute; College of Pharmacy, University of Illinois at Chicago, Chicago, United States
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, United States
- Jean-Luc Wolfender
  - School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
  - Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
- Jonathan Bisson
  - Center for Natural Product Technologies and WHO Collaborating Centre for Traditional Medicine (WHO CC/TRM), Pharmacognosy Institute; College of Pharmacy, University of Illinois at Chicago, Chicago, United States
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, United States
- Pierre-Marie Allard
  - School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
  - Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
5
Jacobs A, Williams D, Hickey K, Patrick N, Williams AJ, Chalk S, McEwen L, Willighagen E, Walker M, Bolton E, Sinclair G, Sanford A. CAS Common Chemistry in 2021: Expanding Access to Trusted Chemical Information for the Scientific Community. J Chem Inf Model 2022; 62:2737-2743. [PMID: 35559614 PMCID: PMC9199008 DOI: 10.1021/acs.jcim.2c00268] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/29/2022]
Abstract
CAS Common Chemistry (https://commonchemistry.cas.org/) is an open web resource that provides access to reliable chemical substance information for the scientific community. Having served millions of visitors since its creation in 2009, the resource was extensively updated in 2021 with significant enhancements. The underlying dataset was expanded from 8000 to 500,000 chemical substances and includes additional associated information, such as basic properties and computer-readable chemical structure information. New use cases are supported with enhanced search capabilities and an integrated application programming interface. Reusable licensing of the content is provided through a Creative Commons Attribution-Non-Commercial (CC-BY-NC 4.0) license allowing other public resources to integrate the data into their systems. This paper provides an overview of the enhancements to data and functionality, discusses the benefits of the contribution to the chemistry community, and summarizes recent progress in leveraging this resource to strengthen other information sources.
Affiliation(s)
- Andrea Jacobs
  - CAS, 2540 Olentangy River Rd, Columbus, Ohio 43202, United States
- Dustin Williams
  - CAS, 2540 Olentangy River Rd, Columbus, Ohio 43202, United States
- Katherine Hickey
  - CAS, 2540 Olentangy River Rd, Columbus, Ohio 43202, United States
- Nathan Patrick
  - CAS, 2540 Olentangy River Rd, Columbus, Ohio 43202, United States
- Antony J Williams
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina 27711, United States
- Stuart Chalk
  - Department of Chemistry, University of North Florida, Jacksonville, Florida 32224, United States
- Leah McEwen
  - Physical Sciences Library, Cornell University, Ithaca, New York 14853, United States
- Egon Willighagen
  - Department of Bioinformatics - BiGCaT, Maastricht University, 6229 ER Maastricht, The Netherlands
- Martin Walker
  - Department of Chemistry, SUNY Potsdam, 44 Pierrepont Ave., Potsdam, New York 13676, United States
- Evan Bolton
  - Department of Health and Human Services, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, Maryland 20894, United States
- Gabriel Sinclair
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina 27711, United States
- Adam Sanford
  - CAS, 2540 Olentangy River Rd, Columbus, Ohio 43202, United States
6
Stocker M, Heger T, Schweidtmann A, Ćwiek-Kupczyńska H, Penev L, Dojchinovski M, Willighagen E, Vidal ME, Turki H, Balliet D, Tiddi I, Kuhn T, Mietchen D, Karras O, Vogt L, Hellmann S, Jeschke J, Krajewski P, Auer S. SKG4EOSC - Scholarly Knowledge Graphs for EOSC: Establishing a backbone of knowledge graphs for FAIR Scholarly Information in EOSC. RIO 2022. [DOI: 10.3897/rio.8.e83789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
In the age of advanced information systems powering fast-paced knowledge economies that face global societal challenges, it is no longer adequate to express scholarly information - an essential resource for modern economies - primarily as article narratives in document form. Despite being a well-established tradition in scholarly communication, PDF-based text publishing is hindering scientific progress as it buries scholarly information in non-machine-readable formats. The key objective of SKG4EOSC is to improve science productivity through development and implementation of services for text and data conversion, and production, curation, and re-use of FAIR scholarly information. This will be achieved by (1) establishing the Open Research Knowledge Graph (ORKG, orkg.org), a service operated by the SKG4EOSC coordinator, as a Hub for access to FAIR scholarly information in the EOSC; (2) lifting numerous and heterogeneous domain-specific research infrastructures to EOSC through the ORKG Hub’s harmonized access facilities; and (3) leveraging the Hub to support cross-disciplinary research and policy decisions addressing societal challenges. SKG4EOSC will pilot the devised approaches and technologies in four research domains: biodiversity crisis, precision oncology, circular processes, and human cooperation. With the aim to improve machine-based scholarly information use, SKG4EOSC addresses an important current and future need of researchers. It extends the application of the FAIR data principles to scholarly communication practices, and hence provides more comprehensive coverage of the entire research lifecycle. Through explicit, machine-actionable provenance links between FAIR scholarly information, primary data and contextual entities, it will substantially contribute to reproducibility, validation and trust in science. The resulting advanced machine support will catalyse new discoveries in basic research and solutions in key application areas.
7
Willighagen E, Kutmon M, Martens M, Slenter D. BridgeDb and Wikidata: a powerful combination generating interoperable open research (BridgeDb). RIO 2022. [DOI: 10.3897/rio.8.e83031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Just as humans have a unique social security number but different phone numbers from various providers, proteins and metabolites have a unique structure but different identifiers in various databases. BridgeDb is an interoperability platform that allows these databases to be combined by matching database-specific identifiers. These matches are called identifier mappings, and they are indispensable when combining experimental (omics) data with knowledge in reference databases. BridgeDb takes care of this interoperability between gene, protein, metabolite, and other databases, thus enabling seamless integration of many knowledge bases and wet-lab results. Since databases get updated continuously, so should the Open Science BridgeDb project.
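To make the idea of an identifier mapping concrete, here is a toy sketch. It does not use BridgeDb's actual API or data: the database names, identifiers and function names are all invented for illustration, and the transitive lookup is only one plausible way such mappings might be combined.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# An identifier is a (database, local id) pair; every value here is invented.
Xref = Tuple[str, str]

MAPPINGS: List[Tuple[Xref, Xref]] = [
    (("GeneDB_A", "A0001"), ("GeneDB_B", "B-17")),
    (("GeneDB_B", "B-17"), ("ProteinDB_C", "C_900")),
    (("GeneDB_A", "A0002"), ("GeneDB_B", "B-42")),
]


def build_index(mappings: Iterable[Tuple[Xref, Xref]]) -> Dict[Xref, Set[Xref]]:
    """Store each mapping in both directions so lookups are symmetric."""
    index: Dict[Xref, Set[Xref]] = defaultdict(set)
    for left, right in mappings:
        index[left].add(right)
        index[right].add(left)
    return index


def map_identifier(index: Dict[Xref, Set[Xref]], source: Xref,
                   target_db: str) -> List[str]:
    """Follow mappings transitively and return the ids found in target_db."""
    seen: Set[Xref] = {source}
    stack: List[Xref] = [source]
    while stack:
        current = stack.pop()
        for neighbour in index.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return sorted(local_id for db, local_id in seen if db == target_db)


index = build_index(MAPPINGS)
print(map_identifier(index, ("GeneDB_A", "A0001"), "ProteinDB_C"))
```

In this toy data the first identifier reaches the protein database through an intermediate mapping, while the second has no route to it and yields an empty list.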
8
Ammar A, Cavill R, Evelo C, Willighagen E. PSnpBind: a database of mutated binding site protein-ligand complexes constructed using a multithreaded virtual screening workflow. J Cheminform 2022; 14:8. [PMID: 35227289 PMCID: PMC8886843 DOI: 10.1186/s13321-021-00573-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2021] [Accepted: 11/18/2021] [Indexed: 11/15/2022] Open
Abstract
A key concept in drug design is how natural variants, especially the ones occurring in the binding site of drug targets, affect the inter-individual drug response and efficacy by altering binding affinity. These effects have been studied on very limited and small datasets while, ideally, a large dataset of binding affinity changes due to binding site single-nucleotide polymorphisms (SNPs) is needed for evaluation. However, to the best of our knowledge, such a dataset does not exist. Thus, a reference dataset of ligand binding affinities to proteins with all their reported binding-site variants was constructed using a molecular docking approach. Having a large database of protein–ligand complexes covering a wide range of binding pocket mutations and a broad landscape of small molecules is of great importance for several types of studies. For example, developing machine learning algorithms to predict protein–ligand affinity, or the effect of a SNP on it, requires an extensive amount of data. In this work, we present PSnpBind: a large database of 0.6 million mutated binding site protein–ligand complexes constructed using a multithreaded virtual screening workflow. It provides a web interface to explore and visualize the protein–ligand complexes and a REST API to programmatically access the different aspects of the database contents. PSnpBind is open source and freely available at https://psnpbind.org.
Affiliation(s)
- Ammar Ammar
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands.
- Rachel Cavill
  - Department of Data Science and Knowledge Engineering, Maastricht University, Maastricht, The Netherlands
- Chris Evelo
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
9
Meldal BHM, Perfetto L, Combe C, Lubiana T, Ferreira Cavalcante JV, Bye-A-Jee H, Waagmeester A, del-Toro N, Shrivastava A, Barrera E, Wong E, Mlecnik B, Bindea G, Panneerselvam K, Willighagen E, Rappsilber J, Porras P, Hermjakob H, Orchard S. Complex Portal 2022: new curation frontiers. Nucleic Acids Res 2022; 50:D578-D586. [PMID: 34718729 PMCID: PMC8689886 DOI: 10.1093/nar/gkab991] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 09/15/2021] [Revised: 10/07/2021] [Accepted: 10/10/2021] [Indexed: 01/02/2023] Open
Abstract
The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database of macromolecular complexes with known function from a range of model organisms. It summarizes complex composition, topology and function along with links to a large range of domain-specific resources (e.g. wwPDB, EMDB and Reactome). Since the last update in 2019, we have produced a first draft complexome for Escherichia coli, maintained and updated that of Saccharomyces cerevisiae, added over 40 coronavirus complexes and increased the human complexome to over 1100 complexes, including approximately 200 complexes that act as targets for viral proteins or are part of the immune system. The display of protein features in ComplexViewer has been improved and the participant table is now colour-coordinated with the nodes in ComplexViewer. Community collaboration has expanded, for example by contributing to an analysis of putative transcription cofactors and by providing data accessible to semantic web tools through Wikidata, which is now populated with manually curated Complex Portal content through a new bot. Our data license is now CC0 to encourage data reuse. Users are encouraged to get in touch, provide us with feedback and send curation requests through the 'Support' link.
Affiliation(s)
- Birgit H M Meldal
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Livia Perfetto
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Fondazione Human Technopole, 20157 Milan, Italy
- Colin Combe
  - Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3BF, UK
- Tiago Lubiana
  - Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, Av. Professor Lineu Prestes 580, CEP 05508-000 São Paulo SP, Brasil
- João Vitor Ferreira Cavalcante
  - Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, Federal University of Rio Grande do Norte, Av. Odilon Gomes de Lima 1722, Capim Macio, 59078-400 Natal/RN, Brasil
- Hema Bye-A-Jee
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Noemi del-Toro
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anjali Shrivastava
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Elisabeth Barrera
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Edith Wong
  - Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Bernhard Mlecnik
  - Laboratory of Integrative Cancer Immunology, INSERM, 75006 Paris, France
  - Equipe Labellisée Ligue Contre le Cancer, 75006 Paris, France
  - Centre de Recherche des Cordeliers, Sorbonne Université, Université de Paris, 75006 Paris, France
  - Inovarion, 75005 Paris, France
- Gabriela Bindea
  - Laboratory of Integrative Cancer Immunology, INSERM, 75006 Paris, France
  - Equipe Labellisée Ligue Contre le Cancer, 75006 Paris, France
  - Centre de Recherche des Cordeliers, Sorbonne Université, Université de Paris, 75006 Paris, France
- Kalpana Panneerselvam
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Egon Willighagen
  - Dept of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Universiteitssingel 50, 6229 ER Maastricht, The Netherlands
- Juri Rappsilber
  - Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3BF, UK
  - Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, 13355 Berlin, Germany
- Pablo Porras
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
10
|
Guha R, Jeliazkova N, Willighagen E, Zdrazil B. Reply to "FAIR chemical structure in the Journal of Cheminformatics". J Cheminform 2021; 13:49. [PMID: 34229726 PMCID: PMC8261925 DOI: 10.1186/s13321-021-00521-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/25/2021] [Accepted: 05/25/2021] [Indexed: 12/04/2022] Open
Affiliation(s)
| | | | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht University, Maastricht, The Netherlands.
| | - Barbara Zdrazil
- Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria
| |
|
11
|
Kyle JE, Aimo L, Bridge AJ, Clair G, Fedorova M, Helms JB, Molenaar MR, Ni Z, Orešič M, Slenter D, Willighagen E, Webb-Robertson BJM. Interpreting the lipidome: bioinformatic approaches to embrace the complexity. Metabolomics 2021; 17:55. [PMID: 34091802 DOI: 10.1007/s11306-021-01802-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/10/2021] [Accepted: 05/18/2021] [Indexed: 12/13/2022]
Abstract
BACKGROUND Improvements in mass spectrometry (MS) technologies coupled with bioinformatics developments have allowed considerable advancement in the measurement and interpretation of lipidomics data in recent years. Since research areas employing lipidomics are rapidly increasing, there is a great need for bioinformatic tools that capture and utilize the complexity of the data. Currently, the diversity and complexity within the lipidome is often concealed by summing over or averaging individual lipids up to (sub)class-based descriptors, losing valuable information about biological function and interactions with other distinct lipid molecules, proteins and/or metabolites. AIM OF REVIEW To address this gap in knowledge, novel bioinformatics methods are needed to improve identification, quantification, integration and interpretation of lipidomics data. The purpose of this mini-review is to summarize exemplary methods to explore the complexity of the lipidome. KEY SCIENTIFIC CONCEPTS OF REVIEW Here we describe six approaches that capture three core focus areas for lipidomics: (1) lipidome annotation including a resolvable database identifier, (2) interpretation via pathway- and enrichment-based methods, and (3) understanding complex interactions to emphasize specific steps in the analytical process and highlight challenges in analyses associated with the complexity of lipidome data.
Affiliation(s)
- Jennifer E Kyle
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | - Lucila Aimo
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, 1 rue Michel-Servet, 1211, Geneva 4, Switzerland
| | - Alan J Bridge
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, 1 rue Michel-Servet, 1211, Geneva 4, Switzerland
| | - Geremy Clair
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | - Maria Fedorova
- Institute of Bioanalytical Chemistry, Faculty of Chemistry and Mineralogy, Center for Biotechnology and Biomedicine, Universität Leipzig, Deutscher Platz 5, Leipzig, Germany
| | - J Bernd Helms
- Department of Biomolecular Health Sciences, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
| | - Martijn R Molenaar
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Zhixu Ni
- Institute of Bioanalytical Chemistry, Faculty of Chemistry and Mineralogy, Center for Biotechnology and Biomedicine, Universität Leipzig, Deutscher Platz 5, Leipzig, Germany
| | - Matej Orešič
- School of Medical Sciences, Örebro University, 702 81, Örebro, Sweden
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520, Turku, Finland
| | - Denise Slenter
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER, Maastricht, The Netherlands
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER, Maastricht, The Netherlands
| | | |
|
12
|
Affiliation(s)
- Rajarshi Guha
- Vertex Pharmaceuticals, 50 Northern Ave, Boston, MA, 02210, USA.
| | - Egon Willighagen
- Maastricht University, Universiteitssingel 50, 6229 ER, Maastricht, Netherlands
| | - Barbara Zdrazil
- University of Vienna, Althanstraße 14, 1090, Vienna, Austria
| | | |
|
13
|
Lynch I, Afantitis A, Exner T, Himly M, Lobaskin V, Doganis P, Maier D, Sanabria N, Papadiamantis AG, Rybinska-Fryca A, Gromelski M, Puzyn T, Willighagen E, Johnston BD, Gulumian M, Matzke M, Green Etxabe A, Bossa N, Serra A, Liampa I, Harper S, Tämm K, Jensen ACØ, Kohonen P, Slater L, Tsoumanis A, Greco D, Winkler DA, Sarimveis H, Melagraki G. Can an InChI for Nano Address the Need for a Simplified Representation of Complex Nanomaterials across Experimental and Nanoinformatics Studies? Nanomaterials (Basel) 2020; 10:E2493. [PMID: 33322568 PMCID: PMC7764592 DOI: 10.3390/nano10122493] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/01/2020] [Revised: 12/05/2020] [Accepted: 12/08/2020] [Indexed: 12/16/2022]
Abstract
Chemoinformatics has developed efficient ways of representing chemical structures for small molecules as simple text strings, simplified molecular-input line-entry system (SMILES) and the IUPAC International Chemical Identifier (InChI), which are machine-readable. In particular, InChIs have been extended to encode formalized representations of mixtures and reactions, and work is ongoing to represent polymers and other macromolecules in this way. The next frontier is encoding the multi-component structures of nanomaterials (NMs) in a machine-readable format to enable linking of datasets for nanoinformatics and regulatory applications. A workshop organized by the H2020 research infrastructure NanoCommons and the nanoinformatics project NanoSolveIT analyzed issues involved in developing an InChI for NMs (NInChI). The layers needed to capture NM structures include but are not limited to: core composition (possibly multi-layered); surface topography; surface coatings or functionalization; doping with other chemicals; and representation of impurities. NM distributions (size, shape, composition, surface properties, etc.), types of chemical linkages connecting surface functionalization and coating molecules to the core, and various crystallographic forms exhibited by NMs also need to be considered. Six case studies were conducted to elucidate requirements for unambiguous description of NMs. The suggested NInChI layers are intended to stimulate further analysis that will lead to the first version of a "nano" extension to the InChI standard.
Affiliation(s)
- Iseult Lynch
- School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK;
| | - Antreas Afantitis
- Nanoinformatics Department, NovaMechanics Ltd., 1666 Nicosia, Cyprus; (A.A.); (A.T.)
| | - Thomas Exner
- Edelweiss Connect GmbH, Hochbergerstrasse 60C, 4057 Basel, Switzerland;
| | - Martin Himly
- Department Biosciences, Paris Lodron University of Salzburg, Hellbrunnerstrasse 34, 5020 Salzburg, Austria;
| | - Vladimir Lobaskin
- School of Physics, University College Dublin, Belfield, Dublin 4, Ireland;
| | - Philip Doganis
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece; (P.D.); (I.L.); (H.S.)
| | - Dieter Maier
- Biomax Informatics AG, Robert-Koch-Str. 2, 82152 Planegg, Germany;
| | - Natasha Sanabria
- National Health Laboratory Services, 1 Modderfontein Rd, Sandringham, Johannesburg 2192, South Africa; (N.S.); (M.G.)
| | - Anastasios G. Papadiamantis
- School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK;
- Nanoinformatics Department, NovaMechanics Ltd., 1666 Nicosia, Cyprus; (A.A.); (A.T.)
| | - Anna Rybinska-Fryca
- QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland; (A.R.-F.); (M.G.); (T.P.)
| | - Maciej Gromelski
- QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland; (A.R.-F.); (M.G.); (T.P.)
| | - Tomasz Puzyn
- QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland; (A.R.-F.); (M.G.); (T.P.)
| | - Egon Willighagen
- Department of Bioinformatics—BiGCaT, School of Nutrition and Translational Research in Metabolism, Maastricht University, Universiteitssingel 50, 6229 ER Maastricht, The Netherlands;
| | - Blair D. Johnston
- Department Chemicals and Product Safety, Federal Institute for Risk Assessment, Max-Dohrn-Str. 8-10, 10589 Berlin, Germany;
| | - Mary Gulumian
- National Health Laboratory Services, 1 Modderfontein Rd, Sandringham, Johannesburg 2192, South Africa; (N.S.); (M.G.)
- Haematology and Molecular Medicine, University of the Witwatersrand, 1 Jan Smuts Ave, Johannesburg 2000, South Africa
| | - Marianne Matzke
- UK Centre for Ecology and Hydrology, Maclean Building, Benson Lane, Crowmarsh Gifford OX10 8BB, UK; (M.M.); (A.G.E.)
| | - Amaia Green Etxabe
- UK Centre for Ecology and Hydrology, Maclean Building, Benson Lane, Crowmarsh Gifford OX10 8BB, UK; (M.M.); (A.G.E.)
| | - Nathan Bossa
- LEITAT Technological Center, Circular Economy Business Unit, C/de La Innovació 2, 08225 Terrassa, Barcelona, Spain;
| | - Angela Serra
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.S.); (D.G.)
| | - Irene Liampa
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece; (P.D.); (I.L.); (H.S.)
| | - Stacey Harper
- School of Chemical, Biological, and Environmental Engineering, Oregon State University, 116 Johnson Hall 105 SW 26th St., Corvallis, OR 97331, USA;
| | - Kaido Tämm
- Institute of Chemistry, University of Tartu, Ülikooli 18, 50090 Tartu, Estonia;
| | - Alexander CØ Jensen
- The National Research Center for the Work Environment, Lersø Parkallé 105, 2100 Copenhagen, Denmark;
| | - Pekka Kohonen
- Misvik Biology OY, Karjakatu 35 B, 20520 Turku, Finland;
| | - Luke Slater
- Institute of Cancer and Genomics, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK;
| | - Andreas Tsoumanis
- Nanoinformatics Department, NovaMechanics Ltd., 1666 Nicosia, Cyprus; (A.A.); (A.T.)
| | - Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland; (A.S.); (D.G.)
| | - David A. Winkler
- Institute of Molecular Sciences, La Trobe University, Kingsbury Drive, Bundoora 3086, Australia;
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville 3052, Australia
- School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
- CSIRO Data61, Pullenvale 4069, Australia
| | - Haralambos Sarimveis
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece; (P.D.); (I.L.); (H.S.)
| | - Georgia Melagraki
- Nanoinformatics Department, NovaMechanics Ltd., 1666 Nicosia, Cyprus; (A.A.); (A.T.)
| |
|
14
|
Isigonis P, Afantitis A, Antunes D, Bartonova A, Beitollahi A, Bohmer N, Bouman E, Chaudhry Q, Cimpan MR, Cimpan E, Doak S, Dupin D, Fedrigo D, Fessard V, Gromelski M, Gutleb AC, Halappanavar S, Hoet P, Jeliazkova N, Jomini S, Lindner S, Linkov I, Longhin EM, Lynch I, Malsch I, Marcomini A, Mariussen E, de la Fuente JM, Melagraki G, Murphy F, Neaves M, Packroff R, Pfuhler S, Puzyn T, Rahman Q, Pran ER, Semenzin E, Serchi T, Steinbach C, Trump B, Vrček IV, Warheit D, Wiesner MR, Willighagen E, Dusinska M. Risk Governance of Emerging Technologies Demonstrated in Terms of its Applicability to Nanomaterials. Small 2020; 16:e2003303. [PMID: 32700469 DOI: 10.1002/smll.202003303] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/28/2020] [Indexed: 06/11/2023]
Abstract
Nanotechnologies have reached maturity and market penetration that require nano-specific changes in legislation and harmonization among legislation domains, such as the amendments to REACH for nanomaterials (NMs) which came into force in 2020. Thus, an assessment of the components and regulatory boundaries of NMs risk governance is timely, alongside related methods and tools, as part of the global efforts to optimise nanosafety and integrate it into product design processes, via Safe(r)-by-Design (SbD) concepts. This paper provides an overview of the state-of-the-art regarding risk governance of NMs and lays out the theoretical basis for the development and implementation of an effective, trustworthy and transparent risk governance framework for NMs. The proposed framework enables continuous integration of the evolving state of the science, leverages best practice from contiguous disciplines and facilitates responsive re-thinking of nanosafety governance to meet future needs. To achieve and operationalise such a framework, a science-based Risk Governance Council (RGC) for NMs is being developed. The framework will provide a toolkit for independent NMs' risk governance and integrate the needs and views of stakeholders. An extension of this framework to relevant advanced materials and emerging technologies is also envisaged, in view of future foundations of risk research in Europe and globally.
Affiliation(s)
- Panagiotis Isigonis
- Department of Environmental Sciences, Informatics and Statistics, University Ca' Foscari of Venice, Via Torino 155, Mestre, Venice, 30172, Italy
| | | | | | - Alena Bartonova
- NILU, Norwegian Institute for Air Research, Kjeller, 2007, Norway
| | - Ali Beitollahi
- INIC, Iran Nanotechnology Initiative Council, Tehran, Iran
| | - Nils Bohmer
- Society for Chemical Engineering and Biotechnology (DECHEMA), Theodor-Heuss-Allee 25, Frankfurt am Main, 60486, Germany
| | - Evert Bouman
- NILU, Norwegian Institute for Air Research, Kjeller, 2007, Norway
| | - Qasim Chaudhry
- University of Chester, Parkgate Road, Chester, CH1 4BJ, UK
| | - Mihaela Roxana Cimpan
- Department of Clinical Dentistry, Biomaterials, Faculty of Medicine, University of Bergen, Aarstadveien 19, Bergen, 5009, Norway
| | - Emil Cimpan
- Western Norway University of Applied Sciences, Inndalsveien 28, Bergen, 5063, Norway
| | - Shareen Doak
- Swansea University Medical School, Singleton Park, Swansea, Wales, SA2 8PP, UK
| | - Damien Dupin
- CIDETEC, Paseo Miramón 196, Donostia-San Sebastián, 20014, Spain
| | - Doreen Fedrigo
- ECOS - European Environmental Citizens Organization for Standardization, Rue d'Edimbourg, 26, Brussels, 1050, Belgium
| | - Valérie Fessard
- ANSES Fougères Laboratory, Contaminant Toxicology Unit and Risk Management Support, Unit of Chemicals Assessment, Risk Assessment Department, 14 rue Pierre et Marie Curie, Maisons-Alfort, Cedex 94701, France
| | - Maciej Gromelski
- QSAR Lab Sp. z o.o., al. Grunwaldzka 190/102, Gdańsk, 80-266, Poland
| | - Arno C Gutleb
- LIST, Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
| | - Sabina Halappanavar
- Environmental Health Science and Research Bureau, Health Canada, Ottawa, Ontario, Canada
| | - Peter Hoet
- KU Leuven, Department of Public Health and Primary Care, Unit of Environment and Health, Leuven, 3000, Belgium
| | - Nina Jeliazkova
- IDEA Ideaconsult Limited Liability Company, Angel Kanchev 4, Sofia, 1000, Bulgaria
| | - Stéphane Jomini
- ANSES Fougères Laboratory, Contaminant Toxicology Unit and Risk Management Support, Unit of Chemicals Assessment, Risk Assessment Department, 14 rue Pierre et Marie Curie, Maisons-Alfort, Cedex 94701, France
| | - Sabine Lindner
- Plastics Europe Deutschland e. V., Mainzer Landstrasse 55, Frankfurt am Main, 60329, Germany
| | - Igor Linkov
- Factor Social Lda., Lisbon, Portugal
- US Army Engineer Research and Development Center and Carnegie Mellon University, Lisbon, Portugal
| | | | - Iseult Lynch
- School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| | - Ineke Malsch
- Malsch TechnoValuation, PO Box 455, Utrecht, AL, 3500, The Netherlands
| | - Antonio Marcomini
- Department of Environmental Sciences, Informatics and Statistics, University Ca' Foscari of Venice, Via Torino 155, Mestre, Venice, 30172, Italy
| | - Espen Mariussen
- NILU, Norwegian Institute for Air Research, Kjeller, 2007, Norway
| | - Jesus M de la Fuente
- Instituto de Ciencia de Materiales de Aragón (ICMA), Consejo Superior de Investigaciones Científicas (CSIC)-Universidad de Zaragoza, C/Pedro Cerbuna 12, Zaragoza, 50009, Spain
| | | | | | - Michael Neaves
- ECOS - European Environmental Citizens Organization for Standardization, Rue d'Edimbourg, 26, Brussels, 1050, Belgium
| | - Rolf Packroff
- Division of 'Hazardous chemicals and biological agents', BAuA - Federal Institute for Occupational Safety and Health, Dortmund, Germany
| | - Stefan Pfuhler
- Procter & Gamble Co., Miami Valley Innovation Center, 11810 East Miami River Road, Cincinnati, OH, 45239 8707, USA
| | - Tomasz Puzyn
- QSAR Lab Sp. z o.o., al. Grunwaldzka 190/102, Gdańsk, 80-266, Poland
- University of Gdansk, Faculty of Chemistry, Group of Environmental Chemometrics, Wita Stwosza 63, Gdańsk, 80-308, Poland
| | | | | | - Elena Semenzin
- Department of Environmental Sciences, Informatics and Statistics, University Ca' Foscari of Venice, Via Torino 155, Mestre, Venice, 30172, Italy
| | - Tommaso Serchi
- LIST, Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
| | - Christoph Steinbach
- Society for Chemical Engineering and Biotechnology (DECHEMA), Theodor-Heuss-Allee 25, Frankfurt am Main, 60486, Germany
| | - Benjamin Trump
- Factor Social Lda., Lisbon, Portugal
- US Army Engineer Research and Development Center and University of Michigan, Lisbon, Portugal
| | - Ivana Vinković Vrček
- Institute for Medical Research and Occupational Health, Analytical Toxicology and Mineral Metabolism Unit, Ksaverska cesta 2, Zagreb, 10 000, Croatia
| | | | - Mark R Wiesner
- Department of Civil and Environmental Engineering, Center for the Environmental Implications of NanoTechnology (CEINT) Duke University, 121 Hudson Hall, Durham, NC, 27708-0287, USA
| | - Egon Willighagen
- Department of Bioinformatics, BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER, The Netherlands
| | - Maria Dusinska
- NILU, Norwegian Institute for Air Research, Kjeller, 2007, Norway
| |
|
15
|
Affiliation(s)
- Egon Willighagen
- Dept of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Universiteitssingel 50, 6229 ER, Maastricht, The Netherlands.
| |
|
16
|
Waagmeester A, Stupp G, Burgstaller-Muehlbacher S, Good BM, Griffith M, Griffith OL, Hanspers K, Hermjakob H, Hudson TS, Hybiske K, Keating SM, Manske M, Mayers M, Mietchen D, Mitraka E, Pico AR, Putman T, Riutta A, Queralt-Rosinach N, Schriml LM, Shafee T, Slenter D, Stephan R, Thornton K, Tsueng G, Tu R, Ul-Hasan S, Willighagen E, Wu C, Su AI. Wikidata as a knowledge graph for the life sciences. eLife 2020; 9:e52614. [PMID: 32180547 PMCID: PMC7077981 DOI: 10.7554/elife.52614] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 10/09/2019] [Accepted: 02/28/2020] [Indexed: 12/22/2022] Open
Abstract
Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.
Affiliation(s)
| | - Gregory Stupp
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Sebastian Burgstaller-Muehlbacher
- Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna, Austria
| | - Benjamin M Good
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Malachi Griffith
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, United States
| | - Obi L Griffith
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, United States
| | - Kristina Hanspers
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
| | | | - Toby S Hudson
- School of Chemistry, The University of Sydney, Sydney, Australia
| | - Kevin Hybiske
- Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, United States
| | - Sarah M Keating
- European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
| | - Magnus Manske
- Wellcome Trust Sanger Institute, Cambridge, United Kingdom
| | - Michael Mayers
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Daniel Mietchen
- School of Data Science, University of Virginia, Charlottesville, United States
| | - Elvira Mitraka
- University of Maryland School of Medicine, Baltimore, United States
| | - Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
| | - Timothy Putman
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Anders Riutta
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, United States
| | - Nuria Queralt-Rosinach
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Lynn M Schriml
- University of Maryland School of Medicine, Baltimore, United States
| | - Thomas Shafee
- Department of Animal Plant and Soil Sciences, La Trobe University, Melbourne, Australia
| | - Denise Slenter
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
| | | | | | - Ginger Tsueng
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Roger Tu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Sabah Ul-Hasan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
| | - Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| | - Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, United States
| |
|
17
|
Afantitis A, Melagraki G, Isigonis P, Tsoumanis A, Varsou DD, Valsami-Jones E, Papadiamantis A, Ellis LJA, Sarimveis H, Doganis P, Karatzas P, Tsiros P, Liampa I, Lobaskin V, Greco D, Serra A, Kinaret PAS, Saarimäki LA, Grafström R, Kohonen P, Nymark P, Willighagen E, Puzyn T, Rybinska-Fryca A, Lyubartsev A, Alstrup Jensen K, Brandenburg JG, Lofts S, Svendsen C, Harrison S, Maier D, Tamm K, Jänes J, Sikk L, Dusinska M, Longhin E, Rundén-Pran E, Mariussen E, El Yamani N, Unger W, Radnik J, Tropsha A, Cohen Y, Leszczynski J, Ogilvie Hendren C, Wiesner M, Winkler D, Suzuki N, Yoon TH, Choi JS, Sanabria N, Gulumian M, Lynch I. NanoSolveIT Project: Driving nanoinformatics research to develop innovative and integrated tools for in silico nanosafety assessment. Comput Struct Biotechnol J 2020; 18:583-602. [PMID: 32226594 PMCID: PMC7090366 DOI: 10.1016/j.csbj.2020.02.023] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 11/12/2019] [Revised: 02/18/2020] [Accepted: 02/29/2020] [Indexed: 01/26/2023] Open
Abstract
Nanotechnology has enabled the discovery of a multitude of novel materials exhibiting unique physicochemical (PChem) properties compared to their bulk analogues. These properties have led to a rapidly increasing range of commercial applications; this, however, may come at a cost, if an association to long-term health and environmental risks is discovered or even just perceived. Many nanomaterials (NMs) have not yet had their potential adverse biological effects fully assessed, due to costs and time constraints associated with the experimental assessment, frequently involving animals. Here, the available NM libraries are analyzed for their suitability for integration with novel nanoinformatics approaches and for the development of NM specific Integrated Approaches to Testing and Assessment (IATA) for human and environmental risk assessment, all within the NanoSolveIT cloud-platform. These established and well-characterized NM libraries (e.g. NanoMILE, NanoSolutions, NANoREG, NanoFASE, caLIBRAte, NanoTEST and the Nanomaterial Registry (>2000 NMs)) contain physicochemical characterization data as well as data for several relevant biological endpoints, assessed in part using harmonized Organisation for Economic Co-operation and Development (OECD) methods and test guidelines. Integration of such extensive NM information sources with the latest nanoinformatics methods will allow NanoSolveIT to model the relationships between NM structure (morphology), properties and their adverse effects and to predict the effects of other NMs for which less data is available. The project specifically addresses the needs of regulatory agencies and industry to effectively and rapidly evaluate the exposure, NM hazard and risk from nanomaterials and nano-enabled products, enabling implementation of computational 'safe-by-design' approaches to facilitate NM commercialization.
Key Words
- (quantitative) Structure–activity relationships
- AI, Artificial Intelligence
- AOPs, Adverse Outcome Pathways
- API, Application Programming interface
- CG, coarse-grained (model)
- CNTs, carbon nanotubes
- Computational toxicology
- Engineered nanomaterials
- FAIR, Findable Accessible Inter-operable and Re-usable
- GUI, Graphical User Interface
- HOMO-LUMO, Highest Occupied Molecular Orbital Lowest Unoccupied Molecular Orbital
- Hazard assessment
- IATA, Integrated Approaches to Testing and Assessment
- Integrated approach for testing and assessment
- KE, key events
- MIE, molecular initiating events
- ML, machine learning
- MOA, mechanism (mode) of action
- MWCNT, multi-walled carbon nanotubes
- Machine learning
- NMs, nanomaterials
- Nanoinformatics
- OECD, Organisation for Economic Co-operation and Development
- PBPK, Physiologically Based PharmacoKinetics
- PC, Protein Corona
- PChem, Physicochemical
- PTGS, Predictive Toxicogenomics Space
- Predictive modelling
- QC, quantum-chemical
- QM, quantum-mechanical
- QSAR, quantitative structure-activity relationship
- QSPR, quantitative structure-property relationship
- RA, risk assessment
- REST, Representational State Transfer
- ROS, reactive oxygen species
- Read across
- SAR, structure-activity relationship
- SMILES, Simplified Molecular Input Line Entry System
- SOPs, standard operating procedures
- Safe-by-design
- Toxicogenomics
Affiliation(s)
| | | | | | | | | | - Eugenia Valsami-Jones
- School of Geography, Earth and Environmental Sciences, University of Birmingham, B15 2TT Birmingham, UK
| | - Anastasios Papadiamantis
- School of Geography, Earth and Environmental Sciences, University of Birmingham, B15 2TT Birmingham, UK
| | - Laura-Jayne A. Ellis
- School of Geography, Earth and Environmental Sciences, University of Birmingham, B15 2TT Birmingham, UK
| | - Haralambos Sarimveis
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
| | - Philip Doganis
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
| | - Pantelis Karatzas
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
| | - Periklis Tsiros
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
| | - Irene Liampa
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
| | - Vladimir Lobaskin
- School of Physics, University College Dublin, Belfield, Dublin 4, Ireland
| | - Dario Greco
- Faculty of Medicine and Health Technology, University of Tampere, FI-33014, Finland
| | - Angela Serra
- Faculty of Medicine and Health Technology, University of Tampere, FI-33014, Finland
| | | | | | - Roland Grafström
- Misvik Biology OY, Itäinen Pitkäkatu 4, 20520 Turku, Finland
- Karolinska Institute, Institute of Environmental Medicine, Nobels väg 13, 17177 Stockholm, Sweden
| | - Pekka Kohonen
- Misvik Biology OY, Itäinen Pitkäkatu 4, 20520 Turku, Finland
- Karolinska Institute, Institute of Environmental Medicine, Nobels väg 13, 17177 Stockholm, Sweden
| | - Penny Nymark
- Misvik Biology OY, Itäinen Pitkäkatu 4, 20520 Turku, Finland
- Karolinska Institute, Institute of Environmental Medicine, Nobels väg 13, 17177 Stockholm, Sweden
| | - Egon Willighagen
- Department of Bioinformatics – BiGCaT, School of Nutrition and Translational Research in Metabolism, Maastricht University, Universiteitssingel 50, 6229 ER Maastricht, the Netherlands
| | - Tomasz Puzyn
- QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland
- University of Gdansk, Faculty of Chemistry, Wita Stwosza 63, 80-308 Gdansk, Poland
| | | | - Alexander Lyubartsev
- Institutionen för material- och miljökemi, Stockholms Universitet, 106 91 Stockholm, Sweden
| | - Keld Alstrup Jensen
- The National Research Center for the Work Environment, Lersø Parkallé 105, 2100 Copenhagen, Denmark
| | - Jan Gerit Brandenburg
- Interdisciplinary Center for Scientific Computing, Heidelberg University, Germany
- Chief Digital Organization, Merck KGaA, Frankfurter Str. 250, 64293 Darmstadt, Germany
| | - Stephen Lofts
- UK Centre for Ecology and Hydrology, Library Ave, Bailrigg, Lancaster LA1 4AP, UK
| | - Claus Svendsen
- UK Centre for Ecology and Hydrology, MacLean Bldg, Benson Ln, Crowmarsh Gifford, Wallingford OX10 8BB, UK
| | - Samuel Harrison
- UK Centre for Ecology and Hydrology, Library Ave, Bailrigg, Lancaster LA1 4AP, UK
| | - Dieter Maier
- Biomax Informatics AG, Robert-Koch-Str. 2, 82152 Planegg, Germany
| | - Kaido Tamm
- Department of Chemistry, University of Tartu, Ülikooli 18, 50090 Tartu, Estonia
| | - Jaak Jänes
- Department of Chemistry, University of Tartu, Ülikooli 18, 50090 Tartu, Estonia
| | - Lauri Sikk
- Department of Chemistry, University of Tartu, Ülikooli 18, 50090 Tartu, Estonia
| | - Maria Dusinska
- NILU-Norwegian Institute for Air Research, Instituttveien 18, 2002 Kjeller, Norway
| | - Eleonora Longhin
- NILU-Norwegian Institute for Air Research, Instituttveien 18, 2002 Kjeller, Norway
| | - Elise Rundén-Pran
- NILU-Norwegian Institute for Air Research, Instituttveien 18, 2002 Kjeller, Norway
| | - Espen Mariussen
- NILU-Norwegian Institute for Air Research, Instituttveien 18, 2002 Kjeller, Norway
| | - Naouale El Yamani
- NILU-Norwegian Institute for Air Research, Instituttveien 18, 2002 Kjeller, Norway
| | - Wolfgang Unger
- Federal Institute for Material Testing and Research (BAM), Unter den Eichen 44-46, 12203 Berlin, Germany
| | - Jörg Radnik
- Federal Institute for Material Testing and Research (BAM), Unter den Eichen 44-46, 12203 Berlin, Germany
| | - Alexander Tropsha
- Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, 100K Beard Hall, CB# 7568, Chapel Hill, NC 27955-7568, USA
| | - Yoram Cohen
- Samueli School Of Engineering, University of California, Los Angeles, 5531 Boelter Hall, Los Angeles, CA 90095, USA
| | - Jerzy Leszczynski
- Interdisciplinary Nanotoxicity Center, Jackson State University, 1400 J. R. Lynch Street, Jackson, MS 39217, USA
| | - Christine Ogilvie Hendren
- Center for Environmental Implications of Nanotechnologies, Duke University, 121 Hudson Hall, Durham, NC 27708-0287, USA
| | - Mark Wiesner
- Center for Environmental Implications of Nanotechnologies, Duke University, 121 Hudson Hall, Durham, NC 27708-0287, USA
| | - David Winkler
- La Trobe Institute of Molecular Sciences, La Trobe University, Plenty Rd & Kingsbury Dr, Bundoora, VIC 3086, Australia
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville 3052, Australia
- CSIRO Data61, Clayton 3168, Australia
- School of Pharmacy, University of Nottingham, Nottingham, UK
| | - Noriyuki Suzuki
- National Institute for Environmental Studies, 16-2 Onogawa, Tsukuba, Ibaraki 305-0053, Japan
| | - Tae Hyun Yoon
- Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Republic of Korea
| | - Jang-Sik Choi
- Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Republic of Korea
| | - Natasha Sanabria
- National Health Laboratory Services, 1 Modderfontein Rd, Sandringham, Johannesburg 2192, South Africa
| | - Mary Gulumian
- National Health Laboratory Services, 1 Modderfontein Rd, Sandringham, Johannesburg 2192, South Africa
- Haematology and Molecular Medicine, University of the Witwatersrand, Johannesburg, South Africa
| | - Iseult Lynch
- School of Geography, Earth and Environmental Sciences, University of Birmingham, B15 2TT Birmingham, UK
| |
|
18
|
Affiliation(s)
- Rajarshi Guha
- Vertex Pharmaceuticals, 50 Northern Ave, Boston, MA, 02210, USA.
| | - Egon Willighagen
- Maastricht University, Universiteitssingel 50, 6229 ER Maastricht, The Netherlands
| |
|
19
|
Stanstrup J, Broeckling CD, Helmus R, Hoffmann N, Mathé E, Naake T, Nicolotti L, Peters K, Rainer J, Salek RM, Schulze T, Schymanski EL, Stravs MA, Thévenot EA, Treutler H, Weber RJM, Willighagen E, Witting M, Neumann S. The metaRbolomics Toolbox in Bioconductor and beyond. Metabolites 2019; 9:E200. [PMID: 31548506 PMCID: PMC6835268 DOI: 10.3390/metabo9100200] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/11/2019] [Revised: 09/16/2019] [Accepted: 09/17/2019] [Indexed: 11/17/2022] Open
Abstract
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.
Affiliation(s)
- Jan Stanstrup
- Preventive and Clinical Nutrition, University of Copenhagen, Rolighedsvej 30, 1958 Frederiksberg C, Denmark.
| | - Corey D Broeckling
- Proteomics and Metabolomics Facility, Colorado State University, Fort Collins, CO 80523, USA.
| | - Rick Helmus
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1098 XH Amsterdam, The Netherlands.
| | - Nils Hoffmann
- Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Otto-Hahn-Straße 6b, 44227 Dortmund, Germany.
| | - Ewy Mathé
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
| | - Thomas Naake
- Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
| | - Luca Nicolotti
- The Australian Wine Research Institute, Metabolomics Australia, PO Box 197, Adelaide SA 5064, Australia.
| | - Kristian Peters
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Johannes Rainer
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, 39100 Bolzano, Italy.
| | - Reza M Salek
- The International Agency for Research on Cancer, 150 cours Albert Thomas, CEDEX 08, 69372 Lyon, France.
| | - Tobias Schulze
- Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research-UFZ, Permoserstraße 15, 04318 Leipzig, Germany.
| | - Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg.
| | - Michael A Stravs
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dubendorf, Switzerland.
| | - Etienne A Thévenot
- CEA, LIST, Laboratory for Data Sciences and Decision, MetaboHUB, Gif-Sur-Yvette F-91191, France.
| | - Hendrik Treutler
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Ralf J M Weber
- Phenome Centre Birmingham and School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, The Netherlands.
| | - Michael Witting
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, 85764 Neuherberg, Germany.
- Chair of Analytical Food Chemistry, Technische Universität München, 85354 Weihenstephan, Germany.
| | - Steffen Neumann
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
| |
|
20
|
Affiliation(s)
- Egon Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, P.O. Box 616, UNS 50 Box 19, 6200 MD, Maastricht, The Netherlands.
| | | | - Rajarshi Guha
- Vertex Pharmaceuticals, 50 Northern Avenue, Boston, MA, 02210, USA
| |
|
21
|
Rasberry L, Willighagen E, Nielsen F, Mietchen D. Robustifying Scholia: paving the way for knowledge discovery and research assessment through Wikidata. RIO 2019. [DOI: 10.3897/rio.5.e35820] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022] Open
Abstract
Knowledge workers like researchers, students, journalists, research evaluators or funders need tools to explore what is known, how it was discovered, who made which contributions, and where the scholarly record has gaps. Existing tools and services of this kind are not available as Linked Open Data, but Wikidata is. It has the technology, active contributor base, and content to build a large-scale knowledge graph for scholarship, also known as WikiCite. Scholia visualizes this graph in an exploratory interface with profiles and links to the literature. However, it is just a working prototype. This project aims to "robustify Scholia" with back-end development and testing based on pilot corpora. The main objective at this stage is to attain stability in challenging cases such as server throttling and handling of large or incomplete datasets. Further goals include integrating Scholia with data curation and manuscript writing workflows, serving more languages, generating usage stats, and documentation.
|
22
|
Martens M, Willighagen E, Nymark P, Grafström R, Burgoon L, Aladjov H, Andón FT, Evelo C. Introducing WikiPathways to link molecular pathways to adverse outcome pathways to support regulatory risk assessment. Toxicol Lett 2018. [DOI: 10.1016/j.toxlet.2018.06.962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
23
|
Exner T, Dokler J, Bachler D, Farcal L, Evelo C, Willighagen E, Jennen D, Jacobs M, Doganis P, Sarimveis H, Lynch I, Gkoutos G, Kramer S, Notredame C, Spjuth O, Jennings P, Dudgeon T, Bois F, Hardy B. OpenRiskNet, an open e-infrastructure to support data sharing, knowledge integration and in silico analysis and modelling in risk assessment. Toxicol Lett 2018. [DOI: 10.1016/j.toxlet.2018.06.617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
24
|
Nymark P, Rieswijk L, Ehrhart F, Jeliazkova N, Tsiliki G, Sarimveis H, Evelo CT, Hongisto V, Kohonen P, Willighagen E, Grafström RC. A Data Fusion Pipeline for Generating and Enriching Adverse Outcome Pathway Descriptions. Toxicol Sci 2017; 162:264-275. [DOI: 10.1093/toxsci/kfx252] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Indexed: 12/29/2022] Open
Affiliation(s)
- Penny Nymark
- Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden
- Department of Toxicology, Misvik Biology, 20520 Turku, Finland
| | - Linda Rieswijk
- Department of Bioinformatics, NUTRIM, Maastricht University, 6200MD Maastricht, The Netherlands
- Division of Environmental Health Sciences, School of Public Health, University of California, 94720-7360 Berkeley, California, United States
| | - Friederike Ehrhart
- Department of Bioinformatics, NUTRIM, Maastricht University, 6200MD Maastricht, The Netherlands
| | | | - Georgia Tsiliki
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
- Institute for the Management of Information Systems, ATHENA Research and Innovation Centre, 151 25 Athens, Greece
| | - Haralambos Sarimveis
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
| | - Chris T Evelo
- Department of Bioinformatics, NUTRIM, Maastricht University, 6200MD Maastricht, The Netherlands
| | - Vesa Hongisto
- Department of Toxicology, Misvik Biology, 20520 Turku, Finland
| | - Pekka Kohonen
- Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden
- Department of Toxicology, Misvik Biology, 20520 Turku, Finland
| | - Egon Willighagen
- Department of Bioinformatics, NUTRIM, Maastricht University, 6200MD Maastricht, The Netherlands
| | - Roland C Grafström
- Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden
- Department of Toxicology, Misvik Biology, 20520 Turku, Finland
| |
|
25
|
Leist M, Ghallab A, Graepel R, Marchan R, Hassan R, Bennekou SH, Limonciel A, Vinken M, Schildknecht S, Waldmann T, Danen E, van Ravenzwaay B, Kamp H, Gardner I, Godoy P, Bois FY, Braeuning A, Reif R, Oesch F, Drasdo D, Höhme S, Schwarz M, Hartung T, Braunbeck T, Beltman J, Vrieling H, Sanz F, Forsby A, Gadaleta D, Fisher C, Kelm J, Fluri D, Ecker G, Zdrazil B, Terron A, Jennings P, van der Burg B, Dooley S, Meijer AH, Willighagen E, Martens M, Evelo C, Mombelli E, Taboureau O, Mantovani A, Hardy B, Koch B, Escher S, van Thriel C, Cadenas C, Kroese D, van de Water B, Hengstler JG. Adverse outcome pathways: opportunities, limitations and open questions. Arch Toxicol 2017; 91:3477-3505. [DOI: 10.1007/s00204-017-2045-3] [Citation(s) in RCA: 221] [Impact Index Per Article: 31.6] [Received: 07/03/2017] [Accepted: 08/21/2017] [Indexed: 12/18/2022]
|
26
|
Lampa S, Willighagen E, Kohonen P, King A, Vrandečić D, Grafström R, Spjuth O. RDFIO: extending Semantic MediaWiki for interoperable biomedical data management. J Biomed Semantics 2017; 8:35. [PMID: 28870259 PMCID: PMC5584330 DOI: 10.1186/s13326-017-0136-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/02/2017] [Accepted: 08/01/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Biological sciences are characterised not only by an increasing amount but also the extreme complexity of its data. This stresses the need for efficient ways of integrating these data in a coherent description of biological systems. In many cases, biological data needs organization before integration. This is not seldom a collaborative effort, and it is thus important that tools for data integration support a collaborative way of working. Wiki systems with support for structured semantic data authoring, such as Semantic MediaWiki, provide a powerful solution for collaborative editing of data combined with machine-readability, so that data can be handled in an automated fashion in any downstream analyses. Semantic MediaWiki lacks a built-in data import function though, which hinders efficient round-tripping of data between interoperable Semantic Web formats such as RDF and the internal wiki format. RESULTS To solve this deficiency, the RDFIO suite of tools is presented, which supports importing of RDF data into Semantic MediaWiki, with metadata needed to export it again in the same RDF format, or ontology. Additionally, the new functionality enables mash-ups of automated data imports combined with manually created data presentations. The application of the suite of tools is demonstrated by importing drug discovery related data about rare diseases from Orphanet and acid dissociation constants from Wikidata. The RDFIO suite of tools is freely available for download via pharmb.io/project/rdfio. CONCLUSIONS Through a set of biomedical demonstrators, it is demonstrated how the new functionality enables a number of usage scenarios where the interoperability of SMW and the wider Semantic Web is leveraged for biomedical data sets, to create an easy to use and flexible platform for exploring and working with biomedical data.
Affiliation(s)
- Samuel Lampa
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, SE-751 24, Sweden.
| | - Egon Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, P.O. Box 616, UNS50 Box 19, Maastricht, NL-6200, MD, The Netherlands
| | - Pekka Kohonen
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, SE-171 77, Sweden; Division of Toxicology, Misvik Biology Oy, Turku, Finland
| | | | | | - Roland Grafström
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, SE-171 77, Sweden; Division of Toxicology, Misvik Biology Oy, Turku, Finland
| | - Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, SE-751 24, Sweden
| |
|
27
|
Affiliation(s)
- Rajarshi Guha
- National Center for Advancing Translational Sciences, 9800 Medical Center Drive, Rockville, MD, 20850, USA.
| | - Egon Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, P.O. Box 616, UNS 50 Box 19, 6200 MD, Maastricht, The Netherlands.
| |
|
28
|
Grafström RC, Nymark P, Hongisto V, Spjuth O, Ceder R, Willighagen E, Hardy B, Kaski S, Kohonen P. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures. Altern Lab Anim 2016; 43:325-32. [PMID: 26551289 DOI: 10.1177/026119291504300506] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 01/07/2023]
Abstract
This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety.
Affiliation(s)
- Roland C Grafström
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Penny Nymark
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Vesa Hongisto
- Toxicology Department, Misvik Biology Corporation, Turku, Finland
| | - Ola Spjuth
- Department of Medical Epidemiology and Biostatistics, Swedish e-Science Research Centre, Karolinska Institutet, Stockholm, Sweden and Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Rebecca Ceder
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, Maastricht University, Maastricht, The Netherlands
| | - Barry Hardy
- Douglas Connect GmbH, Zeiningen, Switzerland
| | - Samuel Kaski
- Helsinki Institute for Information Technology, Aalto University, Department of Computer Science, Helsinki, Finland
| | - Pekka Kohonen
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| |
|
29
|
Mietchen D, Hagedorn G, Willighagen E, Rico M, Gómez-Pérez A, Aibar E, Rafes K, Germain C, Dunning A, Pintscher L, Kinzler D. Enabling Open Science: Wikidata for Research (Wiki4R). RIO 2015. [DOI: 10.3897/rio.1.e7573] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022] Open
|
30
|
Fu G, Batchelor C, Dumontier M, Hastings J, Willighagen E, Bolton E. PubChemRDF: towards the semantic annotation of PubChem compound and substance databases. J Cheminform 2015; 7:34. [PMID: 26175801 PMCID: PMC4500850 DOI: 10.1186/s13321-015-0084-4] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 02/02/2015] [Accepted: 06/22/2015] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND PubChem is an open repository for chemical structures, biological activities and biomedical annotations. Semantic Web technologies are emerging as an increasingly important approach to distribute and integrate scientific data. Exposing PubChem data to Semantic Web services may help enable automated data integration and management, as well as facilitate interoperable web applications. DESCRIPTION This work, one of a series covering the PubChemRDF project, describes an approach to translate PubChem Substance and Compound information into Resource Description Framework (RDF) format. Basic examples are provided to demonstrate its use. The aim of this effort is to provide two new primary benefits to researchers in a cost-effective manner. Firstly, we aim to remove the inherent limitations of using the web-based resource PubChem by allowing a researcher to use readily available semantic technologies (namely, RDF triple stores and their corresponding SPARQL query engines) to query and analyze PubChem data on local computing resources. Secondly, this work intends to help improve data sharing, analysis, and integration of PubChem data to resources external to NCBI and across scientific domains, by means of the association of PubChem data to existing ontological frameworks, including CHEMical INFormation ontology, Semanticscience Integrated Ontology, and others. CONCLUSIONS With the goal of semantically describing information available in the PubChem archive, pre-existing ontological frameworks were used, rather than creating new ones. Semantic relationships between compounds and substances, chemical descriptors associated with compounds and substances, interrelationships between chemicals, as well as provenance and attribute metadata of substances are described.
Affiliation(s)
- Gang Fu
- National Center for Biotechnology Information, National Library of Medicine, National Institute of Health, Bethesda, MD USA
| | - Colin Batchelor
- Royal Society of Chemistry, Thomas Graham House, Cambridge, UK
| | - Michel Dumontier
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, USA
| | - Janna Hastings
- European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
| | - Evan Bolton
- National Center for Biotechnology Information, National Library of Medicine, National Institute of Health, Bethesda, MD USA
| |
|
31
|
Hastings J, Jeliazkova N, Owen G, Tsiliki G, Munteanu CR, Steinbeck C, Willighagen E. eNanoMapper: harnessing ontologies to enable data integration for nanomaterial risk assessment. J Biomed Semantics 2015; 6:10. [PMID: 25815161 PMCID: PMC4374589 DOI: 10.1186/s13326-015-0005-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 11/03/2014] [Accepted: 02/27/2015] [Indexed: 11/18/2022] Open
Abstract
Engineered nanomaterials (ENMs) are being developed to meet specific application needs in diverse domains across the engineering and biomedical sciences (e.g. drug delivery). However, accompanying the exciting proliferation of novel nanomaterials is a challenging race to understand and predict their possibly detrimental effects on human health and the environment. The eNanoMapper project (www.enanomapper.net) is creating a pan-European computational infrastructure for toxicological data management for ENMs, based on semantic web standards and ontologies. Here, we describe the development of the eNanoMapper ontology based on adopting and extending existing ontologies of relevance for the nanosafety domain. The resulting eNanoMapper ontology is available at http://purl.enanomapper.net/onto/enanomapper.owl. We aim to make the re-use of external ontology content seamless and thus we have developed a library to automate the extraction of subsets of ontology content and the assembly of the subsets into an integrated whole. The library is available (open source) at http://github.com/enanomapper/slimmer/. Finally, we give a comprehensive survey of the domain content and identify gap areas. ENM safety is at the boundary between engineering and the life sciences, and at the boundary between molecular granularity and bulk granularity. This creates challenges for the definition of key entities in the domain, which we also discuss.
Affiliation(s)
- Janna Hastings
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| | | | - Gareth Owen
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| | - Georgia Tsiliki
- National Technical University of Athens (NTUA), Athens, Greece
| | - Cristian R Munteanu
- Computer Science Faculty, University of A Coruña, A Coruña, Spain; Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
| | - Christoph Steinbeck
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| | - Egon Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
| |
|
32
|
Jeliazkova N, Chomenidis C, Doganis P, Fadeel B, Grafström R, Hardy B, Hastings J, Hegi M, Jeliazkov V, Kochev N, Kohonen P, Munteanu CR, Sarimveis H, Smeets B, Sopasakis P, Tsiliki G, Vorgrimmler D, Willighagen E. The eNanoMapper database for nanomaterial safety information. Beilstein J Nanotechnol 2015; 6:1609-34. [PMID: 26425413 PMCID: PMC4578352 DOI: 10.3762/bjnano.6.165] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 03/31/2015] [Accepted: 07/03/2015] [Indexed: 05/20/2023]
Abstract
BACKGROUND The NanoSafety Cluster, a cluster of projects funded by the European Commission, identified the need for a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs). Ontologies, open standards, and interoperable designs were envisioned to empower a harmonized approach to European research in nanotechnology. This setting provides a number of opportunities and challenges in the representation of nanomaterials data and the integration of ENM information originating from diverse systems. Within this cluster, eNanoMapper works towards supporting the collaborative safety assessment for ENMs by creating a modular and extensible infrastructure for data sharing, data analysis, and building computational toxicology models for ENMs. RESULTS The eNanoMapper database solution builds on the previous experience of the consortium partners in supporting diverse data through flexible data storage, open source components and web services. We have recently described the design of the eNanoMapper prototype database along with a summary of challenges in the representation of ENM data and an extensive review of existing nano-related data models, databases, and nanomaterials-related entries in chemical and toxicogenomic databases. This paper continues with a focus on the database functionality exposed through its application programming interface (API), and its use in visualisation and modelling. Considering the preferred community practice of using spreadsheet templates, we developed a configurable spreadsheet parser facilitating user friendly data preparation and data upload. We further present a web application able to retrieve the experimental data via the API and analyze it with multiple data preprocessing and machine learning algorithms.
CONCLUSION We demonstrate how the eNanoMapper database is used to import and publish online ENM and assay data from several data sources, how the "representational state transfer" (REST) API enables building user friendly interfaces and graphical summaries of the data, and how these resources facilitate the modelling of reproducible quantitative structure-activity relationships for nanomaterials (NanoQSAR).
Affiliation(s)
- Philip Doganis
- National Technical University of Athens, School of Chemical Engineering, Athens, Greece
- Barry Hardy
- Douglas Connect GmbH, Zeiningen, Switzerland
- Janna Hastings
- European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
- Markus Hegi
- Douglas Connect GmbH, Zeiningen, Switzerland
- Nikolay Kochev
- Ideaconsult Ltd., Sofia, Bulgaria
- Department of Analytical Chemistry and Computer Chemistry, University of Plovdiv, Plovdiv, Bulgaria
- Cristian R Munteanu
- Department of Bioinformatics, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Computer Science Faculty, University of A Coruna, A Coruña, Spain
- Haralambos Sarimveis
- National Technical University of Athens, School of Chemical Engineering, Athens, Greece
- Bart Smeets
- Department of Bioinformatics, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Pantelis Sopasakis
- National Technical University of Athens, School of Chemical Engineering, Athens, Greece
- IMT Institute for Advanced Studies Lucca, Lucca, Italy
- Georgia Tsiliki
- National Technical University of Athens, School of Chemical Engineering, Athens, Greece
- Egon Willighagen
- Department of Bioinformatics, NUTRIM, Maastricht University, Maastricht, The Netherlands
|
33
|
Spjuth O, Carlsson L, Alvarsson J, Georgiev V, Willighagen E, Eklund M. Open source drug discovery with bioclipse. Curr Top Med Chem 2013; 12:1980-6. [PMID: 23110533 DOI: 10.2174/156802612804910287] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 07/06/2012] [Revised: 08/15/2012] [Accepted: 08/15/2012] [Indexed: 11/22/2022]
Abstract
We present the open source components for drug discovery that have been developed and integrated into the graphical workbench Bioclipse. Building on a solid open source cheminformatics core, Bioclipse has advanced functionality for managing and visualizing chemical structures and related information. The features presented here include QSAR/QSPR modeling, various predictive solutions such as decision support for chemical liability assessment, site-of-metabolism prediction, virtual screening, and knowledge discovery and integration. We demonstrate the utility of the described tools with examples from computational pharmacology, toxicology, and ADME. Bioclipse is used in both academia and industry, and is a good example of open source leading to new solutions for drug discovery.
Affiliation(s)
- Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, P.O. Box 591, SE-751 24 Uppsala, Sweden.
|
34
|
Kohonen P, Benfenati E, Bower D, Ceder R, Crump M, Cross K, Grafström RC, Healy L, Helma C, Jeliazkova N, Jeliazkov V, Maggioni S, Miller S, Myatt G, Rautenberg M, Stacey G, Willighagen E, Wiseman J, Hardy B. The ToxBank Data Warehouse: Supporting the Replacement of In Vivo Repeated Dose Systemic Toxicity Testing. Mol Inform 2013; 32:47-63. [PMID: 27481023 DOI: 10.1002/minf.201200114] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 10/04/2012] [Accepted: 11/27/2012] [Indexed: 12/12/2022]
Abstract
The aim of the SEURAT-1 (Safety Evaluation Ultimately Replacing Animal Testing-1) research cluster, comprised of seven EU FP7 Health projects co-financed by Cosmetics Europe, is to generate a proof-of-concept to show how the latest technologies, systems toxicology and toxicogenomics can be combined to deliver a test replacement for repeated dose systemic toxicity testing on animals. The SEURAT-1 strategy is to adopt a mode-of-action framework to describe repeated dose toxicity, combining in vitro and in silico methods to derive predictions of in vivo toxicity responses. ToxBank is the cross-cluster infrastructure project whose activities include the development of a data warehouse to provide a web-accessible shared repository of research data and protocols, a physical compounds repository, reference or "gold compounds" for use across the cluster (available via wiki.toxbank.net), and a reference resource for biomaterials. Core technologies used in the data warehouse include the ISA-Tab universal data exchange format, REpresentational State Transfer (REST) web services, the W3C Resource Description Framework (RDF) and the OpenTox standards. We describe the design of the data warehouse based on cluster requirements, the implementation based on open standards, and finally the underlying concepts and initial results of a data analysis utilizing public data related to the gold compounds.
Affiliation(s)
- Lyn Healy
- National Institute for Biological Standards and Control, Potters Bar, UK
- Silvia Maggioni
- Istituto di Ricerche Farmacologiche Mario Negri, Milan, Italy
- Glyn Stacey
- National Institute for Biological Standards and Control, Potters Bar, UK
|
35
|
Spjuth O, Alvarsson J, Willighagen E, Carlsson L, Georgiev V, Eklund M. Open Source Drug Discovery with Bioclipse. Curr Top Med Chem 2013. [DOI: 10.2174/1568026611212180005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, P.O. Box 591, SE-751 24 Uppsala, Sweden
|
36
|
Abstract
Numerical characterization of molecular structure is a first step in many computational analyses of chemical structure data. These numerical representations, termed descriptors, come in many forms, ranging from simple atom counts and invariants of the molecular graph to distributions of properties, such as charge, across a molecular surface. In this article we first present a broad categorization of descriptors and then describe applications and toolkits that can be employed to evaluate them. We highlight a number of issues surrounding molecular descriptor calculations, such as versioning and reproducibility, and describe how some toolkits have attempted to address these problems.
Affiliation(s)
- Rajarshi Guha
- NIH Center for Advancing Translational Science, 9800 Medical Center Drive Rockville, MD 20850, U.S.A
|
37
|
Spjuth O, Georgiev V, Carlsson L, Alvarsson J, Berg A, Willighagen E, Wikberg JES, Eklund M. Bioclipse-R: integrating management and visualization of life science data with statistical analysis. Bioinformatics 2012. [PMID: 23178637 PMCID: PMC3546796 DOI: 10.1093/bioinformatics/bts681] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
Summary: Bioclipse, a graphical workbench for the life sciences, provides functionality for managing and visualizing life science data. We introduce Bioclipse-R, which integrates Bioclipse and the statistical programming language R. The synergy between Bioclipse and R is demonstrated by the construction of a decision support system for anticancer drug screening and mutagenicity prediction, which shows how Bioclipse-R can be used to perform complex tasks from within a single software system. Availability and implementation: Bioclipse-R is implemented as a set of Java plug-ins for Bioclipse based on the R-package rj. Source code and binary packages are available from https://github.com/bioclipse and http://www.bioclipse.net/bioclipse-r, respectively. Contact: martin.eklund@farmbio.uu.se Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, SE-751 24 Uppsala, Sweden
|
38
|
Guha R, Willighagen E. A Survey of Quantitative Descriptions of Molecular Structure. Curr Top Med Chem 2012; 12:1946-56. [DOI: 10.2174/156802612804910278] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 07/03/2012] [Revised: 07/30/2012] [Accepted: 08/07/2012] [Indexed: 11/22/2022]
|
39
|
Beronius A, Willighagen E, Rudén C, Hanberg A. Factors influencing developmental neurotoxicity study outcome in the bisphenol A case. Toxicol Lett 2012. [DOI: 10.1016/j.toxlet.2012.03.472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
40
|
Willighagen E, Helma C, Benfenati E, Jeliazkova N, Wiseman J, Grafström R, Hardy B, Myatt G, Stacey G. Sharing dose response data and analysis across the SEURAT-1 cluster. Toxicol Lett 2012. [DOI: 10.1016/j.toxlet.2012.03.585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
41
|
Willighagen E, Karlsson H, Grafström R, Fadeel B. Developing a novel QSAR framework for nanomaterial toxicity. Toxicol Lett 2012. [DOI: 10.1016/j.toxlet.2012.03.245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
42
|
Grafström RC, Ceder R, Fadeel B, Roberg K, Willighagen E. Bioinformatics-based cancer research have wide toxicological applicability. Toxicol Lett 2012. [DOI: 10.1016/j.toxlet.2012.03.580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
43
|
Neylon C, Aerts J, Brown CT, Coles SJ, Hatton L, Lemire D, Millman KJ, Murray-Rust P, Perez F, Saunders N, Shah N, Smith A, Varoquaux G, Willighagen E. Changing computational research. The challenges ahead. Source Code Biol Med 2012; 7:2. [PMID: 22640749 PMCID: PMC3441321 DOI: 10.1186/1751-0473-7-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 05/02/2012] [Accepted: 05/28/2012] [Indexed: 11/10/2022]
Affiliation(s)
- Cameron Neylon
- Science and Technology Facilities Council, Didcot, Harwell Oxford, UK.
|
44
|
Hastings J, Chepelev L, Willighagen E, Adams N, Steinbeck C, Dumontier M. The chemical information ontology: provenance and disambiguation for chemical data on the biological semantic web. PLoS One 2011; 6:e25513. [PMID: 21991315 PMCID: PMC3184996 DOI: 10.1371/journal.pone.0025513] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Received: 01/18/2011] [Accepted: 09/07/2011] [Indexed: 11/19/2022] Open
Abstract
Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect in the application of cheminformatics in these fields is the accurate exchange of data, which is increasingly accomplished through the use of ontologies. Ontologies are formal representations of objects and their properties using a logic-based ontology language. Many such ontologies are currently being developed to represent objects across all the domains of science. Ontologies enable the definition, classification, and support for querying objects in a particular domain, enabling intelligent computer applications to be built which support the work of scientists both within the domain of interest and across interrelated neighbouring domains. Modern chemical research relies on computational techniques to filter and organise data to maximise research productivity. The objects which are manipulated in these algorithms and procedures, as well as the algorithms and procedures themselves, enjoy a kind of virtual life within computers. We will call these information entities. Here, we describe our work in developing an ontology of chemical information entities, with a primary focus on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context. Our ontology distinguishes algorithmic, or procedural information from declarative, or factual information, and renders of particular importance the annotation of provenance to calculated data. The Chemical Information Ontology is being developed as an open collaborative project. More details, together with a downloadable OWL file, are available at http://code.google.com/p/semanticchemistry/ (license: CC-BY-SA).
Affiliation(s)
- Janna Hastings
- Chemoinformatics and Metabolism, European Bioinformatics Institute, Hinxton, United Kingdom.
|
45
|
Samwald M, Jentzsch A, Bouton C, Kallesøe CS, Willighagen E, Hajagos J, Marshall MS, Prud'hommeaux E, Hassenzadeh O, Pichler E, Stephens S. Linked open drug data for pharmaceutical research and development. J Cheminform 2011; 3:19. [PMID: 21575203 PMCID: PMC3121711 DOI: 10.1186/1758-2946-3-19] [Citation(s) in RCA: 128] [Impact Index Per Article: 9.8] [Received: 12/08/2010] [Accepted: 05/16/2011] [Indexed: 11/18/2022] Open
Abstract
There is an abundance of information about drugs available on the Web. Data sources range from medicinal chemistry results, over the impact of drugs on gene expression, to the outcomes of drugs in clinical trials. These data are typically not connected together, which reduces the ease with which insights can be gained. Linking Open Drug Data (LODD) is a task force within the World Wide Web Consortium's (W3C) Health Care and Life Sciences Interest Group (HCLS IG). LODD has surveyed publicly available data about drugs, created Linked Data representations of the data sets, and identified interesting scientific and business questions that can be answered once the data sets are connected. The task force provides recommendations for the best practices of exposing data in a Linked Data representation. In this paper, we present past and ongoing work of LODD and discuss the growing importance of Linked Data as a foundation for pharmaceutical R&D data sharing.
Affiliation(s)
- Matthias Samwald
- Section for Medical Expert and Knowledge-Based Systems, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria.
|
46
|
Wohlgemuth G, Haldiya PK, Willighagen E, Kind T, Fiehn O. The Chemical Translation Service--a web-based tool to improve standardization of metabolomic reports. Bioinformatics 2010; 26:2647-8. [PMID: 20829444 PMCID: PMC2951090 DOI: 10.1093/bioinformatics/btq476] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.5] [Indexed: 11/22/2022] Open
Abstract
Summary: Metabolomic publications and databases use different database identifiers or even trivial names, which prevents queries across databases or between studies. The best way to annotate metabolites is by chemical structures, encoded by the International Chemical Identifier code (InChI) or InChIKey. We have implemented a web-based Chemical Translation Service that performs batch conversions of the most common compound identifiers, including CAS, CHEBI, compound formulas, Human Metabolome Database HMDB, InChI, InChIKey, IUPAC name, KEGG, LipidMaps, PubChem CID+SID, SMILES and chemical synonym names. Batch conversion downloads of 1410 CIDs are performed in 2.5 min. Structures are automatically displayed. Implementation: The software was implemented in Groovy and JAVA, the web frontend was implemented in GRAILS and the database used was PostgreSQL. Availability: The source code and an online web interface are freely available. Chemical Translation Service (CTS): http://cts.fiehnlab.ucdavis.edu Contact: ofiehn@ucdavis.edu
|
47
|
Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen E. The Chemistry Development Kit (CDK): an open-source Java library for Chemo- and Bioinformatics. J Chem Inf Comput Sci 2003; 43:493-500. [PMID: 12653513 PMCID: PMC4901983 DOI: 10.1021/ci025584y] [Citation(s) in RCA: 645] [Impact Index Per Article: 30.7] [Indexed: 11/28/2022]
Abstract
The Chemistry Development Kit (CDK) is a freely available open-source Java library for Structural Chemo- and Bioinformatics. Its architecture and capabilities, as well as its development as an open-source project by a team of international collaborators from academic and industrial institutions, are described. The CDK provides methods for many common tasks in molecular informatics, including 2D and 3D rendering of chemical structures, I/O routines, SMILES parsing and generation, ring searches, isomorphism checking, structure diagram generation, etc. Application scenarios as well as access information for interested users and potential contributors are given.
|