1
Lin CL, Huang PC, Gräßle S, Grathwol C, Tremouilhac P, Vanderheiden S, Hodapp P, Herres-Pawlis S, Hoffmann A, Fink F, Manolikakes G, Opatz T, Link A, Marques MMB, Daumann LJ, Tsotsalas M, Biedermann F, Mutlu H, Täuscher E, Bach F, Drees T, Neumann S, Harivyasi SS, Jung N, Bräse S. Linking Research Data with Physically Preserved Research Materials in Chemistry. Sci Data 2025; 12:130. [PMID: 39843501 PMCID: PMC11754846 DOI: 10.1038/s41597-025-04404-2] [Received: 09/20/2023] [Accepted: 01/03/2025] [Indexed: 01/24/2025]
Abstract
Results of scientific work in chemistry can usually be obtained in the form of materials and data. A big step towards transparency and reproducibility of the scientific work can be gained if scientists publish their data in research data repositories in a FAIR manner. Nevertheless, in order to make chemistry a sustainable discipline, obtaining FAIR data is insufficient and a comprehensive concept that includes preservation of materials is needed. In order to offer a comprehensive infrastructure to find and access data and materials that were generated in chemistry projects, we combined the infrastructure Chemotion repository with an archive for chemical compounds. Samples play a key role in this concept: we describe how FAIR metadata of a virtual sample representation can be used to refer to a physically available sample in a materials' archive and to link it with the FAIR research data gained using the said sample. We further describe the measures to make the physically available samples not only FAIR through their metadata but also findable, accessible and reusable.
Affiliation(s)
- Chia-Lin Lin
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Pei-Chi Huang
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Simone Gräßle
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Christoph Grathwol
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Pierre Tremouilhac
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Sylvia Vanderheiden
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Patrick Hodapp
  - Institute for Biological Interfaces 3 - Soft Matter Laboratory (IBG 3 - SML), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Sonja Herres-Pawlis
  - RWTH Aachen University, Institute of Inorganic Chemistry, Landoltweg 1a, 52074, Aachen, Germany
- Alexander Hoffmann
  - RWTH Aachen University, Institute of Inorganic Chemistry, Landoltweg 1a, 52074, Aachen, Germany
- Fabian Fink
  - RWTH Aachen University, Institute of Inorganic Chemistry, Landoltweg 1a, 52074, Aachen, Germany
- Georg Manolikakes
  - RPTU Kaiserslautern-Landau, Department Chemie, Erwin-Schrödinger-Str. Geb. 54, 67663, Kaiserslautern, Germany
- Till Opatz
  - JGU Mainz, Department Chemie, Duesbergweg 10-14, 55128, Mainz, Germany
- Andreas Link
  - Universität Greifswald, Institut für Pharmazie, Friedrich-Ludwig-Jahn-Str. 17, 17489, Greifswald, Germany
- M Manuel B Marques
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal
- Lena J Daumann
  - Chair of Bioinorganic Chemistry, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 13, 40225, Düsseldorf, Germany
- Manuel Tsotsalas
  - Institute of Functional Interfaces (IFG), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Frank Biedermann
  - Institute of Nanotechnology (INT), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Hatice Mutlu
  - Institut de Science des Matériaux de Mulhouse, UMR 7361 CNRS/Université de Haute Alsace, 15 rue Jean Starcky, Mulhouse Cedex, 68057, France
- Eric Täuscher
  - Technische Universität Ilmenau, Institut für Chemie und Biotechnik, Weimarer Straße 25, 98693, Ilmenau, Germany
- Felix Bach
  - FIZ Karlsruhe - Leibniz-Institut für Informationsinfrastruktur GmbH, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
- Tim Drees
  - Legal Affairs, Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Steffen Neumann
  - Leibniz Institute of Plant Biochemistry, Computational Plant Biochemistry group, Halle, Germany
- Shashank S Harivyasi
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Nicole Jung
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
  - Karlsruhe Nano Micro Facility (KNMFi), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
- Stefan Bräse
  - Institute of Biological and Chemical Systems - Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
  - Institute of Organic Chemistry (IOC), Karlsruhe Institute of Technology, Kaiserstraße 12, 76131, Karlsruhe, Germany
2
Seggi L, Trabucco R, Martellos S. Valorization of Historical Natural History Collections Through Digitization: The Algarium Vatova-Schiffner. Plants (Basel, Switzerland) 2024; 13:2901. [PMID: 39458848 PMCID: PMC11511501 DOI: 10.3390/plants13202901] [Received: 08/08/2024] [Revised: 09/29/2024] [Accepted: 10/15/2024] [Indexed: 10/28/2024]
Abstract
Digitization of Natural History Collections (NHCs) and mobilization of their data are pivotal for their study, preservation, and accessibility. Furthermore, thanks to digitization and mobilization, Natural History Museums can better showcase their collections, potentially attracting more visitors. However, the optimization of digitization workflows, especially when addressing small and/or historical NHCs, remains a challenge. Starting from a practical example, this contribution aims at providing a general guideline for the digitization of historical NHCs, with a particular focus on pre-digitization planning, during which some decisions should be made for ensuring a smooth, cost- and time-effective digitization process. The digitization of the algarium by Aristocle Vatova and Victor Schiffner was carried out following an image-to-data workflow, which allowed for reducing the handling of the specimens. The metadata were organized according to the Darwin Core standard scheme, and, together with the digital images of the specimens, have been made available to the scientific community and to the general public via an online portal. Thanks to the application of digital technologies and standardized methods, the accessibility of the collection has been enhanced, and its integration with historical data is possible, highlighting the relevance of shared experiences and protocols in advancing the digital transformation of natural history heritage.
Affiliation(s)
- Linda Seggi
  - Department of Life Sciences, University of Trieste, 34127 Trieste, Italy
  - Fondazione Musei Civici di Venezia, Natural History Museum of Venice Giancarlo Ligabue, 30135 Venezia, Italy
- Raffaella Trabucco
  - Fondazione Musei Civici di Venezia, Natural History Museum of Venice Giancarlo Ligabue, 30135 Venezia, Italy
- Stefano Martellos
  - Department of Life Sciences, University of Trieste, 34127 Trieste, Italy
3
Renner SS, Scherz MD, Schoch CL, Gottschling M, Vences M. Improving the gold standard in NCBI GenBank and related databases: DNA sequences from type specimens and type strains. Syst Biol 2024; 73:486-494. [PMID: 37956405 PMCID: PMC11502950 DOI: 10.1093/sysbio/syad068] [Received: 05/16/2023] [Revised: 08/21/2023] [Accepted: 11/11/2023] [Indexed: 11/15/2023]
Abstract
Scientific names permit humans and search engines to access knowledge about the biodiversity that surrounds us, and names linked to DNA sequences are playing an ever-greater role in search-and-match identification procedures. Here, we analyze how users and curators of the National Center for Biotechnology Information (NCBI) are flagging and curating sequences derived from nomenclatural type material, which is the only way to improve the quality of DNA-based identification in the long run. For prokaryotes, 18,281 genome assemblies from type strains have been curated by NCBI staff and improve the quality of prokaryote naming. For Fungi, type-derived sequences representing over 21,000 species are now essential for fungus naming and identification. For the remaining eukaryotes, however, the numbers of sequences identifiable as type-derived are minuscule, representing only 739 species of arthropods, 1542 vertebrates, and 125 embryophytes. An increase in the production and curation of such sequences will come from (i) sequencing of types or topotypic specimens in museum collections, (ii) the March 2023 rule changes at the International Nucleotide Sequence Database Collaboration requiring more metadata for specimens, and (iii) efforts by data submitters to facilitate curation, including informing NCBI curators about a specimen's type status. We illustrate different type-data submission journeys and provide best-practice examples from a range of organisms. Expanding the number of type-derived sequences in DNA databases, especially of eukaryotes, is crucial for capturing, documenting, and protecting biodiversity.
Affiliation(s)
- Susanne S Renner
  - Department of Biology, Washington University, Saint Louis, MO 63130, USA
- Mark D Scherz
  - Natural History Museum of Denmark, University of Copenhagen, Universitetsparken 15, Copenhagen 2100, Denmark
- Conrad L Schoch
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Marc Gottschling
  - Faculty of Biology, GeoBio-Center, Ludwig-Maximilians-University, Munich 80333, Germany
- Miguel Vences
  - Division of Evolutionary Biology, Zoological Institute, University of Technology, Mendelssohnstr. 4, 38106 Braunschweig, Germany
4
Schindel DE, Page RMP. Creating Virtuous Cycles for DNA Barcoding: A Case Study in Science Innovation, Entrepreneurship, and Diplomacy. Methods Mol Biol 2024; 2744:7-32. [PMID: 38683309 DOI: 10.1007/978-1-0716-3581-0_1] [Indexed: 05/01/2024]
Abstract
This chapter on the history of the DNA barcoding enterprise attempts to set the stage for the more scholarly contributions in this volume by addressing the following questions. How did the DNA barcoding enterprise begin? What were its goals, how did it develop, and to what degree are its goals being realized? We have taken a keen interest in the barcoding movement and its relationship to taxonomy, collections, and biodiversity informatics more broadly considered. This chapter integrates our two different perspectives on barcoding. DES was the Executive Secretary of the Consortium for the Barcode of Life from 2004 to 2017, with the mission to support the success of DNA barcoding without being directly involved in generating barcode data. RDMP viewed barcoding as an important entry into the landscape of biodiversity data, with many potential linkages to other components of that landscape. We also saw it as a critical step toward the era of international genomic research that was sure to follow. Like the Mercury Program that paved the way for lunar landings by the Apollo Program, we saw DNA barcoding as the proving grounds for the interdisciplinary and international cooperation that would be needed for success of whole-genome research.
Affiliation(s)
- Roderic M P Page
  - School of Biodiversity, One Health, and Veterinary Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
5
Macgregor G, Lancho-Barrantes BS, Pennington DR. Measuring the Concept of PID Literacy: User Perceptions and Understanding of PIDs in Support of Open Scholarly Infrastructure. Open Information Science 2023. [DOI: 10.1515/opis-2022-0142] [Indexed: 03/15/2023]
Abstract
The increasing centrality of persistent identifiers (PIDs) to scholarly ecosystems and the contribution they can make to the burgeoning “PID graph” has the potential to transform scholarship. Despite their importance as originators of PID data, little is known about researchers’ awareness and understanding of PIDs, or their efficacy in using them. In this article, we report on the results of an online interactive test designed to elicit exploratory data about researcher awareness and understanding of PIDs. This instrument was designed to explore recognition of PIDs (e.g. Digital Object Identifiers [DOIs], Open Researcher and Contributor IDs [ORCIDs], etc.) and the extent to which researchers correctly apply PIDs within digital scholarly ecosystems, as well as measure researchers’ perceptions of PIDs. Our results reveal irregular patterns of PID understanding and certainty across all participants, though statistically significant disciplinary and academic job role differences were observed in some instances. Uncertainty and confusion were found to exist in relation to dominant schemes such as ORCID and DOIs, even when contextualized within real-world examples. We also show researchers’ perceptions of PIDs to be generally positive but that disciplinary differences can be noted, as well as higher levels of aversion to PIDs in specific use cases and negative perceptions where PIDs are measured on an “activity” semantic dimension. This work therefore contributes to our understanding of scholars’ “PID literacy” and should inform those designing PID-centric scholarly infrastructures that a significant need for training and outreach to active researchers remains necessary.
Affiliation(s)
- George Macgregor
  - Scholarly Publications & Research Data, Information Services – Scholarly Research Communications, University of Strathclyde, Glasgow, UK
  - Department of Computer & Information Sciences, University of Strathclyde, Glasgow, UK
- Diane Rasmussen Pennington
  - Department of Computer & Information Sciences, University of Strathclyde, Glasgow, UK
  - School of Computing, Engineering, and the Built Environment, Edinburgh Napier University, Edinburgh, UK
6
Agosti D, Benichou L, Addink W, Arvanitidis C, Catapano T, Cochrane G, Dillen M, Döring M, Georgiev T, Gérard I, Groom Q, Kishor P, Kroh A, Kvaček J, Mergen P, Mietchen D, Pauperio J, Sautter G, Penev L. Recommendations for use of annotations and persistent identifiers in taxonomy and biodiversity publishing. Research Ideas and Outcomes 2022. [DOI: 10.3897/rio.8.e97374] [Indexed: 11/18/2022]
Abstract
The paper summarises many years of discussions and experience of biodiversity publishers, organisations, research projects and individual researchers, and proposes recommendations for implementation of persistent identifiers for article metadata, structural elements (sections, subsections, figures, tables, references, supplementary materials and others) and data specific to biodiversity (taxonomic treatments, treatment citations, taxon names, material citations, gene sequences, specimens, scientific collections) in taxonomy and biodiversity publishing. The paper proposes best practices on how identifiers should be used in the different cases and on how they can be minted, cited, and expressed in the backend article XML to facilitate conversion to and further re-use of the article content as FAIR data. The paper also discusses several specific routes for post-publication re-use of semantically enhanced content through large biodiversity data aggregators such as the Global Biodiversity Information Facility (GBIF), the International Nucleotide Sequence Database Collaboration (INSDC) and others, and proposes specifications of both identifiers and XML tags to be used for that purpose. A summary table provides an account and overview of the recommendations. The guidelines are supported with examples from the existing publishing practices.