1
Cicero C, Koo MS, Braker E, Abbott J, Bloom D, Campbell M, Cook JA, Demboski JR, Doll AC, Frederick LM, Linn AJ, Mayfield-Meyer TJ, McDonald DL, Nachman MW, Olson LE, Roberts D, Sikes DS, Witt CC, Wommack EA. Arctos: Community-driven innovations for managing natural and cultural history collections. PLoS One 2024; 19:e0296478. [PMID: 38820381 PMCID: PMC11142579 DOI: 10.1371/journal.pone.0296478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 04/22/2024] [Indexed: 06/02/2024] Open
Abstract
More than tools for managing physical and digital objects, museum collection management systems (CMS) serve as platforms for structuring, integrating, and making accessible the rich data embodied by natural history collections. Here we describe Arctos, a scalable community solution for managing and publishing global biological, geological, and cultural collections data for research and education. Specific goals are to: (1) Describe the core features and implementation of Arctos for a broad audience with respect to the biodiversity informatics principles that enable high quality research; (2) Highlight the unique aspects of Arctos; (3) Illustrate Arctos as a model for supporting and enhancing the Digital Extended Specimen concept; and (4) Emphasize the role of the Arctos community for improving data discovery and enabling cross-disciplinary, integrative studies within a sustainable governance model. In addition to detailing Arctos as both a community of museum professionals and a collection database platform, we discuss how Arctos achieves its richly annotated data by creating a web of knowledge with deep connections between catalog records and derived or associated data. We also highlight the value of Arctos as an educational resource. Finally, we present the financial model of fiscal sponsorship by a nonprofit organization, implemented in 2022, to ensure the long-term success and sustainability of Arctos.
Affiliation(s)
- Carla Cicero
  - Museum of Vertebrate Zoology, University of California, Berkeley, California, United States of America
- Michelle S. Koo
  - Museum of Vertebrate Zoology, University of California, Berkeley, California, United States of America
- Emily Braker
  - University of Colorado Museum of Natural History, University of Colorado, Boulder, Colorado, United States of America
- John Abbott
  - Department of Museums Research and Collections and Alabama Museum of Natural History, The University of Alabama, Tuscaloosa, Alabama, United States of America
- David Bloom
  - VertNet, Sebastopol, California, United States of America
- Mariel Campbell
  - Museum of Southwestern Biology, University of New Mexico, Albuquerque, New Mexico, United States of America
- Joseph A. Cook
  - Museum of Southwestern Biology, University of New Mexico, Albuquerque, New Mexico, United States of America
  - Department of Biology, University of New Mexico, Albuquerque, New Mexico, United States of America
- John R. Demboski
  - Denver Museum of Nature & Science, Denver, Colorado, United States of America
- Andrew C. Doll
  - Denver Museum of Nature & Science, Denver, Colorado, United States of America
- Lindsey M. Frederick
  - New Mexico Museum of Natural History & Science, Albuquerque, New Mexico, United States of America
- Angela J. Linn
  - University of Alaska Museum, University of Alaska Fairbanks, Fairbanks, Alaska, United States of America
- Michael W. Nachman
  - Museum of Vertebrate Zoology, University of California, Berkeley, California, United States of America
- Link E. Olson
  - University of Alaska Museum, University of Alaska Fairbanks, Fairbanks, Alaska, United States of America
- Dawn Roberts
  - Chicago Academy of Sciences, Chicago, Illinois, United States of America
- Derek S. Sikes
  - University of Alaska Museum, University of Alaska Fairbanks, Fairbanks, Alaska, United States of America
  - Department of Biology & Wildlife, University of Alaska Fairbanks, Fairbanks, Alaska, United States of America
- Christopher C. Witt
  - Museum of Southwestern Biology, University of New Mexico, Albuquerque, New Mexico, United States of America
  - Department of Biology, University of New Mexico, Albuquerque, New Mexico, United States of America
- Elizabeth A. Wommack
  - University of Wyoming Museum of Vertebrates, University of Wyoming, Laramie, Wyoming, United States of America
2
Astudillo-Clavijo V, Mankis T, López-Fernández H. Opening the Museum's Vault: Historical Field Records Preserve Reliable Ecological Data. Am Nat 2024; 203:305-322. [PMID: 38358812 DOI: 10.1086/728422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024]
Abstract
Museum specimens have long served as foundational data sources for ecological, evolutionary, and environmental research. Continued reimagining of museum collections is now also generating new types of data associated with but beyond physical specimens, a concept known as "extended specimens." Field notes penned by generations of naturalists contain firsthand ecological observations associated with museum collections and comprise a form of extended specimens with the potential to provide novel ecological data spanning broad geographic and temporal scales. Despite their data-yielding potential, however, field notes remain underutilized in research because of their heterogeneous, unstandardized, and qualitative nature. We introduce an approach for transforming descriptive ecological notes into quantitative data suitable for statistical analysis. Tests with simulated and real-world published data show that field notes and our transformation approach retain reliable quantitative ecological information under a range of sample sizes and evolutionary scenarios. Unlocking the wealth of data contained within field records could facilitate investigations into the ecology of clades whose diversity, distribution, or other demographic features present challenges to traditional ecological studies, improve our understanding of long-term environmental and evolutionary change, and enhance predictions of future change.
3
Weaver WN, Smith SA. FieldPrism: A system for creating snapshot vouchers from field images using photogrammetric markers and QR codes. Applications in Plant Sciences 2023; 11:e11545. [PMID: 37915427 PMCID: PMC10617303 DOI: 10.1002/aps3.11545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2023] [Revised: 04/18/2023] [Accepted: 05/26/2023] [Indexed: 11/03/2023]
Abstract
Premise: Field images are important sources of information for research in the natural sciences. However, images that lack photogrammetric scale bars, including most iNaturalist observations, cannot yield accurate trait measurements. We introduce FieldPrism, a novel system of photogrammetric markers, QR codes, and software to automate the curation of snapshot vouchers. Methods and Results: Our photogrammetric background templates (FieldSheets) increase the utility of field images by providing machine-readable scale bars and photogrammetric reference points to automatically correct image distortion and calculate a pixel-to-metric conversion ratio. Users can generate a QR code flipbook derived from a specimen identifier naming hierarchy, enabling machine-readable specimen identification for automatic file renaming. We also developed FieldStation, a Raspberry Pi-based mobile imaging apparatus that records images, GPS location, and metadata redundantly on up to four USB storage devices and can be monitored and controlled from any Wi-Fi connected device. Conclusions: FieldPrism is a flexible software tool designed to standardize and improve the utility of images captured in the field. When paired with the optional FieldStation, researchers can create a self-contained mobile imaging apparatus for quantitative trait data collection.
Affiliation(s)
- William N. Weaver
  - Department of Ecology and Evolutionary Biology, University of Michigan, 1105 N. University Ave., Ann Arbor, Michigan 48109, USA
- Stephen A. Smith
  - Department of Ecology and Evolutionary Biology, University of Michigan, 1105 N. University Ave., Ann Arbor, Michigan 48109, USA
4
Weaver WN, Smith SA. From leaves to labels: Building modular machine learning networks for rapid herbarium specimen analysis with LeafMachine2. Applications in Plant Sciences 2023; 11:e11548. [PMID: 37915430 PMCID: PMC10617304 DOI: 10.1002/aps3.11548] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/28/2023] [Revised: 06/28/2023] [Accepted: 07/17/2023] [Indexed: 11/03/2023]
Abstract
Premise: Quantitative plant traits play a crucial role in biological research. However, traditional methods for measuring plant morphology are time consuming and have limited scalability. We present LeafMachine2, a suite of modular machine learning and computer vision tools that can automatically extract a base set of leaf traits from digital plant data sets. Methods: LeafMachine2 was trained on 494,766 manually prepared annotations from 5648 herbarium images obtained from 288 institutions and representing 2663 species; it employs a set of plant component detection and segmentation algorithms to isolate individual leaves, petioles, fruits, flowers, wood samples, buds, and roots. Our landmarking network automatically identifies and measures nine pseudo-landmarks that occur on most broadleaf taxa. Text labels and barcodes are automatically identified by an archival component detector and are prepared for optical character recognition methods or natural language processing algorithms. Results: LeafMachine2 can extract trait data from at least 245 angiosperm families and calculate pixel-to-metric conversion factors for 26 commonly used ruler types. Discussion: LeafMachine2 is a highly efficient tool for generating large quantities of plant trait data, even from occluded or overlapping leaves, field images, and non-archival data sets. Our project, along with similar initiatives, has made significant progress in removing the bottleneck in plant trait data acquisition from herbarium specimens and shifted the focus toward the crucial task of data revision and quality control.
Affiliation(s)
- William N. Weaver
  - Department of Ecology and Evolutionary Biology, University of Michigan, 1105 N. University Ave., Ann Arbor, Michigan 48109, USA
- Stephen A. Smith
  - Department of Ecology and Evolutionary Biology, University of Michigan, 1105 N. University Ave., Ann Arbor, Michigan 48109, USA
5
Stebbins TD, Wetzer R. Review and guide to the isopods (Crustacea, Isopoda) of littoral and sublittoral marine habitats in the Southern California Bight. ZooKeys 2023; 1162:1-167. [PMID: 37235199 PMCID: PMC10206732 DOI: 10.3897/zookeys.1162.100390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 03/14/2023] [Indexed: 05/28/2023] Open
Abstract
The isopod crustaceans reported from or expected to occur in littoral and sublittoral marine habitats of the Southern California Bight (SCB) in the northeastern Pacific Ocean are reviewed. A total of 190 species, representing 105 genera in 42 families and six suborders, are covered. Approximately 84% of these isopods represent described species, with the remaining 16% comprising well-documented "provisional" but undescribed species. Cymothoida and Asellota are the most diverse of the six suborders, accounting for ca. 36% and 29% of the species, respectively. Valvifera and Sphaeromatidea are the next most speciose suborders, with 13–15% of the species each, while the suborder Limnorioidea represents fewer than 2% of the SCB isopod fauna. Finally, the mostly terrestrial suborder Oniscidea accounts for ca. 5% of the species treated herein, each of which occurs at or above the high tide mark in intertidal habitats. A key to the suborders and superfamilies is presented, followed by nine keys to the SCB species within each of the resultant groups. Figures are provided for most species. Bathymetric range, geographic distribution, type locality, habitat, body size, and a comprehensive list of references are included for most species.
Affiliation(s)
- Timothy D. Stebbins
  - Research and Collections Branch, Natural History Museum of Los Angeles County, 900 Exposition Boulevard, Los Angeles, California 90007, USA
  - City of San Diego Marine Biology Laboratory (retired), Public Utilities Department, San Diego, California 92101, USA
- Regina Wetzer
  - Research and Collections Branch, Natural History Museum of Los Angeles County, 900 Exposition Boulevard, Los Angeles, California 90007, USA
6
Bachmann L, Beermann J, Brey T, de Boer HJ, Dannheim J, Edvardsen B, Ericson PGP, Holston KC, Johansson VA, Kloss P, Konijnenberg R, Osborn KJ, Pappalardo P, Pehlke H, Piepenburg D, Struck TH, Sundberg P, Markussen SS, Teschke K, Vanhove MPM. The role of systematics for understanding ecosystem functions: Proceedings of the Zoologica Scripta Symposium, Oslo, Norway, 25 August 2022. Zool Scr 2023. [DOI: 10.1111/zsc.12593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]
7
Agosti D, Benichou L, Addink W, Arvanitidis C, Catapano T, Cochrane G, Dillen M, Döring M, Georgiev T, Gérard I, Groom Q, Kishor P, Kroh A, Kvaček J, Mergen P, Mietchen D, Pauperio J, Sautter G, Penev L. Recommendations for use of annotations and persistent identifiers in taxonomy and biodiversity publishing. Research Ideas and Outcomes 2022. [DOI: 10.3897/rio.8.e97374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
The paper summarises many years of discussions and experience of biodiversity publishers, organisations, research projects and individual researchers, and proposes recommendations for implementation of persistent identifiers for article metadata, structural elements (sections, subsections, figures, tables, references, supplementary materials and others) and data specific to biodiversity (taxonomic treatments, treatment citations, taxon names, material citations, gene sequences, specimens, scientific collections) in taxonomy and biodiversity publishing. The paper proposes best practices on how identifiers should be used in the different cases and on how they can be minted, cited, and expressed in the backend article XML to facilitate conversion to and further re-use of the article content as FAIR data. The paper also discusses several specific routes for post-publication re-use of semantically enhanced content through large biodiversity data aggregators such as the Global Biodiversity Information Facility (GBIF), the International Nucleotide Sequence Database Collaboration (INSDC) and others, and proposes specifications of both identifiers and XML tags to be used for that purpose. A summary table provides an account and overview of the recommendations. The guidelines are supported with examples from the existing publishing practices.
8
Islam S, Weiland C, Addink W. From data pipelines to FAIR data infrastructures: A vision for the new horizons of bio- and geodiversity data for scientific research. Research Ideas and Outcomes 2022. [DOI: 10.3897/rio.8.e93816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Natural science collections are vast repositories of bio- and geodiversity specimens. These collections, originating from natural history cabinets or expeditions, are increasingly becoming unparalleled sources of data facilitating multidisciplinary research (Meineke et al. 2018, Heberling et al. 2019, Cook et al. 2020, Thompson et al. 2021). Due to various global data mobilization and digitisation efforts (Blagoderov et al. 2012, Nelson and Ellis 2018), this digitised information about specimens includes database records along with two/three-dimensional images, sonograms, sound or video recordings, computerised tomography scans, machine-readable texts from labels on the specimens as well as media items and notes related to the discovery sites and acquisition (Hedrick et al. 2020, Phillipson 2022).
The scope and practice of specimen gathering are also evolving. The term extended specimen was coined to refer to the specimen and associated data extending beyond the singular physical object to other physical or digital entities such as chemical composition, genetic sequence data or species data. Thus the specimen becomes an interconnected network of data resources that have incredible potential to enhance integrative and data-driven research (Webster 2017, Lendemer et al. 2019, Hardisty et al. 2022). These practices also reflect the role of data and the curatorial data life-cycle starting from the initial material sampling process to the downstream analysis. We are also seeing growing acknowledgement that disparate and domain specific data elements prevent interdisciplinarity which is crucial for a holistic understanding of biodiversity and climate crisis (Hicks et al. 2010, Craven et al. 2019, Folk and Siniscalchi 2021).
Thus the data elements are not just records or rows in a database or data pipelines going from one repository to another. They have the potential to become self-describing digital artefacts that can revolutionise how machines interpret and work with specimen data. Within this context, the Distributed System of Scientific Collections (DiSSCo), a new European Research Infrastructure for natural science collections, envisions an infrastructure based on FAIR Digital Objects (FDO) that can unify more than 170 European natural science collections under common and FAIR-compliant (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016) access and curation policies and practices. DiSSCo’s key element in achieving FAIR is the implementation of Digital Specimen (a domain specific FDO) that closely aligns with the extended specimen practices. The idea behind Digital Specimen – an FDO that acts as a digital surrogate for a specific physical specimen in a natural science collection – was influenced by global conversations around the implementation of the Digital Object Architecture for biodiversity data (De Smedt et al. 2020, Islam et al. 2020, Hardisty et al. 2020).
The main purpose of this talk is to explain the vision of how FAIR and FDO can create a data infrastructure that can not only take advantage of existing databases and repositories but at the same time provide support for innovative services such as AI and digital twinning. With scientific use cases in mind, the talk will highlight a few key FAIR and FDO components (persistent identifiers, metadata, ontologies) within the collaborative modelling activity of Digital Specimen specification. These components provide the template for specifying how a Digital Specimen should look so DiSSCo can build a FAIR service ecosystem based on FDOs (Addink et al. 2021). We will also give examples of envisioned services that can help with image feature extraction, and model training (Grieb et al. 2021, Hardisty et al. 2022) and digital twinning (Schultes et al. 2022). We believe this is an exciting new paradigm powered by FAIR and FDO that can help both humans and machines to accelerate the use of specimen data. From physical objects curated over hundred years, we have developed data pipelines, aggregators and repositories (Barberousse 2021). Now is the time to look for solutions where these data records can become FAIR Digital Objects to enable wider access and multidisciplinary research.