1
Miralles A, Bruy T, Wolcott K, Scherz MD, Begerow D, Beszteri B, Bonkowski M, Felden J, Gemeinholzer B, Glaw F, Glöckner FO, Hawlitschek O, Kostadinov I, Nattkemper TW, Printzen C, Renz J, Rybalka N, Stadler M, Weibulat T, Wilke T, Renner SS, Vences M. Repositories for Taxonomic Data: Where We Are and What is Missing. Syst Biol 2020; 69:1231-1253. [PMID: 32298457 PMCID: PMC7584136 DOI: 10.1093/sysbio/syaa026] [Received: 11/13/2019] [Revised: 02/20/2020] [Accepted: 03/24/2020] [Indexed: 12/05/2022]
Abstract
Natural history collections are leading successful large-scale projects of specimen digitization (images, metadata, DNA barcodes), thereby transforming taxonomy into a big data science. Yet, little effort has been directed towards safeguarding and subsequently mobilizing the considerable amount of original data generated during the process of naming 15,000-20,000 species every year. From the perspective of alpha-taxonomists, we provide a review of the properties and diversity of taxonomic data, assess their volume and use, and establish criteria for optimizing data repositories. We surveyed 4113 alpha-taxonomic studies in representative journals for 2002, 2010, and 2018, and found an increasing yet comparatively limited use of molecular data in species diagnosis and description. In 2018, of the 2661 papers published in specialized taxonomic journals, molecular data were widely used in mycology (94%), regularly in vertebrates (53%), but rarely in botany (15%) and entomology (10%). Images play an important role in taxonomic research on all taxa, with photographs used in >80% and drawings in 58% of the surveyed papers. The use of omics (high-throughput) approaches or 3D documentation is still rare. Improved archiving strategies for metabarcoding consensus reads, genome and transcriptome assemblies, and chemical and metabolomic data could help to mobilize the wealth of high-throughput data for alpha-taxonomy. Because long-term (ideally perpetual) data storage is of particular importance for taxonomy, energy footprint reduction via less storage-demanding formats is a priority if their information content suffices for the purpose of taxonomic studies. Whereas taxonomic assignments are quasifacts for most biological disciplines, they remain hypotheses pertaining to evolutionary relatedness of individuals for alpha-taxonomy.
For this reason, an improved reuse of taxonomic data, including machine-learning-based species identification and delimitation pipelines, requires a cyberspecimen approach: linking data via unique specimen identifiers, and thereby making them findable, accessible, interoperable, and reusable for taxonomic research. This poses both qualitative challenges to adapt the existing infrastructure of data centers to a specimen-centered concept and quantitative challenges to host and connect an estimated ≤2 million images produced per year by alpha-taxonomic studies, plus many millions of images from digitization campaigns. Of the 30,000-40,000 taxonomists globally, many are thought to be nonprofessionals, and capturing the data for online storage and reuse therefore requires low-complexity submission workflows and cost-free repository use. Expert taxonomists are the main stakeholders able to identify and formalize the needs of the discipline; their expertise is needed to implement the envisioned virtual collections of cyberspecimens. [Big data; cyberspecimen; new species; omics; repositories; specimen identifier; taxonomy; taxonomic data.]
Affiliation(s)
- Aurélien Miralles
  - Departement Origins and Evolution, Institut Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, 57 rue Cuvier, CP50, 75005 Paris, France
  - Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
- Teddy Bruy
  - Departement Origins and Evolution, Institut Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, 57 rue Cuvier, CP50, 75005 Paris, France
  - Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
- Katherine Wolcott
  - Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
  - National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Mark D Scherz
  - Department of Herpetology, Zoologische Staatssammlung München (ZSM-SNSB), Münchhausenstraße 21, 81247 München, Germany
  - Department of Biology, Universität Konstanz, Universitätstraße 10, 78464 Konstanz, Germany
- Dominik Begerow
  - Department of Geobotany, Ruhr-University Bochum, Universitätsstraße 150, 44780 Bochum, Germany
- Bank Beszteri
  - Department of Phycology, Faculty of Biology, University of Duisburg-Essen, Universitätsstraße 2, 45141 Essen, Germany
- Michael Bonkowski
  - Department of Terrestrial Ecology, Center of Excellence in Plant Sciences (CEPLAS), Terrestrial Ecology, Institute of Zoology, University of Cologne, 50674 Köln, Germany
- Janine Felden
  - MARUM - Center for Marine Environmental Sciences, University of Bremen, Leobenerstraße 8, 28359 Bremen, Germany
  - Alfred Wegener Institute - Helmholtz Center for Polar- and Marine Research, Am Handelshafen 12, 27570 Bremerhaven, Germany
- Birgit Gemeinholzer
  - Department of Systematic Botany, Justus Liebig University Gießen, Heinrich-Buff Ring 38, 35392 Giessen, Germany
- Frank Glaw
  - Department of Herpetology, Zoologische Staatssammlung München (ZSM-SNSB), Münchhausenstraße 21, 81247 München, Germany
- Frank Oliver Glöckner
  - Alfred Wegener Institute - Helmholtz Center for Polar- and Marine Research, Am Handelshafen 12, 27570 Bremerhaven, Germany
- Oliver Hawlitschek
  - Department of Herpetology, Zoologische Staatssammlung München (ZSM-SNSB), Münchhausenstraße 21, 81247 München, Germany
  - Department of Scientific Infrastructure, Centrum für Naturkunde (CeNak), Universität Hamburg, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany
- Ivaylo Kostadinov
  - GFBio - Gesellschaft für Biologische Daten e.V., c/o Research II, Campus Ring 1, 28759 Bremen, Germany
- Tim W Nattkemper
  - Biodata Mining Group, Center of Biotechnology (CeBiTec), Bielefeld University, PO Box 100131, 33501 Bielefeld, Germany
- Christian Printzen
  - Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Senckenberganlage 25, 60325 Frankfurt/Main, Germany
- Jasmin Renz
  - Zooplankton Research Group, DZMB – Senckenberg am Meer, Martin-Luther-King Platz 3, 20146 Hamburg, Germany
- Nataliya Rybalka
  - Department of Experimental Phycology and Culture Collection of Algae, University Göttingen, Nikolausberger-Weg 18, 37073 Göttingen, Germany
- Marc Stadler
  - Department Microbial Drugs, Helmholtz Centre for Infection Research (HZI), and German Centre for Infection Research (DZIF), Partner Site Hannover-Braunschweig, Inhoffenstrasse 7, 38124 Braunschweig, Germany
- Tanja Weibulat
  - GFBio - Gesellschaft für Biologische Daten e.V., c/o Research II, Campus Ring 1, 28759 Bremen, Germany
- Thomas Wilke
  - Department of Animal Ecology and Systematics, Justus Liebig University Gießen, Heinrich-Buff Ring 26, 35392 Giessen, Germany
- Susanne S Renner
  - Systematic Botany and Mycology, University of Munich (LMU), Menzingerstraße 67, 80638 Munich, Germany
- Miguel Vences
  - Department of Evolutionary Biology, Zoological Institute, Technische Universität Braunschweig, Mendelssohnstraße 4, 38106 Braunschweig, Germany
3
Keeling PJ, Burki F, Wilcox HM, Allam B, Allen EE, Amaral-Zettler LA, Armbrust EV, Archibald JM, Bharti AK, Bell CJ, Beszteri B, Bidle KD, Cameron CT, Campbell L, Caron DA, Cattolico RA, Collier JL, Coyne K, Davy SK, Deschamps P, Dyhrman ST, Edvardsen B, Gates RD, Gobler CJ, Greenwood SJ, Guida SM, Jacobi JL, Jakobsen KS, James ER, Jenkins B, John U, Johnson MD, Juhl AR, Kamp A, Katz LA, Kiene R, Kudryavtsev A, Leander BS, Lin S, Lovejoy C, Lynn D, Marchetti A, McManus G, Nedelcu AM, Menden-Deuer S, Miceli C, Mock T, Montresor M, Moran MA, Murray S, Nadathur G, Nagai S, Ngam PB, Palenik B, Pawlowski J, Petroni G, Piganeau G, Posewitz MC, Rengefors K, Romano G, Rumpho ME, Rynearson T, Schilling KB, Schroeder DC, Simpson AGB, Slamovits CH, Smith DR, Smith GJ, Smith SR, Sosik HM, Stief P, Theriot E, Twary SN, Umale PE, Vaulot D, Wawrik B, Wheeler GL, Wilson WH, Xu Y, Zingone A, Worden AZ. The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing. PLoS Biol 2014; 12:e1001889. [PMID: 24959919 PMCID: PMC4068987 DOI: 10.1371/journal.pbio.1001889] [Indexed: 12/11/2022]
Abstract
Current sampling of genomic sequence data from eukaryotes is relatively poor, biased, and inadequate to address important questions about their biology, evolution, and ecology; this Community Page describes a resource of 700 transcriptomes from marine microbial eukaryotes to help understand their role in the world's oceans.
Affiliation(s)
- Patrick J. Keeling
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
  - Canadian Institute for Advanced Research, Integrated Microbial Biodiversity program, Canada
  - * E-mail: (PJK); (AZW)
- Fabien Burki
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
- Heather M. Wilcox
  - Monterey Bay Aquarium Research Institute, Moss Landing, California, United States of America
- Bassem Allam
  - School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, New York, United States of America
- Eric E. Allen
  - Marine Biology Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California, United States of America
- Linda A. Amaral-Zettler
  - The Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts, United States of America
  - Department of Geological Sciences, Brown University, Providence, Rhode Island, United States of America
- E. Virginia Armbrust
  - School of Oceanography, University of Washington, Seattle, Washington, United States of America
- John M. Archibald
  - Canadian Institute for Advanced Research, Integrated Microbial Biodiversity program, Canada
  - Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Arvind K. Bharti
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Callum J. Bell
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Bank Beszteri
  - Alfred Wegener Institute Helmholtz Center for Polar and Marine Research, Bremerhaven, Germany
- Kay D. Bidle
  - Institute of Marine and Coastal Science, Rutgers University, New Brunswick, New Jersey, United States of America
- Connor T. Cameron
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Lisa Campbell
  - Department of Oceanography, Department of Biology, Texas A&M University, College Station, Texas, United States of America
- David A. Caron
  - Department of Biology, University of Southern California, Los Angeles, California, United States of America
- Rose Ann Cattolico
  - Department of Biology, University of Washington, Seattle, Washington, United States of America
- Jackie L. Collier
  - School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, New York, United States of America
- Kathryn Coyne
  - University of Delaware, School of Marine Science and Policy, College of Earth, Ocean, and Environment, Lewes, Delaware, United States of America
- Simon K. Davy
  - School of Biological Sciences, Victoria University of Wellington, Wellington, New Zealand
- Phillipe Deschamps
  - Unité d'Ecologie, Systematique et Evolution, CNRS UMR8079, Université Paris-Sud, Orsay, France
- Sonya T. Dyhrman
  - Department of Earth and Environmental Sciences and the Lamont-Doherty Earth Observatory, Columbia University, New York, New York, United States of America
- Ruth D. Gates
  - Hawaii Institute of Marine Biology, University of Hawaii, Hawaii, United States of America
- Christopher J. Gobler
  - School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, New York, United States of America
- Spencer J. Greenwood
  - Department of Biomedical Sciences and AVC Lobster Science Centre, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, Prince Edward Island, Canada
- Stephanie M. Guida
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Jennifer L. Jacobi
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Erick R. James
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
- Bethany Jenkins
  - Department of Cell and Molecular Biology, The University of Rhode Island, Kingston, Rhode Island, United States of America
  - Graduate School of Oceanography, University of Rhode Island, Narragansett, Rhode Island, United States of America
- Uwe John
  - Alfred Wegener Institute Helmholtz Center for Polar and Marine Research, Bremerhaven, Germany
- Matthew D. Johnson
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, United States of America
- Andrew R. Juhl
  - Department of Earth and Environmental Sciences and the Lamont-Doherty Earth Observatory, Columbia University, New York, New York, United States of America
- Anja Kamp
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - Jacobs University Bremen, Molecular Life Science Research Center, Bremen, Germany
- Laura A. Katz
  - Department of Biological Sciences, Smith College, Northampton, Massachusetts, United States of America
- Ronald Kiene
  - University of South Alabama, Dauphin Island Sea Lab, Mobile, Alabama, United States of America
- Alexander Kudryavtsev
  - Department of Invertebrate Zoology, Saint-Petersburg State University, Saint-Petersburg, Russia
  - Department of Genetics and Evolution, University of Geneva, Geneva, Switzerland
- Brian S. Leander
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
- Senjie Lin
  - Department of Marine Sciences, University of Connecticut, Groton, Connecticut, United States of America
- Connie Lovejoy
  - Département de Biologie, Université Laval, Québec, Canada
- Denis Lynn
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
- Adrian Marchetti
  - Department of Marine Sciences, University of North Carolina, Chapel Hill, North Carolina, United States of America
- George McManus
  - Department of Marine Sciences, University of Connecticut, Groton, Connecticut, United States of America
- Aurora M. Nedelcu
  - University of New Brunswick, Department of Biology, Fredericton, New Brunswick, Canada
- Susanne Menden-Deuer
  - Graduate School of Oceanography, University of Rhode Island, Narragansett, Rhode Island, United States of America
- Cristina Miceli
  - School of Biosciences and Biotechnology, University of Camerino, Camerino, Italy
- Thomas Mock
  - School of Environmental Sciences, University of East Anglia, Norwich, United Kingdom
- Mary Ann Moran
  - Department of Marine Sciences, University of Georgia, Athens, Georgia, United States of America
- Shauna Murray
  - Plant Functional Biology and Climate Change Cluster (C3), University of Technology, Sydney, Australia
- Govind Nadathur
  - Department of Marine Sciences, University of Puerto Rico, Mayaguez, Puerto Rico, United States of America
- Satoshi Nagai
  - National Research Institute of Fisheries Science, Kanagawa, Japan
- Peter B. Ngam
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Brian Palenik
  - Marine Biology Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California, United States of America
- Jan Pawlowski
  - Department of Genetics and Evolution, University of Geneva, Geneva, Switzerland
- Gwenael Piganeau
  - CNRS, UMR 7232, BIOM, Observatoire Océanologique, Banyuls-sur-Mer, France
  - Sorbonne Universités, UPMC Univ Paris 06, UMR 7232, BIOM, Banyuls-sur-Mer, France
- Matthew C. Posewitz
  - Department of Chemistry and Geochemistry, Colorado School of Mines, Golden, Colorado, United States of America
- Mary E. Rumpho
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, United States of America
- Tatiana Rynearson
  - Graduate School of Oceanography, University of Rhode Island, Narragansett, Rhode Island, United States of America
- Kelly B. Schilling
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Declan C. Schroeder
  - The Marine Biological Association of the United Kingdom, Plymouth, United Kingdom
- Alastair G. B. Simpson
  - Canadian Institute for Advanced Research, Integrated Microbial Biodiversity program, Canada
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Claudio H. Slamovits
  - Canadian Institute for Advanced Research, Integrated Microbial Biodiversity program, Canada
  - Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- G. Jason Smith
  - Moss Landing Marine Laboratories, Moss Landing, California, United States of America
- Sarah R. Smith
  - Marine Biology Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California, United States of America
- Heidi M. Sosik
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, United States of America
- Peter Stief
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Edward Theriot
  - Section of Integrative Biology, University of Texas, Austin, Texas, United States of America
- Scott N. Twary
  - Los Alamos National Laboratory, Biosciences, Los Alamos, New Mexico, United States of America
- Pooja E. Umale
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Daniel Vaulot
  - UMR714, CNRS and UPMC (Paris-06), Station Biologique, Roscoff, France
- Boris Wawrik
  - Department of Microbiology and Plant Biology, University of Oklahoma, Norman, Oklahoma, United States of America
- Glen L. Wheeler
  - The Marine Biological Association of the United Kingdom, Plymouth, United Kingdom
  - Plymouth Marine Laboratory, Plymouth, United Kingdom
- William H. Wilson
  - NCMA, Bigelow Laboratory for Ocean Sciences, East Boothbay, Maine, United States of America
- Yan Xu
  - Princeton University, Princeton, New Jersey, United States of America
- Alexandra Z. Worden
  - Canadian Institute for Advanced Research, Integrated Microbial Biodiversity program, Canada
  - Monterey Bay Aquarium Research Institute, Moss Landing, California, United States of America
  - * E-mail: (PJK); (AZW)