1
Quaresma A, Ankenbrand MJ, Garcia CAY, Rufino J, Honrado M, Amaral J, Brodschneider R, Brusbardis V, Gratzer K, Hatjina F, Kilpinen O, Pietropaoli M, Roessink I, van der Steen J, Vejsnæs F, Pinto MA, Keller A. Semi-automated sequence curation for reliable reference datasets in ITS2 vascular plant DNA (meta-)barcoding. Sci Data 2024; 11:129. [PMID: 38272945 PMCID: PMC10810873 DOI: 10.1038/s41597-024-02962-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 01/12/2024] [Indexed: 01/27/2024] Open
Abstract
One of the most critical steps for accurate taxonomic identification in DNA (meta)-barcoding is to have an accurate DNA reference sequence dataset for the marker of choice. Therefore, developing such a dataset has been a long-term ambition, especially in the Viridiplantae kingdom. Typically, reference datasets are constructed with sequences downloaded from general public databases, which can carry taxonomic and other relevant errors. Herein, we constructed a curated (i) global dataset, (ii) European crop dataset, and (iii) 27 datasets for the EU countries for the ITS2 barcoding marker of vascular plants. To that end, we first developed a pipeline script that entails (i) an automated curation stage comprising five filters, (ii) manual taxonomic correction for misclassified taxa, and (iii) manual addition of newly sequenced species. The pipeline allows easy updating of the curated datasets. With this approach, 13% of the sequences, corresponding to 7% of species originally imported from GenBank, were discarded. Further, 259 sequences were manually added to the curated global dataset, which now comprises 307,977 sequences of 111,382 plant species.
Affiliation(s)
- Andreia Quaresma
  - Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
  - Laboratório Associado para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
  - Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre, S/N, Edifício FC4, 4169-007, Porto, Portugal
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661, Vairão, Vila do Conde, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, 4485-661, Vairão, Vila do Conde, Portugal
- Markus J Ankenbrand
  - Center for Computational and Theoretical Biology, Faculty of Biology, Julius-Maximilians-Universität Würzburg, Klara-Oppenheimer-Weg 32, 97074, Würzburg, Germany
- Carlos Ariel Yadró Garcia
  - Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
  - Laboratório Associado para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
- José Rufino
  - Laboratório Associado para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
  - Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politécnico de Bragança, Bragança, Portugal
- Mónica Honrado
  - Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
  - Laboratório Associado para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
- Joana Amaral
  - Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
  - Laboratório Associado para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
- Robert Brodschneider
  - Institute of Biology, University of Graz, Universitätsplatz 2, 8010, Graz, Austria
- Valters Brusbardis
  - Latvian Beekeepers' Association (LBA), Rigas iela 22, LV-3004, Jelgava, Latvia
- Kristina Gratzer
  - Institute of Biology, University of Graz, Universitätsplatz 2, 8010, Graz, Austria
- Fani Hatjina
  - Ellinikos Georgikos Organismos DIMITRA (ELGO-DIMITRA), Kourtidou 56-58, GR-11145, Athina, Greece
- Ole Kilpinen
  - Danish Beekeepers Association (DBF), Fulbyvej 15, DK-4180, Sorø, Denmark
- Marco Pietropaoli
  - Istituto Zooprofilattico Sperimentale del Lazio e della Toscana "M. Aleandri" (IZSLT), Via Appia Nuova 1411, IT-00178, Roma, Italy
- Ivo Roessink
  - Wageningen Environmental Research, Wageningen University & Research, Droevendaalsesteeg 3, 6700 AA, Wageningen, Netherlands
- Flemming Vejsnæs
  - Danish Beekeepers Association (DBF), Fulbyvej 15, DK-4180, Sorø, Denmark
- M Alice Pinto
  - Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
  - Laboratório Associado para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
- Alexander Keller
  - Cellular and Organismic Interactions, Biocenter, Faculty of Biology, Ludwig-Maximilians-Universität München, Großhaderner Str. 2-4, 82152, Planegg-Martinsried, Germany
2
Atteia A, Bec B, Gianaroli C, Serais O, Quétel I, Lagarde F, Gobet A. Evaluation of sequential filtration and centrifugation to capture environmental DNA and survey microbial eukaryotic communities in aquatic environments. Mol Ecol Resour 2024; 24:e13887. [PMID: 37899641 DOI: 10.1111/1755-0998.13887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 10/10/2023] [Accepted: 10/16/2023] [Indexed: 10/31/2023]
Abstract
Sequential membrane filtration of water samples is commonly used to monitor the diversity of aquatic microbial eukaryotes. This capture method is effective for focusing on specific taxonomic groups within a size fraction, but it is time-consuming. Centrifugation, often used to collect microorganisms from pure culture, could be seen as an alternative to capture microbial eukaryotic communities from environmental samples. Here, we compared the two capture methods to assess diversity and ecological patterns of eukaryotic communities in the Thau lagoon, France. Water samples were taken twice a month over a full year and sequential filtration targeting the picoplankton (0.2-3 μm) and larger organisms (>3 μm) was used in parallel to centrifugation. The microbial eukaryotic community in the samples was described using an environmental DNA approach targeting the V4 region of the 18S rRNA gene. The most abundant divisions in the filtration fractions and the centrifugation pellet were Dinoflagellata, Metazoa, Ochrophyta, and Cryptophyta. Chlorophyta were dominant in the centrifugation pellet and the picoplankton fraction but not in the larger fraction. Diversity indices and structuring patterns of the community in the two size fractions and the centrifugation pellet were comparable. Twenty amplicon sequence variants were significantly differentially abundant between the two size fractions and the centrifugation pellet, and their temporal patterns of abundance in the two fractions combined were similar to those obtained with centrifugation. Overall, centrifugation led to similar ecological conclusions as the two filtrated fractions combined, thus making it an attractive time-efficient alternative to sequential filtration.
Affiliation(s)
- Ariane Atteia
  - MARBEC, Univ Montpellier, CNRS, Ifremer, IRD, Sète, France
- Béatrice Bec
  - MARBEC, Univ Montpellier, CNRS, Ifremer, IRD, Montpellier, France
- Ophélie Serais
  - MARBEC, Univ Montpellier, CNRS, Ifremer, IRD, Sète, France
- Isaure Quétel
  - MARBEC, Univ Montpellier, CNRS, Ifremer, IRD, Sète, France
- Franck Lagarde
  - MARBEC, Univ Montpellier, CNRS, Ifremer, IRD, Sète, France
3
Vaulot D, Sim CWH, Ong D, Teo B, Biwer C, Jamy M, Lopes dos Santos A. metaPR2: A database of eukaryotic 18S rRNA metabarcodes with an emphasis on protists. Mol Ecol Resour 2022; 22:3188-3201. [PMID: 35762265 PMCID: PMC9796713 DOI: 10.1111/1755-0998.13674] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 02/07/2022] [Revised: 05/26/2022] [Accepted: 06/20/2022] [Indexed: 01/07/2023]
Abstract
In recent years, metabarcoding has become the method of choice for investigating the composition and assembly of microbial eukaryotic communities. The number of environmental data sets published has increased very rapidly. Although unprocessed sequence files are often publicly available, processed data, in particular clustered sequences, are rarely available in a usable format. Clustered sequences are reported as operational taxonomic units (OTUs) with different similarity levels or more recently as amplicon sequence variants (ASVs). This hampers comparative studies between different environments and data sets, for example examining the biogeographical patterns of specific groups/species, as well as analysing the genetic microdiversity within these groups. Here, we present a newly assembled database of processed 18S rRNA metabarcodes that are annotated with the PR2 reference sequence database. This database, called metaPR2, contains 41 data sets corresponding to more than 4000 samples and 90,000 ASVs. The database, which is accessible through both a web-based interface (https://shiny.metapr2.org) and an R package, should prove very useful to all researchers working on protist diversity in a variety of systems.
Affiliation(s)
- Daniel Vaulot
  - UMR 7144, ECOMAP, CNRS, Sorbonne Université, Station Biologique de Roscoff, Roscoff, France
- Denise Ong
  - Asian School of the Environment, Nanyang Technological University, Singapore
- Bryan Teo
  - Asian School of the Environment, Nanyang Technological University, Singapore
- Charlie Biwer
  - Department of Organismal Biology (Systematic Biology), Uppsala University, Uppsala, Sweden
- Mahwash Jamy
  - Department of Organismal Biology (Systematic Biology), Uppsala University, Uppsala, Sweden
4
Yung CCM, Rey Redondo E, Sanchez F, Yau S, Piganeau G. Diversity and Evolution of Mamiellophyceae: Early-Diverging Phytoplanktonic Green Algae Containing Many Cosmopolitan Species. JMSE 2022; 10:240. [DOI: 10.3390/jmse10020240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The genomic revolution has bridged a gap in our knowledge about the diversity, biology and evolution of unicellular photosynthetic eukaryotes, which bear very few discriminating morphological features among species from the same genus. The high-quality genome resources available in the class Mamiellophyceae (Chlorophyta) have been paramount to estimate species diversity and screen available metagenomic data to assess the biogeography and ecological niches of different species on a global scale. Here we review the current knowledge about the diversity, ecology and evolution of the Mamiellophyceae and the large double-stranded DNA prasinoviruses infecting them, brought by the combination of genomic and metagenomic analyses, including 26 metabarcoding environmental studies, as well as the pan-oceanic GOS and the Tara Oceans expeditions.
5
De Luca D, Piredda R, Sarno D, Kooistra WHCF. Resolving cryptic species complexes in marine protists: phylogenetic haplotype networks meet global DNA metabarcoding datasets. ISME J 2021; 15:1931-1942. [PMID: 33589768 PMCID: PMC8245484 DOI: 10.1038/s41396-021-00895-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/26/2020] [Revised: 12/23/2020] [Accepted: 01/14/2021] [Indexed: 12/21/2022]
Abstract
Marine protists have traditionally been assumed to be lowly diverse and cosmopolitan. Yet, several recent studies have shown that many protist species actually consist of cryptic complexes of species whose members are often restricted to particular biogeographic regions. Nonetheless, detection of cryptic species is usually hampered by sampling coverage and application of methods (e.g. phylogenetic trees) that are not well suited to identify relatively recent divergence and ongoing gene flow. In this paper, we show how these issues can be overcome by inferring phylogenetic haplotype networks from global metabarcoding datasets. We use the Chaetoceros curvisetus (Bacillariophyta) species complex as a study case. Using two complementary metabarcoding datasets (Ocean Sampling Day and Tara Oceans), we equally resolve the cryptic complex in terms of number of inferred species. We detect new hypothetical species in both datasets. Gene flow between most species is absent, but no barcoding gap exists. Some species have restricted distribution patterns whereas others are widely distributed. Closely related taxa occupy contrasting biogeographic regions, suggesting that geographic and ecological differentiation drive speciation. In conclusion, we show the potential of the analysis of metabarcoding data with evolutionary approaches for systematic and phylogeographic studies of marine protists.
Affiliation(s)
- Daniele De Luca
  - Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Naples, Italy
  - Department of Biology, Botanical Garden of Naples, University of Naples Federico II, Naples, Italy
- Roberta Piredda
  - Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Naples, Italy
- Diana Sarno
  - Department of Research Infrastructure for Marine Biological Resources, Stazione Zoologica Anton Dohrn, Naples, Italy
- Wiebe H C F Kooistra
  - Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Naples, Italy
6
Belevich TA, Milyutina IA, Abyzova GA, Troitsky AV. The pico-sized Mamiellophyceae and a novel Bathycoccus clade from the summer plankton of Russian Arctic Seas and adjacent waters. FEMS Microbiol Ecol 2021; 97:6031321. [PMID: 33307552 DOI: 10.1093/femsec/fiaa251] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/16/2020] [Accepted: 12/09/2020] [Indexed: 12/15/2022] Open
Abstract
Global climate change and anthropogenic activity greatly impact Arctic marine biodiversity, including phytoplankton, which contribute greatly to atmospheric oxygen production. Thus, the study of microalgae is increasingly topical. Class Mamiellophyceae is an important component of phototrophic picoplankton. To gain more knowledge about Mamiellophyceae distribution and diversity, special studies were performed in such remote areas as the Russian Arctic seas. Metabarcoding of pico-sized Mamiellophyceae was undertaken by high-throughput sequencing of the V4 region of the 18S rRNA gene from samples collected in July-September 2017 in the Barents, Kara and Laptev seas, and in the adjacent waters of the Norwegian Sea. Our study is the first to show that Mamiellophyceae among the summer picoplankton of Russian Arctic seas are diverse and represented by 16 algae species/phylotypes. We discovered a new candidate species of Bathycoccus assigned to a new Bathycoccus clade A-uncultured Bathycoccus Kara 2017. It was found that several Micromonas species can co-exist, with Micromonas polaris dominating north of 72°N. The presence of Ostreococcus tauri, Ostreococcus lucimarinus and Ostreococcus mediterraneus at high latitudes beyond 65°N was documented for the first time, similar to findings for some other taxa. Our results will be important for obtaining a global view of Mamiellophyceae community dynamics.
Affiliation(s)
- Tatiana A Belevich
  - Lomonosov Moscow State University, Biological Faculty, Moscow, Russia
  - Lomonosov Moscow State University, Belozersky Institute of Physico-Chemical Biology, Moscow, Russia
- Irina A Milyutina
  - Lomonosov Moscow State University, Belozersky Institute of Physico-Chemical Biology, Moscow, Russia
- Galina A Abyzova
  - Shirshov Institute of Oceanology, Russian Academy of Science, Moscow, Russia
- Aleksey V Troitsky
  - Lomonosov Moscow State University, Belozersky Institute of Physico-Chemical Biology, Moscow, Russia