1
Phillips JD, Athey TB, McNicholas PD, Hanner RH. VLF: An R package for the analysis of very low frequency variants in DNA sequences. Biodivers Data J 2023; 11:e96480. [PMID: 38327328 PMCID: PMC10848336 DOI: 10.3897/bdj.11.e96480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 11/30/2022] [Indexed: 01/27/2023]
Abstract
Here, we introduce VLF, an R package to determine the distribution of very low frequency variants (VLFs) in nucleotide and amino acid sequences for the analysis of errors in DNA sequence records. The package allows users to assess VLFs in aligned and trimmed protein-coding sequences by automatically calculating the frequency of nucleotides or amino acids in each sequence position and outputting those that occur under a user-specified frequency (default of p = 0.001). These results can then be used to explore fundamental population genetic and phylogeographic patterns, mechanisms and processes at the microevolutionary level, such as nucleotide and amino acid sequence conservation. Our package extends earlier work pertaining to an implementation of VLF analysis in Microsoft Excel, which was found to be both computationally slow and error prone. We compare those results to our own herein. Results between the two implementations are found to be highly consistent for a large DNA barcode dataset of bird species. Differences in results are readily explained by both manual human error and inadequate Linnean taxonomy (specifically, species synonymy). Here, VLF is also applied to a subset of avian barcodes to assess the extent of biological artifacts at the species level for Canada goose (Branta canadensis), as well as within a large dataset of DNA barcodes for fishes of forensic and regulatory importance. The novelty of VLF and its benefit over the previous implementation include its high level of automation, speed, scalability and ease-of-use, each desirable characteristics which will be extremely valuable as more sequence data are rapidly accumulated in popular reference databases, such as BOLD and GenBank.
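The per-position frequency screen described in this abstract can be illustrated with a minimal Python sketch. It is not the VLF package's code or API: the function name `find_vlfs`, its return format, and the strict "less than" comparison are assumptions made for illustration; only the default threshold of 0.001 comes from the abstract.

```python
from collections import Counter


def find_vlfs(sequences, threshold=0.001):
    """Flag residues whose frequency at their alignment position is below threshold.

    `sequences` is a list of aligned, equal-length strings (nucleotides or
    amino acids). Returns (sequence index, position, residue, frequency)
    tuples. Hypothetical helper, not part of the VLF R package.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    total = len(sequences)
    hits = []
    for pos in range(length):
        # Count how often each residue occurs in this alignment column.
        counts = Counter(seq[pos] for seq in sequences)
        for idx, seq in enumerate(sequences):
            freq = counts[seq[pos]] / total
            if freq < threshold:
                hits.append((idx, pos, seq[pos], freq))
    return hits
```

With the default threshold, a residue can only be flagged once the alignment holds more than 1,000 sequences, which is consistent with the abstract's emphasis on large barcode datasets.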
Affiliation(s)
- Jarrett D. Phillips
  - School of Computer Science and Department of Integrative Biology, University of Guelph, Guelph, Canada
- Taryn B.T. Athey
  - Stollery Children's Hospital, Edmonton, Canada
- Paul D. McNicholas
  - Department of Mathematics and Statistics, McMaster University, Hamilton, Canada
- Robert H. Hanner
  - Biodiversity Institute of Ontario and Department of Integrative Biology, University of Guelph, Guelph, Canada

2
Nugent CM, Elliott TA, Ratnasingham S, Adamowicz SJ. coil: an R package for cytochrome c oxidase I (COI) DNA barcode data cleaning, translation, and error evaluation. Genome 2020; 63:291-305. [DOI: 10.1139/gen-2019-0206] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/22/2022]
Abstract
Biological conclusions based on DNA barcoding and metabarcoding analyses can be strongly influenced by the methods utilized for data generation and curation, leading to varying levels of success in the separation of biological variation from experimental error. The 5′ region of cytochrome c oxidase subunit I (COI-5P) is the most common barcode gene for animals, with conserved structure and function that allows for biologically informed error identification. Here, we present coil ( https://CRAN.R-project.org/package=coil ), an R package for the pre-processing and frameshift error assessment of COI-5P animal barcode and metabarcode sequence data. The package contains functions for placement of barcodes into a common reading frame, accurate translation of sequences to amino acids, and highlighting insertion and deletion errors. The analysis of 10 000 barcode sequences of varying quality demonstrated how coil can place barcode sequences in reading frame and distinguish sequences containing indel errors from error-free sequences with greater than 97.5% accuracy. Package limitations were tested through the analysis of COI-5P sequences from the plant and fungal kingdoms as well as the analysis of potential contaminants: nuclear mitochondrial pseudogenes and Wolbachia COI-5P sequences. Results demonstrated that coil is a strong technical error identification method but is not reliable for detecting all biological contaminants.
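To make the idea of placing a barcode in reading frame and translating it concrete, here is a minimal Python sketch. It is a simplified stand-in, not coil's method or API: it assumes the vertebrate mitochondrial genetic code (NCBI translation table 2) and simply reports which of the three forward frames contain no stop codon. The names `translate` and `stop_free_frames` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... (first base varies slowest, third fastest).
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate `seq` from offset `frame` (0, 1 or 2); unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def stop_free_frames(seq):
    """Return the forward frames whose translation contains no stop codon."""
    return [frame for frame in range(3) if "*" not in translate(seq, frame)]
```

A sequence for which no frame is stop-free is a candidate for an insertion or deletion error, although a check this crude cannot say where the error lies.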
Affiliation(s)
- Cameron M. Nugent
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada
  - Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Tyler A. Elliott
  - Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Sujeevan Ratnasingham
  - Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Sarah J. Adamowicz
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada

3
Geographically well-distributed citizen science data reveals range-wide variation in the chipping sparrow's simple song. Anim Behav 2020. [DOI: 10.1016/j.anbehav.2019.12.012] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/23/2022]

4
Phillips JD, Gillis DJ, Hanner RH. Incomplete estimates of genetic diversity within species: Implications for DNA barcoding. Ecol Evol 2019; 9:2996-3010. [PMID: 30891232 PMCID: PMC6406011 DOI: 10.1002/ece3.4757] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 05/17/2018] [Revised: 09/03/2018] [Accepted: 10/12/2018] [Indexed: 02/01/2023]
Abstract
DNA barcoding has greatly accelerated the pace of specimen identification to the species level, as well as species delineation. Whereas the application of DNA barcoding to the matching of unknown specimens to known species is straightforward, its use for species delimitation is more controversial, as species discovery hinges critically on present levels of haplotype diversity, as well as patterning of standing genetic variation that exists within and between species. Typical sample sizes for molecular biodiversity assessment using DNA barcodes range from 5 to 10 individuals per species. However, required levels that are necessary to fully gauge haplotype variation at the species level are presumed to be strongly taxon-specific. Importantly, little attention has been paid to determining appropriate specimen sample sizes that are necessary to reveal the majority of intraspecific haplotype variation within any one species. In this paper, we present a brief outline of the current literature and methods on intraspecific sample size estimation for the assessment of COI DNA barcode haplotype sampling completeness. The importance of adequate sample sizes for studies of molecular biodiversity is stressed, with application to a variety of metazoan taxa, through reviewing foundational statistical and population genetic models, with specific application to ray-finned fishes (Chordata: Actinopterygii). Finally, promising avenues for further research in this area are highlighted.
Affiliation(s)
- Jarrett D. Phillips
  - School of Computer Science, University of Guelph, Guelph, Ontario, Canada
  - Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Daniel J. Gillis
  - School of Computer Science, University of Guelph, Guelph, Ontario, Canada
- Robert H. Hanner
  - Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada

5
Stoeckle MY, Das Mishu M, Charlop-Powers Z. GoFish: A versatile nested PCR strategy for environmental DNA assays for marine vertebrates. PLoS One 2018; 13:e0198717. [PMID: 30533051 PMCID: PMC6289459 DOI: 10.1371/journal.pone.0198717] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 05/23/2018] [Accepted: 11/21/2018] [Indexed: 12/30/2022]
Abstract
Here we describe GoFish, a strategy for single-species environmental DNA (eDNA) presence/absence assays using nested PCR. The assays amplify a mitochondrial 12S rDNA segment with vertebrate metabarcoding primers, followed by nested PCR with M13-tailed, species-specific primers. Sanger sequencing confirms positives detected by gel electrophoresis. We first obtained 12S sequences from 77 fish specimens for 36 northwestern Atlantic taxa not well documented in GenBank. Using these and existing 12S records, we designed GoFish assays for 11 bony fish species common in the lower Hudson River estuary and tested seasonal abundance and habitat preference at two sites. Additional assays detected nine cartilaginous fish species and a marine mammal, bottlenose dolphin, in southern New York Bight. GoFish sensitivity was equivalent to Illumina MiSeq metabarcoding. Unlike quantitative PCR (qPCR), GoFish does not require tissues of target and related species for assay development and a basic thermal cycler is sufficient. Unlike Illumina metabarcoding, indexing and batching samples are unnecessary and advanced bioinformatics expertise is not needed. From water collection to Sanger sequencing results, the assay can be carried out in three days. The main limitations to this approach, which employs metabarcoding primers, are the same as for metabarcoding, namely, inability to distinguish species with shared target sequences and inconsistent amplification of rarer eDNA. In addition, the performance of the 20 assays reported here as compared to other single-species eDNA assays is not known. This approach will be a useful addition to current eDNA methods when analyzing presence/absence of known species, when turnaround time is important, and in educational settings.
Affiliation(s)
- Mark Y. Stoeckle
  - Program for the Human Environment, The Rockefeller University, New York, New York, United States of America
- Zachary Charlop-Powers
  - Laboratory of Genetically Encoded Small Molecules, The Rockefeller University, New York, New York, United States of America

6
Marizzi C, Florio A, Lee M, Khalfan M, Ghiban C, Nash B, Dorey J, McKenzie S, Mazza C, Cellini F, Baria C, Bepat R, Cosentino L, Dvorak A, Gacevic A, Guzman-Moumtzis C, Heller F, Holt NA, Horenstein J, Joralemon V, Kaur M, Kaur T, Khan A, Kuppan J, Laverty S, Lock C, Pena M, Petrychyn I, Puthenkalam I, Ram D, Ramos A, Scoca N, Sin R, Gonzalez I, Thakur A, Usmanov H, Han K, Wu A, Zhu T, Micklos DA. DNA barcoding Brooklyn (New York): A first assessment of biodiversity in Marine Park by citizen scientists. PLoS One 2018; 13:e0199015. [PMID: 30020927 PMCID: PMC6051577 DOI: 10.1371/journal.pone.0199015] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/21/2017] [Accepted: 05/30/2018] [Indexed: 11/18/2022]
Abstract
DNA barcoding is both an important research and science education tool. The technique allows for quick and accurate species identification using only minimal amounts of tissue samples taken from any organism at any developmental phase. DNA barcoding has many practical applications including furthering the study of taxonomy and monitoring biodiversity. In addition to these uses, DNA barcoding is a powerful tool to empower, engage, and educate students in the scientific method while conducting productive and creative research. The study presented here provides the first assessment of Marine Park (Brooklyn, New York, USA) biodiversity using DNA barcoding. New York City citizen scientists (high school students and their teachers) were trained to identify species using DNA barcoding during a two-week long institute. By performing NCBI GenBank BLAST searches, students taxonomically identified 187 samples (1 fungus, 70 animals and 116 plants) and also published 12 novel DNA barcodes on GenBank. Students also identified 7 ant species and demonstrated the potential of DNA barcoding for identification of this especially diverse group when coupled with traditional taxonomy using morphology. Here we outline how DNA barcoding allows citizen scientists to make preliminary taxonomic identifications and contribute to modern biodiversity research.
Affiliation(s)
- Christine Marizzi
  - DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Antonia Florio
  - Department of Herpetology, American Museum of Natural History, New York, New York, United States of America
- Melissa Lee
  - DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Mohammed Khalfan
  - New York University, New York, New York, United States of America
- Cornel Ghiban
  - DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Bruce Nash
  - DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Jenna Dorey
  - DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
  - The New York Botanical Garden, Bronx, New York, United States of America
- Sean McKenzie
  - The Rockefeller University, New York, New York, United States of America
- Christine Mazza
  - Genovesi Environmental Study Center, New York City Department of Education, Brooklyn, New York, United States of America
- Fabiana Cellini
  - Genovesi Environmental Study Center, New York City Department of Education, Brooklyn, New York, United States of America
- Carlo Baria
  - CSI for International Studies, New York City Department of Education, Staten Island, New York, United States of America
- Ron Bepat
  - High School for Construction Trades, Engineering and Architecture, New York City Department of Education, Queens, New York, United States of America
- Lena Cosentino
  - CSI for International Studies, New York City Department of Education, Staten Island, New York, United States of America
- Alexander Dvorak
  - International High School at Union Square, New York City Department of Education, New York, New York, United States of America
- Amina Gacevic
  - High School for Health Professions and Human Services, New York City Department of Education, New York, New York, United States of America
- Cristina Guzman-Moumtzis
  - Frank McCourt High School, New York City Department of Education, New York, New York, United States of America
- Francesca Heller
  - Franklin D. Roosevelt High School, New York City Department of Education, Brooklyn, New York, United States of America
- Nicholas Alexander Holt
  - High School for Construction Trades, Engineering and Architecture, New York City Department of Education, Queens, New York, United States of America
- Jeffrey Horenstein
  - Stuyvesant High School, New York City Department of Education, New York, New York, United States of America
- Vincent Joralemon
  - Frank McCourt High School, New York City Department of Education, New York, New York, United States of America
- Manveer Kaur
  - High School for Health Professions and Human Services, New York City Department of Education, New York, New York, United States of America
- Tanveer Kaur
  - High School for Health Professions and Human Services, New York City Department of Education, New York, New York, United States of America
- Armani Khan
  - High School for Construction Trades, Engineering and Architecture, New York City Department of Education, Queens, New York, United States of America
- Jessica Kuppan
  - High School for Construction Trades, Engineering and Architecture, New York City Department of Education, Queens, New York, United States of America
- Scott Laverty
  - CSI for International Studies, New York City Department of Education, Staten Island, New York, United States of America
- Camila Lock
  - Forest Hills High School, New York City Department of Education, Queens, New York, United States of America
- Marianne Pena
  - High School for Health Professions and Human Services, New York City Department of Education, New York, New York, United States of America
- Ilona Petrychyn
  - Forest Hills High School, New York City Department of Education, Queens, New York, United States of America
- Indu Puthenkalam
  - Forest Hills High School, New York City Department of Education, Queens, New York, United States of America
- Daval Ram
  - High School for Construction Trades, Engineering and Architecture, New York City Department of Education, Queens, New York, United States of America
- Arlene Ramos
  - High School for Health Professions and Human Services, New York City Department of Education, New York, New York, United States of America
- Noelle Scoca
  - Brooklyn International High School, New York City Department of Education, Brooklyn, New York, United States of America
- Rachel Sin
  - Franklin D. Roosevelt High School, New York City Department of Education, Brooklyn, New York, United States of America
- Izabel Gonzalez
  - High School for Health Professions and Human Services, New York City Department of Education, New York, New York, United States of America
- Akansha Thakur
  - Forest Hills High School, New York City Department of Education, Queens, New York, United States of America
- Husan Usmanov
  - Franklin D. Roosevelt High School, New York City Department of Education, Brooklyn, New York, United States of America
- Karen Han
  - High School for Construction Trades, Engineering and Architecture, New York City Department of Education, Queens, New York, United States of America
- Andy Wu
  - Franklin D. Roosevelt High School, New York City Department of Education, Brooklyn, New York, United States of America
- Tiger Zhu
  - Stuyvesant High School, New York City Department of Education, New York, New York, United States of America
- David Andrew Micklos
  - DNA Learning Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America

7
Machado VN, Collins RA, Ota RP, Andrade MC, Farias IP, Hrbek T. One thousand DNA barcodes of piranhas and pacus reveal geographic structure and unrecognised diversity in the Amazon. Sci Rep 2018; 8:8387. [PMID: 29849152 PMCID: PMC5976771 DOI: 10.1038/s41598-018-26550-x] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 09/27/2017] [Accepted: 05/10/2018] [Indexed: 11/25/2022]
Abstract
Piranhas and pacus (Characiformes: Serrasalmidae) are a charismatic but understudied family of Neotropical fishes. Here, we analyse a DNA barcode dataset comprising 1,122 specimens, 69 species, 16 genera, 208 localities, and 34 major river drainages in order to make an inventory of diversity and to highlight taxa and biogeographic areas worthy of further sampling effort and conservation protection. Using four methods of species discovery, incorporating both tree- and distance-based techniques, we report between 76 and 99 species-like clusters, i.e. between 20% and 33% of a priori identified taxonomic species were represented by more than one mtDNA lineage. There was a high degree of congruence between clusters, with 60% supported by three or four methods. Pacus of the genus Myloplus exhibited the most intraspecific variation, with six of the 13 species sampled found to have multiple lineages. Conversely, piranhas of the Serrasalmus rhombeus group proved difficult to delimit with these methods due to genetic similarity and polyphyly. Overall, our results recognise substantially underestimated diversity in the serrasalmids, and emphasise the Guiana and Brazilian Shield rivers as biogeographically important areas with multiple cases of across-shield and within-shield diversifications. We additionally highlight the distinctiveness and complex phylogeographic history of rheophilic taxa in particular, and suggest multiple colonisations of these habitats by different serrasalmid lineages.
Affiliation(s)
- Valeria N Machado
  - Laboratório de Evolução e Genética Animal, Departamento de Genética, Universidade Federal do Amazonas, Av. General Rodrigo Otávio Jordão, 3000, 69077-000, Manaus, AM, Brazil
- Rupert A Collins
  - Laboratório de Evolução e Genética Animal, Departamento de Genética, Universidade Federal do Amazonas, Av. General Rodrigo Otávio Jordão, 3000, 69077-000, Manaus, AM, Brazil
  - School of Biological Sciences, University of Bristol, Life Sciences Building, 24 Tyndall Avenue, Bristol, BS8 1TQ, UK
- Rafaela P Ota
  - Programa de Pós-Graduação em Biologia de Água Doce e Pesca Interior, Instituto Nacional de Pesquisas da Amazônia, Av. André Araújo, 2936, CP 2223, Petrópolis, 69080-971, Manaus, AM, Brazil
- Marcelo C Andrade
  - Programa de Pós-Graduação em Ecologia Aquática e Pesca, Instituto de Ciências Biológicas, Universidade Federal do Pará, Av. Perimetral, 2651, Terra Firme, 66040-830, Belém, PA, Brazil
- Izeni P Farias
  - Laboratório de Evolução e Genética Animal, Departamento de Genética, Universidade Federal do Amazonas, Av. General Rodrigo Otávio Jordão, 3000, 69077-000, Manaus, AM, Brazil
- Tomas Hrbek
  - Laboratório de Evolução e Genética Animal, Departamento de Genética, Universidade Federal do Amazonas, Av. General Rodrigo Otávio Jordão, 3000, 69077-000, Manaus, AM, Brazil

8
Chakraborty M, Dhar B, Ghosh SK. Design of character-based DNA barcode motif for species identification: A computational approach and its validation in fishes. Mol Ecol Resour 2017; 17:1359-1370. [PMID: 28332322 DOI: 10.1111/1755-0998.12671] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/28/2016] [Revised: 01/26/2017] [Accepted: 03/07/2017] [Indexed: 11/29/2022]
Abstract
The DNA barcodes are generally interpreted using distance-based and character-based methods. The former uses clustering of comparable groups, based on the relative genetic distance, while the latter is based on the presence or absence of discrete nucleotide substitutions. The distance-based approach has a limitation in defining a universal species boundary across the taxa as the rate of mtDNA evolution is not constant throughout the taxa. However, character-based approach more accurately defines this using a unique set of nucleotide characters. The character-based analysis of full-length barcode has some inherent limitations, like sequencing of the full-length barcode, use of a sparse-data matrix and lack of a uniform diagnostic position for each group. A short continuous stretch of a fragment can be used to resolve the limitations. Here, we observe that a 154-bp fragment, from the transversion-rich domain of 1367 COI barcode sequences can successfully delimit species in the three most diverse orders of freshwater fishes. This fragment is used to design species-specific barcode motifs for 109 species by the character-based method, which successfully identifies the correct species using a pattern-matching program. The motifs also correctly identify geographically isolated population of the Cypriniformes species. Further, this region is validated as a species-specific mini-barcode for freshwater fishes by successful PCR amplification and sequencing of the motif (154 bp) using the designed primers. We anticipate that use of such motifs will enhance the diagnostic power of DNA barcode, and the mini-barcode approach will greatly benefit the field-based system of rapid species identification.
Affiliation(s)
- Mohua Chakraborty
  - Department of Biotechnology, Assam University, Silchar, Assam, India
- Bishal Dhar
  - Department of Biotechnology, Assam University, Silchar, Assam, India
- Sankar Kumar Ghosh
  - Department of Biotechnology, Assam University, Silchar, Assam, India
  - University of Kalyani, Kalyani, West Bengal, India

9
Bourque DA, Naaum AM, Distel DL, Hanner RH. Whole Genome Amplification Provides Suitable Control DNA for Use in DNA Barcoding Applications. Biopreserv Biobank 2017; 15:277-279. [PMID: 28080142 DOI: 10.1089/bio.2016.0078] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Affiliation(s)
- Danielle A Bourque
  - Department of Integrative Biology, Centre for Biodiversity Genomics & Biodiversity Institute of Ontario, The University of Guelph, Ontario, Canada
- Amanda M Naaum
  - Department of Integrative Biology, Centre for Biodiversity Genomics & Biodiversity Institute of Ontario, The University of Guelph, Ontario, Canada
- Dan L Distel
  - Ocean Genome Legacy Centre of New England Biolabs, Marine Science Centre, Northeastern University, Nahant, Massachusetts
- Robert H Hanner
  - Department of Integrative Biology, Centre for Biodiversity Genomics & Biodiversity Institute of Ontario, The University of Guelph, Ontario, Canada

10
Barreira AS, Lijtmaer DA, Tubaro PL. The multiple applications of DNA barcodes in avian evolutionary studies. Genome 2016; 59:899-911. [DOI: 10.1139/gen-2016-0086] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
DNA barcodes of birds are currently available for 41% of known species and for many different geographic areas; therefore, they are a rich data source to answer evolutionary questions. We review studies that have used DNA barcodes to investigate evolutionary processes in birds using diverse approaches. We also review studies that have investigated species in depth where taxonomy and DNA barcodes present inconsistencies. Species that showed low genetic interspecific divergence and lack of reciprocal monophyly either are the result of recent radiation and (or) hybridize, while species with large genetic splits in their COI sequences were determined to be more than one independent evolutionary unit. In addition, we review studies that employed large DNA barcode datasets to study the molecular evolution of mitochondrial genes and the biogeography of islands, continents, and even at a multi-continental scale. These studies showed that DNA barcodes offer high-quality data well beyond their main purpose of serving as a molecular tool for species identification.
Affiliation(s)
- Ana S. Barreira
  - División Ornitología, Museo Argentino de Ciencias Naturales “Bernardino Rivadavia” - CONICET, Avda. Ángel Gallardo 470, Ciudad Autónoma de Buenos Aires, C1405DJR, Argentina
- Darío A. Lijtmaer
  - División Ornitología, Museo Argentino de Ciencias Naturales “Bernardino Rivadavia” - CONICET, Avda. Ángel Gallardo 470, Ciudad Autónoma de Buenos Aires, C1405DJR, Argentina
- Pablo L. Tubaro
  - División Ornitología, Museo Argentino de Ciencias Naturales “Bernardino Rivadavia” - CONICET, Avda. Ángel Gallardo 470, Ciudad Autónoma de Buenos Aires, C1405DJR, Argentina

11
Collins RA, Britz R, Rüber L. Phylogenetic systematics of leaffishes (Teleostei: Polycentridae, Nandidae). J Zool Syst Evol Res 2015. [DOI: 10.1111/jzs.12103] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 11/29/2022]
Affiliation(s)
- Rupert A. Collins
  - Laboratório de Evolução e Genética Animal, Departamento de Biologia, Universidade Federal do Amazonas, Manaus, Amazonas, Brasil
- Ralf Britz
  - Vertebrates Division, Department of Life Sciences, Natural History Museum, London, UK
- Lukas Rüber
  - Naturhistorisches Museum der Burgergemeinde Bern, Bern, Switzerland

12
DNA barcoding works in practice but not in (neutral) theory. PLoS One 2014; 9:e100755. [PMID: 24988408 PMCID: PMC4079456 DOI: 10.1371/journal.pone.0100755] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 02/21/2014] [Accepted: 05/30/2014] [Indexed: 11/19/2022]
Abstract
Background: DNA barcode differences within animal species are usually much less than differences among species, making it generally straightforward to match unknowns to a reference library. Here we aim to better understand the evolutionary mechanisms underlying this usual “barcode gap” pattern. We employ avian barcode libraries to test a central prediction of neutral theory, namely, intraspecific variation equals 2Nµ, where N is population size and µ is mutations per site per generation. Birds are uniquely suited for this task: they have the best-known species limits, are well represented in barcode libraries, and, most critically, are the only large group with documented census population sizes. In addition, we ask if mitochondrial molecular clock measurements conform to the neutral theory prediction that clock rate equals µ.
Results: Intraspecific COI barcode variation was uniformly low regardless of census population size (n = 142 species in 15 families). Apparent outliers reflected lumping of reproductively isolated populations or hybrid lineages. Re-analysis of a published survey of cytochrome b variation in diverse birds (n = 93 species in 39 families) further confirmed uniformly low intraspecific variation. Hybridization/gene flow among species/populations was the main limitation to DNA barcode identification.
Conclusions/Significance: To our knowledge, this is the first large study of animal mitochondrial diversity using actual census population sizes and the first to test outliers for population structure. Our finding of universally low intraspecific variation contradicts a central prediction of neutral theory and is not readily accounted for by commonly proposed ad hoc modifications. We argue that the weight of evidence (low intraspecific variation and the molecular clock) indicates neutral evolution plays a minor role in mitochondrial sequence evolution. As an alternate paradigm consistent with empirical data, we propose that extreme purifying selection, including at synonymous sites, limits variation within species and that continuous adaptive selection drives the molecular clock.
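The neutral prediction tested in this abstract, intraspecific variation equals 2Nµ, scales linearly with population size, which is why uniformly low variation across very different census sizes is notable. A minimal Python sketch of the arithmetic follows; the function name and the value of µ are hypothetical and chosen only for illustration, not taken from the study.

```python
def neutral_diversity(population_size, mutation_rate):
    """Expected intraspecific variation under the neutral prediction 2 * N * mu."""
    if population_size < 0 or mutation_rate < 0:
        raise ValueError("population size and mutation rate must be non-negative")
    return 2 * population_size * mutation_rate


# Hypothetical mutation rate per site per generation, for illustration only.
MU = 1e-8

for n in (10**4, 10**6, 10**8):
    print(f"N = {n:>9}: expected variation = {neutral_diversity(n, MU):.4g}")
```

Under this formula a 100-fold difference in N implies a 100-fold difference in expected variation, which is the pattern the authors report not finding in birds.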
|
13
|
Chakraborty M, Ghosh SK. Unraveling the sequence information in COI barcode to achieve higher taxon assignment based on Indian freshwater fishes. ACTA ACUST UNITED AC 2014; 26:175-7. [PMID: 24409929 DOI: 10.3109/19401736.2013.855923] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]
Abstract
Efficacy of cytochrome c oxidase subunit I (COI) DNA barcode in higher taxon assignment is still under debate in spite of several attempts, using the conventional DNA barcoding methods, to assign higher taxa. Here we try to understand whether nucleotide and amino acid sequence in COI gene carry sufficient information to assign species to their higher taxonomic rank, using 160 species of Indian freshwater fishes. Our results reveal that with increase in the taxonomic rank, sequence conservation decreases for both nucleotides and amino acids. Order level exhibits lowest conservation with 50% of the nucleotides and amino acids being conserved. Among the variable sites, 30-50% were found to carry high information content within an order, while it was 70-80% within a family and 80-99% within a genus. High information content shows sites with almost conserved sequence but varying at one or two locations, which can be due to variations at species or population level. Thus, the potential of COI gene in higher taxon assignment is revealed with validation of ample inherent signals latent in the gene.
Affiliation(s)
- Mohua Chakraborty
- Department of Biotechnology, Assam University, Silchar, Assam, India
|
14
|
Fietz K, Graves JA, Olsen MT. Control control control: a reassessment and comparison of GenBank and chromatogram mtDNA sequence variation in Baltic grey seals (Halichoerus grypus). PLoS One 2013; 8:e72853. [PMID: 23977362 PMCID: PMC3745392 DOI: 10.1371/journal.pone.0072853] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/08/2013] [Accepted: 07/06/2013] [Indexed: 11/18/2022] Open
Abstract
Genetic data can provide a powerful tool for those interested in the biology, management and conservation of wildlife, but also lead to erroneous conclusions if appropriate controls are not taken at all steps of the analytical process. This particularly applies to data deposited in public repositories such as GenBank, whose utility relies heavily on the assumption of high data quality. Here we report on an in-depth reassessment and comparison of GenBank and chromatogram mtDNA sequence data generated in a previous study of Baltic grey seals. By re-editing the original chromatogram data we found that approximately 40% of the grey seal mtDNA haplotype sequences posted in GenBank contained errors. The re-analysis of the edited chromatogram data yielded overall similar results and conclusions as the original study. However, a significantly different outcome was observed when using the uncorrected dataset based on the GenBank haplotypes. We therefore suggest disregarding the existing GenBank data and instead using the correct haplotypes reported here. Our study serves as an illustrative example reiterating the importance of quality control through every step of a research project, from data generation to interpretation and submission to an online repository. Errors conducted in any step may lead to biased results and conclusions, and could impact management decisions.
Affiliation(s)
- Katharina Fietz
- Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen, Denmark
- Jeff A. Graves
- School of Biology, University of St Andrews, St Andrews, Scotland, United Kingdom
- Morten Tange Olsen
- Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen, Denmark
- Department of Bioscience, Aarhus University, Roskilde, Denmark
|
15
|
Stoeckle MY, Coffran C. TreeParser-aided Klee diagrams display taxonomic clusters in DNA barcode and nuclear gene datasets. Sci Rep 2013; 3:2635. [PMID: 24022383 PMCID: PMC3769653 DOI: 10.1038/srep02635] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/14/2013] [Accepted: 08/23/2013] [Indexed: 01/08/2023] Open
Abstract
Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood.
Affiliation(s)
- Mark Y. Stoeckle
- Program for the Human Environment, The Rockefeller University, New York, NY 10065
- Cameron Coffran
- Program for the Human Environment, The Rockefeller University, New York, NY 10065
|