1. Carugati L, Melis R, Cariani A, Cau A, Crobe V, Ferrari A, Follesa MC, Geraci ML, Iglésias SP, Pesci P, Tinti F, Cannas R. Combined COI barcode‐based methods to avoid mislabelling of threatened species of deep‐sea skates. Anim Conserv 2021. [DOI: 10.1111/acv.12716] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Affiliation(s)
- L. Carugati: Department of Life and Environmental Sciences, University of Cagliari, Cagliari, Italy
- R. Melis: Department of Life and Environmental Sciences, University of Cagliari, Cagliari, Italy
- A. Cariani: Department of Biological, Geological and Environmental Sciences (BiGeA), University of Bologna, Bologna, Italy
- A. Cau: Department of Life and Environmental Sciences, University of Cagliari, Cagliari, Italy
- V. Crobe: Department of Biological, Geological and Environmental Sciences (BiGeA), University of Bologna, Bologna, Italy
- A. Ferrari: Department of Biological, Geological and Environmental Sciences (BiGeA), University of Bologna, Bologna, Italy
- M. C. Follesa: Department of Life and Environmental Sciences, University of Cagliari, Cagliari, Italy
- M. L. Geraci: Department of Biological, Geological and Environmental Sciences (BiGeA) – Marine biology and fisheries laboratory, University of Bologna, Fano (PU), Italy; Institute for Biological Resources and Marine Biotechnologies (IRBIM), National Research Council (CNR), Mazara del Vallo (TP), Italy
- S. P. Iglésias: Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Station Marine de Concarneau, Concarneau, France
- P. Pesci: Department of Life and Environmental Sciences, University of Cagliari, Cagliari, Italy
- F. Tinti: Department of Biological, Geological and Environmental Sciences (BiGeA), University of Bologna, Bologna, Italy
- R. Cannas: Department of Life and Environmental Sciences, University of Cagliari, Cagliari, Italy
2. Delrieu‐Trottin E, Durand J, Limmon G, Sukmono T, Kadarusman, Sugeha HY, Chen W, Busson F, Borsa P, Dahruddin H, Sauri S, Fitriana Y, Zein MSA, Hocdé R, Pouyaud L, Keith P, Wowor D, Steinke D, Hanner R, Hubert N. Biodiversity inventory of the grey mullets (Actinopterygii: Mugilidae) of the Indo-Australian Archipelago through the iterative use of DNA-based species delimitation and specimen assignment methods. Evol Appl 2020; 13:1451-1467. [PMID: 32684969 PMCID: PMC7359824 DOI: 10.1111/eva.12926] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/15/2019] [Revised: 01/16/2020] [Accepted: 01/20/2020] [Indexed: 12/25/2022] [Open access]
Abstract
DNA barcoding opens new perspectives on the way we document biodiversity. Initially proposed to circumvent the limits of morphological characters to assign unknown individuals to known species, DNA barcoding has been used in a wide array of studies where collecting species identity constitutes a crucial step. The assignment of unknowns to knowns assumes that species are already well identified and delineated, making the assignment performed reliable. Here, we used DNA-based species delimitation and specimen assignment methods iteratively to tackle the inventory of the Indo-Australian Archipelago grey mullets, a notorious case of taxonomic complexity that requires DNA-based identification methods considering that traditional morphological identifications are usually not repeatable and sequence mislabeling is common in international sequence repositories. We first revisited a DNA barcode reference library available at the global scale for Mugilidae through different DNA-based species delimitation methods to produce a robust consensus scheme of species delineation. We then used this curated library to assign unknown specimens collected throughout the Indo-Australian Archipelago to known species. A second iteration of OTU delimitation and specimen assignment was then performed. We show the benefits of using species delimitation and specimen assignment methods iteratively to improve the accuracy of specimen identification and propose a workflow to do so.
Affiliation(s)
- Erwan Delrieu‐Trottin: UMR 5554 ISEM (IRD, UM, CNRS, EPHE), Université de Montpellier, Montpellier Cedex, France; Museum für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
- Jean‐Dominique Durand: UMR 9190 MARBEC (IRD, UM, CNRS, IFREMER), Université de Montpellier, Montpellier Cedex, France
- Gino Limmon: Maritime and Marine Science Center of Excellence, Universitas Pattimura, Ambon, Indonesia
- Tedjo Sukmono: Department of Biology, Universitas Jambi, Jambi, Indonesia
- Kadarusman: Politeknik Kelautan dan Perikanan Sorong, Kota Sorong, Indonesia
- Hagi Yulia Sugeha: Research Center for Oceanography, Indonesian Institute of Sciences, Jakarta, Indonesia
- Wei‐Jen Chen: Institute of Oceanography, National Taiwan University, Taipei, Taiwan
- Frédéric Busson: UMR 5554 ISEM (IRD, UM, CNRS, EPHE), Université de Montpellier, Montpellier Cedex, France; UMR 7208 BOREA (MNHN, CNRS, UPMC, IRD, UCBN), Muséum National d’Histoire Naturelle, Paris Cedex, France
- Philippe Borsa: UMR 250 ENTROPIE (IRD, UR, UNC, CNRS, IFREMER), Centre IRD‐Occitanie, Montpellier, France
- Hadi Dahruddin: UMR 5554 ISEM (IRD, UM, CNRS, EPHE), Université de Montpellier, Montpellier Cedex, France; Division of Zoology, Research Center for Biology, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
- Sopian Sauri: Division of Zoology, Research Center for Biology, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
- Yuli Fitriana: Division of Zoology, Research Center for Biology, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
- Régis Hocdé: UMR 9190 MARBEC (IRD, UM, CNRS, IFREMER), Université de Montpellier, Montpellier Cedex, France
- Laurent Pouyaud: UMR 5554 ISEM (IRD, UM, CNRS, EPHE), Université de Montpellier, Montpellier Cedex, France
- Philippe Keith: UMR 7208 BOREA (MNHN, CNRS, UPMC, IRD, UCBN), Muséum National d’Histoire Naturelle, Paris Cedex, France
- Daisy Wowor: Division of Zoology, Research Center for Biology, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
- Dirk Steinke: Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada; Department of Integrative Biology, University of Guelph, Guelph, ON, Canada
- Robert Hanner: Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada; Department of Integrative Biology, University of Guelph, Guelph, ON, Canada
- Nicolas Hubert: UMR 5554 ISEM (IRD, UM, CNRS, EPHE), Université de Montpellier, Montpellier Cedex, France
3. Li B, Li T, Jiang Q, Huang H, Zhang Z, Wei Y, Sun B, Jia X, Li B, Yin Y. Prediction of Cleaning Loss of Combine Harvester Based on Neural Network. Int J Pattern Recogn 2019. [DOI: 10.1142/s0218001420590211] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
This paper explores the performance and obtains a reasonable cleaning effect of the cleaning system of combine harvester and studies the relationship between the cleaning effect of the combine harvester cleaning system and its influencing factors. We established a neural network model between the cleaning loss rate and the clean system parameters. First, we tested the results of the cleaning performance of each group under different combinations of conditions, and analyzed the direct or indirect relationship between the cleaning loss rate and the parameters in the experiment under each working condition. Then, according to the experimental data obtained in the experiment, we predict the clearance loss rate for several sets of conditions by this model. The experimental results show that the prediction results of the model can meet the experimental requirements under the condition that the accuracy is not very high.
Affiliation(s)
- Bo Li: East China Jiaotong University, Software School, Nanchang, Jiangxi 330013, P. R. China
- Tingting Li: East China Jiaotong University, Software School, Nanchang, Jiangxi 330013, P. R. China
- Qing Jiang: Institute of Intelligent Machines, CAS, Hefei 230031, P. R. China
- He Huang: Institute of Intelligent Machines, CAS, Hefei 230031, P. R. China
- Zhengyong Zhang: Institute of Intelligent Machines, CAS, Hefei 230031, P. R. China
- Yuanyuan Wei: Institute of Intelligent Machines, CAS, Hefei 230031, P. R. China
- BingYu Sun: Institute of Intelligent Machines, CAS, Hefei 230031, P. R. China
- Xiufang Jia: Institute of Intelligent Machines, CAS, Hefei 230031, P. R. China
- Bin Li: Beijing Research Center of Intelligent Equipment for Agriculture, Beijing, P. R. China
- Yanxin Yin: Beijing Research Center of Intelligent Equipment for Agriculture, Beijing, P. R. China
4. Yang CH, Wu KC, Chuang LY, Chang HW. Decision Theory-Based COI-SNP Tagging Approach for 126 Scombriformes Species Tagging. Front Genet 2019; 10:259. [PMID: 31001317 PMCID: PMC6456664 DOI: 10.3389/fgene.2019.00259] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/29/2018] [Accepted: 03/08/2019] [Indexed: 12/02/2022] [Open access]
Abstract
The mitochondrial gene cytochrome c oxidase I (COI) is commonly used for DNA barcoding in animals. However, most of the COI barcode nucleotides are conserved and sequences longer than about 650 base pairs increase the computational burden for species identification. To solve this problem, we propose a decision theory-based COI SNP tagging (DCST) approach that focuses on the discrimination of species using single nucleotide polymorphisms (SNPs) as the variable nucleotides of the sequences of a group of species. Using the example of 126 teleost mackerel fish species (order: Scombriformes), we identified 281 SNPs by alignment and trimming of their COI sequences. After decision rule making, 49 SNPs in 126 fish species were determined using the scoring system of the DCST approach. These COI-SNP barcodes were finally transformed into one-dimensional barcode images. Our proposed DCST approach simplifies the computational complexity and identifies the most effective and fewest SNPs to resolve or discriminate species for species tagging.
Affiliation(s)
- Cheng-Hong Yang: Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Biomedical Engineering, Kaohsiung Medical University, Kaohsiung, Taiwan
- Kuo-Chuan Wu: Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Computer Science and Information Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
- Li-Yeh Chuang: Department of Chemical Engineering and Institute of Biotechnology and Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
- Hsueh-Wei Chang: Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung, Taiwan; Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan; Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, Taiwan
5. Meher PK, Sahu TK, Gahoi S, Tomar R, Rao AR. funbarRF: DNA barcode-based fungal species prediction using multiclass Random Forest supervised learning model. BMC Genet 2019; 20:2. [PMID: 30616524 PMCID: PMC6323839 DOI: 10.1186/s12863-018-0710-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/13/2018] [Accepted: 12/26/2018] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND Identification of unknown fungal species aids to the conservation of fungal diversity. As many fungal species cannot be cultured, morphological identification of those species is almost impossible. But, DNA barcoding technique can be employed for identification of such species. For fungal taxonomy prediction, the ITS (internal transcribed spacer) region of rDNA (ribosomal DNA) is used as barcode. Though the computational prediction of fungal species has become feasible with the availability of huge volume of barcode sequences in public domain, prediction of fungal species is challenging due to high degree of variability among ITS regions within species. RESULTS A Random Forest (RF)-based predictor was built for identification of unknown fungal species. The reference and query sequences were mapped onto numeric features based on gapped base pair compositions, and then used as training and test sets respectively for prediction of fungal species using RF. More than 85% accuracy was found when 4 sequences per species in the reference set were utilized; whereas it was seen to be stabilized at ~88% if ≥7 sequence per species in the reference set were used for training of the model. The proposed model achieved comparable accuracy, while evaluated against existing methods through cross-validation procedure. The proposed model also outperformed several existing models used for identification of different species other than fungi. CONCLUSIONS An online prediction server "funbarRF" is established at http://cabgrid.res.in:8080/funbarrf/ for fungal species identification. Besides, an R-package funbarRF ( https://cran.r-project.org/web/packages/funbarRF/ ) is also available for prediction using high throughput sequence data. The effort put in this work will certainly supplement the future endeavors in the direction of fungal taxonomy assignments based on DNA barcode.
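The feature-mapping step mentioned in this abstract ("gapped base pair compositions") can be illustrated with a minimal Python sketch. The exact feature definition used by funbarRF is not given here, so the function below is an assumption: it counts, for each gap size, how often each of the 16 ordered base pairs occurs with that many positions between the two bases, and normalizes the counts. The names `gapped_pair_features` and `max_gap` are hypothetical.

```python
from itertools import product

BASES = "ACGT"
# The 16 ordered base pairs: AA, AC, AG, AT, CA, ...
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]


def gapped_pair_features(seq, max_gap=2):
    """Map a DNA sequence onto a fixed-length numeric vector.

    For each gap size g in 0..max_gap, compute the relative frequency of
    every ordered base pair (x, y) where y lies g + 1 positions after x.
    This is an illustrative definition, not necessarily the one in funbarRF.
    """
    seq = seq.upper()
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:  # skip pairs containing ambiguous bases such as N
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features
```

Because the vector length depends only on `max_gap` (16 values per gap size), sequences of different lengths map onto comparable rows without alignment; such rows could then be passed to any multiclass classifier, for example a random forest.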
Affiliation(s)
- Prabina Kumar Meher: Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012 India
- Tanmaya Kumar Sahu: Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012 India
- Shachi Gahoi: Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012 India
- Ruchi Tomar: Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012 India; Department of Bioinformatics, Janta Vedic College, Baraut, Baghpat, Uttar Pradesh 250611 India
- Atmakuri Ramakrishna Rao: Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012 India
6. Zhang A, Hao M, Yang C, Shi Z. BarcodingR: an integrated R package for species identification using DNA barcodes. Methods Ecol Evol 2016. [DOI: 10.1111/2041-210x.12682] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 12/01/2022]
Affiliation(s)
- Ai‐bing Zhang: College of Life Sciences, Capital Normal University, Beijing 100048, China
- Meng‐di Hao: College of Life Sciences, Capital Normal University, Beijing 100048, China
- Cai‐qing Yang: College of Life Sciences, Capital Normal University, Beijing 100048, China
- Zhi‐yong Shi: College of Life Sciences, Capital Normal University, Beijing 100048, China
7. Liu XF, Yang CH, Han HL, Ward RD, Zhang AB. Identifying species of moths (Lepidoptera) from Baihua Mountain, Beijing, China, using DNA barcodes. Ecol Evol 2014; 4:2472-87. [PMID: 25360280 PMCID: PMC4203292 DOI: 10.1002/ece3.1110] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 11/02/2013] [Revised: 04/19/2014] [Accepted: 04/23/2014] [Indexed: 12/28/2022] [Open access]
Abstract
DNA barcoding has become a promising means for the identification of organisms of all life-history stages. Currently, distance-based and tree-based methods are most widely used to define species boundaries and uncover cryptic species. However, there is no universal threshold of genetic distance values that can be used to distinguish taxonomic groups. Alternatively, DNA barcoding can deploy a "character-based" method, whereby species are identified through the discrete nucleotide substitutions. Our research focuses on the delimitation of moth species using DNA-barcoding methods. We analyzed 393 Lepidopteran specimens belonging to 80 morphologically recognized species with a standard cytochrome c oxidase subunit I (COI) sequencing approach, and deployed tree-based, distance-based, and diagnostic character-based methods to identify the taxa. The tree-based method divided the 393 specimens into 79 taxa (species), and the distance-based method divided them into 84 taxa (species). Although the diagnostic character-based method found only 39 so-identifiable species in the 80 species, with a reduction in sample size the accuracy rate substantially improved. For example, in the Arctiidae subset, all 12 species had diagnostics characteristics. Compared with traditional morphological method, molecular taxonomy performed well. All three methods enable the rapid delimitation of species, although they have different characteristics and different strengths. The tree-based and distance-based methods can be used for accurate species identification and biodiversity studies in large data sets, while the character-based method performs well in small data sets and can also be used as the foundation of species-specific biochips.
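The "character-based" idea in this abstract, identifying species through discrete nucleotide states, can be sketched in Python as follows. This is a simplified illustration, not the procedure used in the paper: it treats a site as diagnostic for a species when every specimen of that species carries the same base there and no other species shows that base at the same position. The function name `diagnostic_sites` and the toy alignment are hypothetical.

```python
def diagnostic_sites(alignment):
    """Find simple diagnostic characters in an aligned barcode set.

    alignment: dict mapping species name -> list of aligned sequences
    (all of equal length). Returns a dict mapping each species to a list
    of (position, base) pairs that are fixed within the species and absent
    from every other species at that position.
    """
    length = len(next(iter(alignment.values()))[0])
    # Bases observed in each species at each alignment column.
    states = {
        species: [{seq[i] for seq in seqs} for i in range(length)]
        for species, seqs in alignment.items()
    }
    result = {}
    for species, columns in states.items():
        hits = []
        for i, column in enumerate(columns):
            if len(column) != 1:
                continue  # variable within the species, so not fixed
            base = next(iter(column))
            if all(base not in states[other][i]
                   for other in states if other != species):
                hits.append((i, base))
        result[species] = hits
    return result


toy = {
    "sp1": ["ACGT", "ACGT"],
    "sp2": ["ACTT", "ACTT"],
    "sp3": ["GCAT", "GCCT"],
}
print(diagnostic_sites(toy))
```

A species with an empty list has no single-site diagnostic under this definition, which is one plausible reading of why the number of identifiable species depends on how many species are compared at once.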
Affiliation(s)
- Xiao F Liu: College of Life Sciences, Capital Normal University, Beijing, 100048, China
- Cong H Yang: College of Life Sciences, Capital Normal University, Beijing, 100048, China
- Hui L Han: School of Forestry, Experiment Center, Northeast Forestry University, Haerbin, 150040, China
- Robert D Ward: Wealth from Oceans Flagship, CSIRO Marine and Atmospheric Research, GPO Box 1538, Hobart, Tasmania, 7001, Australia
- Ai-bing Zhang: College of Life Sciences, Capital Normal University, Beijing, 100048, China
8. Huang J, Zhang A, Mao S, Huang Y. DNA barcoding and species boundary delimitation of selected species of Chinese Acridoidea (Orthoptera: Caelifera). PLoS One 2013; 8:e82400. [PMID: 24376533 PMCID: PMC3869712 DOI: 10.1371/journal.pone.0082400] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 06/07/2013] [Accepted: 10/22/2013] [Indexed: 11/26/2022] [Open access]
Abstract
We tested the performance of DNA barcoding in Acridoidea and attempted to solve species boundary delimitation problems in selected groups using COI barcodes. Three analysis methods were applied to reconstruct the phylogeny. K2P distances were used to assess the overlap range between intraspecific variation and interspecific divergence. "Best match (BM)", "best close match (BCM)", "all species barcodes (ASB)" and "back-propagation neural networks (BP-based method)" were utilized to test the success rate of species identification. Phylogenetic species concept and network analysis were employed to delimitate the species boundary in eight selected species groups. The results demonstrated that the COI barcode region performed better in phylogenetic reconstruction at genus and species levels than at higher-levels, but showed a little improvement in resolving the higher-level relationships when the third base data or both first and third base data were excluded. Most overlaps and incorrect identifications may be due to imperfect taxonomy, indicating the critical role of taxonomic revision in DNA barcoding study. Species boundary delimitation confirmed the presence of oversplitting in six species groups and suggested that each group should be treated as a single species.
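Two of the techniques named in this abstract, K2P distances and "best close match", can be sketched together in Python. The distance is the standard Kimura two-parameter formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The matching rule below is a simplified reading of best close match; the 0.03 threshold, the function names, and the return labels are illustrative assumptions, not values from the paper.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # ignore gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf  # sequences too divergent for the formula
    return -0.5 * math.log(term1 * math.sqrt(term2))


def best_close_match(query, references, threshold=0.03):
    """Assign a query barcode to a species, or report why it cannot be.

    references: list of (species, aligned_sequence) pairs.
    Returns the species name, "ambiguous" when equally close references
    belong to different species, or "no match" when the closest reference
    is farther away than the (illustrative) threshold.
    """
    distances = [(k2p_distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in distances)
    if best > threshold:
        return "no match"
    closest = {species for d, species in distances if d == best}
    return closest.pop() if len(closest) == 1 else "ambiguous"
```

In practice the threshold would be derived from the observed intraspecific distances in the reference library rather than fixed in advance.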
Affiliation(s)
- Jianhua Huang: College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China; College of Life Sciences, Guangxi Normal University, Guilin, People's Republic of China
- Aibing Zhang: College of Life Sciences, Capital Normal University, Beijing, People's Republic of China
- Shaoli Mao: College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China
- Yuan Huang: College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China
9. Bhargava M, Sharma A. DNA barcoding in plants: evolution and applications of in silico approaches and resources. Mol Phylogenet Evol 2013; 67:631-41. [PMID: 23500333 DOI: 10.1016/j.ympev.2013.03.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/04/2012] [Revised: 02/28/2013] [Accepted: 03/01/2013] [Indexed: 02/03/2023]
Abstract
Bioinformatics has played an important role in the analysis of DNA barcoding data. The process of DNA barcoding initially involves the available data collection from the existing databases. Many databases have been developed in recent years, e.g. MMDBD [Medicinal Materials DNA Barcode Database], BioBarcode, etc. In case of non-availability of sequences, sequencing has to be done in vitro for which a recently developed software ecoPrimers can be helpful. This is followed by multiple sequence alignment. Further, basic sequence statistics computation and phylogenetic analysis can be performed by MEGA and PHYLIP/PAUP tools respectively. Some of the recent tools for in silico and statistical analysis specifically designed for barcoding viz. CAOS (Character Based DNA Barcoding), BRONX (DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability), Spider (Analysis of species identity and evolution, particularly DNA barcoding), jMOTU and Taxonerator (Turning DNA Barcode Sequences into Annotated OTUs), OTUbase (Analysis of OTU data and taxonomic data), SAP (Statistical Assignment Package), etc. have been discussed and analysed in this review. The paper presents a comprehensive overview of the various in silico methods, tools, softwares and databases used for DNA barcoding of plants.
Affiliation(s)
- Mili Bhargava: Biotechnology Division, Central Institute of Medicinal and Aromatic Plants, Council of Scientific and Industrial Research, PO, Lucknow 226 015, India.
10. Dai QY, Gao Q, Wu CS, Chesters D, Zhu CD, Zhang AB. Phylogenetic reconstruction and DNA barcoding for closely related pine moth species (Dendrolimus) in China with multiple gene markers. PLoS One 2012; 7:e32544. [PMID: 22509245 PMCID: PMC3317921 DOI: 10.1371/journal.pone.0032544] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 08/26/2011] [Accepted: 01/27/2012] [Indexed: 11/25/2022] [Open access]
Abstract
Unlike distinct species, closely related species offer a great challenge for phylogeny reconstruction and species identification with DNA barcoding due to their often overlapping genetic variation. We tested a sibling species group of pine moth pests in China with a standard cytochrome c oxidase subunit I (COI) gene and two alternative internal transcribed spacer (ITS) genes (ITS1 and ITS2). Five different phylogenetic/DNA barcoding analysis methods (Maximum likelihood (ML)/Neighbor-joining (NJ), "best close match" (BCM), Minimum distance (MD), and BP-based method (BP)), representing commonly used methodology (tree-based and non-tree based) in the field, were applied to both single-gene and multiple-gene analyses. Our results demonstrated clear reciprocal species monophyly for three relatively distant related species, Dendrolimus superans, D. houi, D. kikuchii, as recovered by both single and multiple genes while the phylogenetic relationship of three closely related species, D. punctatus, D. tabulaeformis, D. spectabilis, could not be resolved with the traditional tree-building methods. Additionally, we find the standard COI barcode outperforms two nuclear ITS genes, whatever the methods used. On average, the COI barcode achieved a success rate of 94.10-97.40%, while ITS1 and ITS2 obtained a success rate of 64.70-81.60%, indicating ITS genes are less suitable for species identification in this case. We propose the use of an overall success rate of species identification that takes both sequencing success and assignation success into account, since species identification success rates with multiple-gene barcoding system were generally overestimated, especially by tree-based methods, where only successfully sequenced DNA sequences were used to construct a phylogenetic tree. Non-tree based methods, such as MD, BCM, and BP approaches, presented advantages over tree-based methods by reporting the overall success rates with statistical significance. 
In addition, our results indicate that the most closely related species D. punctatus, D. tabulaeformis, and D. spectabilis, may be still in the process of incomplete lineage sorting, with occasional hybridizations occurring among them.
Affiliation(s)
- Qing-Yan Dai: College of Life Sciences, Capital Normal University, Beijing, People’s Republic of China
- Qiang Gao: College of Life Sciences, Capital Normal University, Beijing, People’s Republic of China
- Chun-Sheng Wu: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People’s Republic of China
- Douglas Chesters: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People’s Republic of China
- Chao-Dong Zhu: Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People’s Republic of China
- Ai-Bing Zhang: College of Life Sciences, Capital Normal University, Beijing, People’s Republic of China
11. Zhang AB, Feng J, Ward RD, Wan P, Gao Q, Wu J, Zhao WZ. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods. PLoS One 2012; 7:e30986. [PMID: 22363527 PMCID: PMC3282726 DOI: 10.1371/journal.pone.0030986] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 07/01/2011] [Accepted: 12/29/2011] [Indexed: 11/19/2022] [Open access]
Abstract
Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.
Affiliation(s)
- Ai-bing Zhang: College of Life Sciences, Capital Normal University, Beijing, People's Republic of China.