1
Moore‐Pollard ER, Jones DS, Mandel JR. Compositae-ParaLoss-1272: A complementary sunflower-specific probe set reduces paralogs in phylogenomic analyses of complex systems. APPLICATIONS IN PLANT SCIENCES 2024; 12:e11568. [PMID: 38369976 PMCID: PMC10873820 DOI: 10.1002/aps3.11568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 10/30/2023] [Accepted: 11/12/2023] [Indexed: 02/20/2024]
Abstract
Premise: A family-specific probe set for sunflowers, Compositae-1061, enables family-wide phylogenomic studies and investigations at lower taxonomic levels, but may lack resolution at genus to species levels, especially in groups complicated by polyploidy and hybridization. Methods: We developed a Hyb-Seq probe set, Compositae-ParaLoss-1272, that targets orthologous loci in Asteraceae. We tested its efficiency across the family by simulating target enrichment sequencing in silico. Additionally, we tested its effectiveness at lower taxonomic levels in the historically complex genus Packera. We performed Hyb-Seq with Compositae-ParaLoss-1272 for 19 Packera taxa that were previously studied using Compositae-1061. The resulting sequences from each probe set, plus a combination of both, were used to generate phylogenies, compare topologies, and assess node support. Results: We report that Compositae-ParaLoss-1272 captured loci across all tested Asteraceae members, had less gene tree discordance, and retained longer loci than Compositae-1061. Most notably, Compositae-ParaLoss-1272 recovered substantially fewer paralogous sequences than Compositae-1061, with only ~5% of the recovered loci reporting as paralogous, compared to ~59% with Compositae-1061. Discussion: Given the complexity of plant evolutionary histories, assigning orthology for phylogenomic analyses will continue to be challenging. However, we anticipate Compositae-ParaLoss-1272 will provide improved resolution and utility for studies of complex groups and lower taxonomic levels in the sunflower family.
Affiliation(s)
- Erika R. Moore‐Pollard
- Department of Biological Sciences, University of Memphis, 3700 Walker Ave., Memphis, Tennessee 38152, USA
- Daniel S. Jones
- Department of Biological Sciences, Auburn University, 101 Rouse Life Sciences, Auburn, Alabama 36849, USA
- Jennifer R. Mandel
- Department of Biological Sciences, University of Memphis, 3700 Walker Ave., Memphis, Tennessee 38152, USA
2
Pezzini FF, Ferrari G, Forrest LL, Hart ML, Nishii K, Kidner CA. Target capture and genome skimming for plant diversity studies. APPLICATIONS IN PLANT SCIENCES 2023; 11:e11537. [PMID: 37601316 PMCID: PMC10439825 DOI: 10.1002/aps3.11537] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/17/2022] [Revised: 06/16/2023] [Accepted: 07/10/2023] [Indexed: 08/22/2023]
Abstract
Recent technological advances in long-read high-throughput sequencing and assembly methods have facilitated the generation of annotated chromosome-scale whole-genome sequence data for evolutionary studies; however, generating such data can still be difficult for many plant species. For example, obtaining high-molecular-weight DNA is typically impossible for samples in historical herbarium collections, which often have degraded DNA. The need to fast-freeze newly collected living samples to conserve high-quality DNA can be complicated when plants are only found in remote areas. Therefore, short-read reduced-genome representations, such as target capture and genome skimming, remain important for evolutionary studies. Here, we review the pros and cons of each technique for non-model plant taxa. We provide guidance related to logistics, budget, the genomic resources previously available for the target clade, and the nature of the study. Furthermore, we assess the available bioinformatic analyses, detailing best practices and pitfalls, and suggest pathways to combine newly generated data with legacy data. Finally, we explore the possible downstream analyses allowed by the type of data generated using each technique. We provide a practical guide to help researchers make the best-informed choice regarding reduced genome representation for evolutionary studies of non-model plants in cases where whole-genome sequencing remains impractical.
Affiliation(s)
- Giada Ferrari
- Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom
- Kanae Nishii
- Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom
- Catherine A Kidner
- Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom
- School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
3
Maurin KJL, Smissen RD, Lusk CH. A dated phylogeny shows Plio-Pleistocene climates spurred evolution of antibrowsing defences in the New Zealand flora. THE NEW PHYTOLOGIST 2022; 233:546-554. [PMID: 34610149 PMCID: PMC9298021 DOI: 10.1111/nph.17766] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/15/2021] [Accepted: 09/24/2021] [Indexed: 06/13/2023]
Abstract
Some plant traits may be legacies of coevolution with extinct megafauna. One example is the convergent evolution of 'divaricate' cage architectures in many New Zealand lineages, interpreted as a response to recently extinct flightless avian browsers whose ancestors arrived during the Paleogene period. Although experiments have confirmed that divaricate habit deters extant browsers, its abundance on frosty, droughty sites appears consistent with an earlier interpretation as a response to cold, dry Plio-Pleistocene climates. We used 45 protein-coding sequences from plastid genomes to reconstruct the evolutionary history of the divaricate habit in extant New Zealand lineages. Our dated phylogeny of 215 species included 91% of New Zealand eudicot divaricate species. We show that 86% of extant divaricate plants diverged from non-divaricate sisters within the last 5 Ma, implicating Plio-Pleistocene climates in the proliferation of cage architectures in New Zealand. Our results, combined with other recent findings, are consistent with the synthetic hypothesis that the browser-deterrent effect of cage architectures was strongly selected only when Plio-Pleistocene climatic constraints prevented woody plants from growing quickly out of reach of browsers. This is consistent with the abundance of cage architectures in other regions where plant growth is restricted by aridity or short frost-free periods.
Affiliation(s)
- Rob D. Smissen
- Allan Herbarium, Manaaki Whenua – Landcare Research, Lincoln 7640, New Zealand
- Christopher H. Lusk
- Environmental Research Institute, The University of Waikato, Hamilton 3240, New Zealand
4
Daniell H, Jin S, Zhu X, Gitzendanner MA, Soltis DE, Soltis PS. Green giant-a tiny chloroplast genome with mighty power to produce high-value proteins: history and phylogeny. PLANT BIOTECHNOLOGY JOURNAL 2021; 19:430-447. [PMID: 33484606 PMCID: PMC7955891 DOI: 10.1111/pbi.13556] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 12/14/2020] [Revised: 01/11/2021] [Accepted: 01/16/2021] [Indexed: 05/04/2023]
Abstract
Free-living cyanobacteria were entrapped by eukaryotic cells ~2 billion years ago, ultimately giving rise to chloroplasts. After a century of debate, the presence of chloroplast DNA was demonstrated in the 1960s. The first chloroplast genomes were sequenced in the 1980s, followed by ~100 vegetable, fruit, cereal, beverage, oil and starch/sugar crop chloroplast genomes in the past three decades. Foreign genes were expressed in isolated chloroplasts or intact plant cells in the late 1980s and stably integrated into chloroplast genomes, with typically maternal inheritance shown in the 1990s. Since then, chloroplast genomes conferred the highest reported levels of tolerance or resistance to biotic or abiotic stress. Although launching products with agronomic traits in important crops using this concept has been elusive, commercial products developed include enzymes used in everyday life from processing fruit juice, to enhancing water absorption of cotton fibre or removal of stains as laundry detergents and in dye removal in the textile industry. Plastid genome sequences have revealed the framework of green plant phylogeny as well as the intricate history of plastid genome transfer events to other eukaryotes. Discordant historical signals among plastid genes suggest possible variable constraints across the plastome and further understanding and mitigation of these constraints may yield new opportunities for bioengineering. In this review, we trace the evolutionary history of chloroplasts, status of autonomy and recent advances in products developed for everyday use or those advanced to the clinic, including treatment of COVID-19 patients and SARS-CoV-2 vaccine.
Affiliation(s)
- Henry Daniell
- Department of Basic and Translational Sciences, School of Dental Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Shuangxia Jin
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Xin‐Guang Zhu
- State Key Laboratory for Plant Molecular Genetics and Center of Excellence for Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Douglas E. Soltis
- Florida Museum of Natural History and Department of Biology, University of Florida, Gainesville, FL, USA
- Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
- Pamela S. Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
5
Loeuille B, Thode V, Siniscalchi C, Andrade S, Rossi M, Pirani JR. Extremely low nucleotide diversity among thirty-six new chloroplast genome sequences from Aldama (Heliantheae, Asteraceae) and comparative chloroplast genomics analyses with closely related genera. PeerJ 2021; 9:e10886. [PMID: 33665028 PMCID: PMC7912680 DOI: 10.7717/peerj.10886] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/29/2020] [Accepted: 01/12/2021] [Indexed: 01/23/2023]
Abstract
Aldama (Heliantheae, Asteraceae) is a diverse genus in the sunflower family. To date, nearly 200 Asteraceae chloroplast genomes have been sequenced, but the plastomes of Aldama remain undescribed. Plastomes in Asteraceae usually show little sequence divergence; consequently, our hypothesis is that the plastomes of Aldama species will be conserved overall. In this study, we newly sequenced 36 plastomes of Aldama and of five species belonging to other Heliantheae genera selected as outgroups (i.e., Dimerostemma asperatum, Helianthus tuberosus, Iostephane heterophylla, Pappobolus lanatus var. lanatus, and Tithonia diversifolia). We analyzed the structure and gene content of the assembled plastomes and performed comparative analyses within Aldama and with other closely related genera. As expected, Aldama plastomes are very conserved, with the overall gene content and orientation being similar in all studied species. The length of the plastome is also consistent, and the junctions between regions usually contain the same genes and have similar lengths. A large ∼20 kb and a small ∼3 kb inversion were detected in the Large Single Copy (LSC) regions of all assembled plastomes, as in other Asteraceae species. The nucleotide diversity is very low, with only 1,509 variable sites in 127,466 bp (i.e., 1.18% of the sites in the alignment of 36 Aldama plastomes, with one of the IRs removed, are variable). Only one gene, rbcL, shows signatures of positive selection. The plastomes of the selected outgroups feature a similar gene content and structure compared to Aldama and also present the two inversions in the LSC region. Deletions of different lengths were observed in the gene ycf2. Multiple SSRs were identified for the sequenced Aldama and outgroups. The phylogenetic analysis shows that Aldama is not monophyletic due to the position of the Mexican species A. dentata. All Brazilian species form a strongly supported clade. Our results provide new insights into the evolution and diversity of plastomes at the species level.
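The variable-site percentage quoted in this abstract can be checked with a few lines of arithmetic. The Python sketch below recomputes it from the two reported numbers and shows one simple way to count variable columns in an alignment. The `count_variable_sites` helper and the toy alignment are illustrative assumptions rather than the authors' pipeline, and the proportion of variable sites is a simpler statistic than nucleotide diversity in the strict sense.

```python
def count_variable_sites(alignment):
    """Count alignment columns that contain more than one distinct base.

    Gaps ("-") and ambiguous bases ("N") are ignored, which is an assumption
    made for this sketch; real pipelines may treat them differently.
    """
    return sum(
        1
        for column in zip(*alignment)
        if len(set(column) - {"-", "N"}) > 1
    )


# Figures reported in the abstract.
variable_sites = 1509
alignment_length = 127466
percent_variable = 100 * variable_sites / alignment_length
print(f"{percent_variable:.2f}% of sites are variable")

# Hypothetical toy alignment: only the last column varies.
toy_alignment = ["ACGT", "ACGA", "AC-T"]
toy_count = count_variable_sites(toy_alignment)
```

Rounded to two decimals, the computed percentage agrees with the 1.18% given in the abstract.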
Affiliation(s)
- Benoit Loeuille
- Departamento de Botânica, Universidade Federal de Pernambuco, Recife, Pernambuco, Brazil
- Verônica Thode
- Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
- Carolina Siniscalchi
- Department of Biological Sciences, Mississippi State University, Mississippi State, MS, United States of America
- Sonia Andrade
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, São Paulo, Brazil
- Magdalena Rossi
- Departamento de Botânica, Universidade de São Paulo, São Paulo, São Paulo, Brazil
- José Rubens Pirani
- Departamento de Botânica, Universidade de São Paulo, São Paulo, São Paulo, Brazil
6
Folk RA, Kates HR, LaFrance R, Soltis DE, Soltis PS, Guralnick RP. High-throughput methods for efficiently building massive phylogenies from natural history collections. APPLICATIONS IN PLANT SCIENCES 2021; 9:e11410. [PMID: 33680581 PMCID: PMC7910806 DOI: 10.1002/aps3.11410] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/23/2020] [Accepted: 12/20/2020] [Indexed: 05/10/2023]
Abstract
PREMISE Large phylogenetic data sets have often been restricted to small numbers of loci from GenBank, and a vetted sampling-to-sequencing phylogenomic protocol scaling to thousands of species is not yet available. Here, we report a high-throughput collections-based approach that empowers researchers to explore more branches of the tree of life with numerous loci. METHODS We developed an integrated Specimen-to-Laboratory Information Management System (SLIMS), connecting sampling and wet lab efforts with progress tracking at each stage. Using unique identifiers encoded in QR codes and a taxonomic database, a research team can sample herbarium specimens, efficiently record the sampling event, and capture specimen images. After sampling in herbaria, images are uploaded to a citizen science platform for metadata generation, and tissue samples are moved through a simple, high-throughput, plate-based herbarium DNA extraction and sequencing protocol. RESULTS We applied this sampling-to-sequencing workflow to ~15,000 species, producing for the first time a data set with ~50% taxonomic representation of the "nitrogen-fixing clade" of angiosperms. DISCUSSION The approach we present is appropriate at any taxonomic scale and is extensible to other collection types. The widespread use of large-scale sampling strategies repositions herbaria as accessible but largely untapped resources for broad taxonomic sampling with thousands of species.
Affiliation(s)
- Ryan A. Folk
- Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi, USA
- Heather R. Kates
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA
- Raphael LaFrance
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA
- Douglas E. Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA
- Department of Biology, University of Florida, Gainesville, Florida, USA
- Genetics Institute, University of Florida, Gainesville, Florida, USA
- Biodiversity Institute, University of Florida, Gainesville, Florida, USA
- Pamela S. Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA
- Genetics Institute, University of Florida, Gainesville, Florida, USA
- Biodiversity Institute, University of Florida, Gainesville, Florida, USA
- Robert P. Guralnick
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA
- Biodiversity Institute, University of Florida, Gainesville, Florida, USA
7
Maurin KJL. A dated phylogeny of the genus Pennantia (Pennantiaceae) based on whole chloroplast genome and nuclear ribosomal 18S-26S repeat region sequences. PHYTOKEYS 2020; 155:15-32. [PMID: 32863722 PMCID: PMC7428460 DOI: 10.3897/phytokeys.155.53460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/21/2020] [Accepted: 07/13/2020] [Indexed: 06/11/2023]
Abstract
Pennantia, which comprises four species distributed in Australasia, was the subject of a monographic taxonomic treatment based on morphological characters in 2002. When this genus has been included in molecular phylogenies, it has usually been represented by a single species, P. corymbosa J.R.Forst. & G.Forst., or occasionally also by P. cunninghamii Miers. This study presents the first dated phylogenetic analysis encompassing all species of the genus Pennantia and using chloroplast DNA. The nuclear ribosomal 18S-26S repeat region is also investigated, using a chimeric reference sequence against which reads not mapping to the chloroplast genome were aligned. This mapping of off-target reads proved valuable in exploiting otherwise discarded data, but with rather variable success. The trees based on chloroplast DNA and the nuclear markers are congruent but the relationships among the members of the latter are less strongly supported overall, certainly due to the presence of ambiguous characters in the alignment resulting from low coverage. The dated chloroplast DNA phylogeny suggests that Pennantia has diversified within the last 20 My, with the lineages represented by P. baylisiana (W.R.B.Oliv.) G.T.S.Baylis, P. endlicheri Reissek and P. corymbosa diversifying within the last 9 My. The analyses presented here also confirm previous molecular work based on the nuclear internal transcribed spacer region showing that P. baylisiana and P. endlicheri, which were sometimes considered synonyms, are not sister taxa and therefore support their recognition as distinct species.
Affiliation(s)
- Kévin J. L. Maurin
- The University of Waikato – School of Science, Private Bag 3105, Hamilton 3240, New Zealand
8
Zhang X, Sun Y, Landis JB, Lv Z, Shen J, Zhang H, Lin N, Li L, Sun J, Deng T, Sun H, Wang H. Plastome phylogenomic study of Gentianeae (Gentianaceae): widespread gene tree discordance and its association with evolutionary rate heterogeneity of plastid genes. BMC PLANT BIOLOGY 2020; 20:340. [PMID: 32680458 PMCID: PMC7368685 DOI: 10.1186/s12870-020-02518-w] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 01/03/2020] [Accepted: 06/24/2020] [Indexed: 05/10/2023]
Abstract
BACKGROUND Plastome-scale data have been prevalent in reconstructing the plant Tree of Life. However, phylogenomic studies currently based on plastomes rely primarily on maximum likelihood inference of concatenated alignments of plastid genes, and thus phylogenetic discordance produced by individual plastid genes has generally been ignored. Moreover, structural and functional characteristics of plastomes indicate that plastid genes may not evolve as a single locus and are experiencing different evolutionary forces, yet the genetic characteristics of plastid genes within a lineage remain poorly studied. RESULTS We sequenced and annotated 10 plastome sequences of Gentianeae. Phylogenomic analyses yielded robust relationships among genera within Gentianeae. We detected great variation of gene tree topologies and revealed that more than half of the genes, including one (atpB) of the three widely used plastid markers (rbcL, atpB and matK) in phylogenetic inference of Gentianeae, are likely contributing to phylogenetic ambiguity of Gentianeae. Estimation of nucleotide substitution rates showed extensive rate heterogeneity among different plastid genes and among different functional groups of genes. Comparative analysis suggested that the ribosomal protein (RPL and RPS) genes and the RNA polymerase (RPO) genes have higher substitution rates and genetic variations among plastid genes in Gentianeae. Our study revealed that just one (matK) of the three (matK, ndhB and rbcL) widely used markers show high phylogenetic informativeness (PI) value. Due to the high PI and lowest gene-tree discordance, rpoC2 is advocated as a promising plastid DNA barcode for taxonomic studies of Gentianeae. Furthermore, our analyses revealed a positive correlation of evolutionary rates with genetic variation of plastid genes, but a negative correlation with gene-tree discordance under purifying selection. 
CONCLUSIONS Overall, our results demonstrate the heterogeneity of nucleotide substitution rates and genetic characteristics among plastid genes providing new insights into plastome evolution, while highlighting the necessity of considering gene-tree discordance into phylogenomic studies based on plastome-scale data.
Affiliation(s)
- Xu Zhang
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Yanxia Sun
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- Jacob B Landis
- Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA, 92507, USA
- School of Integrative Plant Science, Section of Plant Biology and the L.H. Bailey Hortorium, Cornell University, Ithaca, NY, 14850, USA
- Zhenyu Lv
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, Yunnan, China
- Jun Shen
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Huajie Zhang
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- Nan Lin
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Lijuan Li
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Jiao Sun
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Tao Deng
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, Yunnan, China
- Hang Sun
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, Yunnan, China
- Hengchang Wang
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074, Hubei, China
9
Stull GW, Soltis PS, Soltis DE, Gitzendanner MA, Smith SA. Nuclear phylogenomic analyses of asterids conflict with plastome trees and support novel relationships among major lineages. AMERICAN JOURNAL OF BOTANY 2020; 107:790-805. [PMID: 32406108 DOI: 10.1002/ajb2.1468] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 10/26/2019] [Accepted: 02/26/2020] [Indexed: 05/14/2023]
Abstract
PREMISE Discordance between nuclear and organellar phylogenies (cytonuclear discordance) is a well-documented phenomenon at shallow evolutionary levels but has been poorly investigated at deep levels of plant phylogeny. Determining the extent of cytonuclear discordance across major plant lineages is essential not only for elucidating evolutionary processes, but also for evaluating the currently used framework of plant phylogeny, which is largely based on the plastid genome. METHODS We present a phylogenomic examination of a major angiosperm clade (Asteridae) based on sequence data from the nuclear, plastid, and mitochondrial genomes as a means of evaluating currently accepted relationships inferred from the plastome and exploring potential sources of genomic conflict in this group. RESULTS We recovered at least five instances of well-supported cytonuclear discordance concerning the placements of major asterid lineages (i.e., Ericales, Oncothecaceae, Aquifoliales, Cassinopsis, and Icacinaceae). We attribute this conflict to a combination of incomplete lineage sorting and hybridization, the latter supported in part by previously inferred whole-genome duplications. CONCLUSIONS Our results challenge several long-standing hypotheses of asterid relationships and have implications for morphological character evolution and for the importance of ancient whole-genome duplications in early asterid evolution. These findings also highlight the value of reevaluating broad-scale angiosperm and green-plant phylogeny with nuclear genomic data.
Affiliation(s)
- Gregory W Stull
- Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650204, China
- Department of Botany, Smithsonian Institution, Washington, D.C., 20013, USA
- Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, 32611, USA
- Biodiversity Institute, University of Florida, Gainesville, Florida, 32611, USA
- Douglas E Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, 32611, USA
- Biodiversity Institute, University of Florida, Gainesville, Florida, 32611, USA
- Department of Biology, University of Florida, Gainesville, Florida, 32611, USA
- Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, 48109, USA
10
Armijos Carrion AD, Hinsinger DD, Strijk JS. ECuADOR-Easy Curation of Angiosperm Duplicated Organellar Regions, a tool for cleaning and curating plastomes assembled from next generation sequencing pipelines. PeerJ 2020; 8:e8699. [PMID: 32292644 PMCID: PMC7147433 DOI: 10.7717/peerj.8699] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/28/2019] [Accepted: 02/06/2020] [Indexed: 11/25/2022]
Abstract
Background: With the rapid increase in availability of genomic resources offered by Next-Generation Sequencing (NGS) and the availability of free online genomic databases, efficient and standardized metadata curation approaches have become increasingly critical for the post-processing stages of biological data. Especially in organelle-based studies using circular chloroplast genome datasets, the assembly of the main structural regions in random order and orientation represents a major limitation in our ability to easily generate “ready-to-align” datasets for phylogenetic reconstruction, at both small and large taxonomic scales. In addition, current practices discard the most variable regions of the genomes to facilitate the alignment of the remaining coding regions. Nevertheless, no software is currently available to perform curation to such a degree, through simple detection, organization and positioning of the main plastome regions, making it a time-consuming and error-prone process. Here we introduce ECuADOR, a fast and user-friendly Perl script specifically designed to automate the detection and reorganization of newly assembled plastomes obtained from any source available (NGS, Sanger sequencing or assembler output). Methods: ECuADOR uses a sliding-window approach to detect long repeated sequences in draft sequences, then identifies the inverted repeat regions (IRs), even in case of artifactual breaks or sequencing errors, and automates the rearrangement of the sequence to the widely used LSC–IRb–SSC–IRa order. This facilitates rapid post-editing steps such as creation of genome alignments, detection of variable regions, SNP detection and phylogenomic analyses. Results: ECuADOR was successfully tested on plant families throughout the angiosperm phylogeny by curating 161 chloroplast datasets. ECuADOR first identified and reordered the central regions (LSC–IRb–SSC–IRa) for each dataset and then produced a new annotation for the chloroplast sequences. The process took less than 20 min with a maximum memory requirement of 150 MB and an accuracy of over 99%. Conclusions: ECuADOR is the sole de novo one-step recognition and reordering tool that facilitates the post-processing analysis of extranuclear genomes from NGS data. The program is available at https://github.com/BiodivGenomic/ECuADOR/.
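The general idea of window-based inverted repeat detection followed by reordering can be illustrated with a small Python sketch. This is not ECuADOR's implementation (which the abstract describes as a Perl script). The function names, the seed length `k`, the exact-match extension, and the rule that the longer single-copy region is the LSC are all assumptions made for illustration, and the sketch ignores sequencing errors and strand-orientation conventions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeat(seq, k=8):
    """Return (start1, start2, length) of the longest inverted repeat found.

    Each k-mer window acts as a seed: if its reverse complement occurs
    downstream, the pair is extended while the bases stay complementary.
    """
    index = {}
    for pos in range(len(seq) - k + 1):
        index.setdefault(seq[pos:pos + k], []).append(pos)

    best = (0, 0, 0)
    for i in range(len(seq) - k + 1):
        for j in index.get(revcomp(seq[i:i + k]), []):
            if j < i + k:  # second copy must lie downstream without overlap
                continue
            a_start, a_end = i, i + k  # first copy is seq[a_start:a_end]
            b_start, b_end = j, j + k  # second copy is seq[b_start:b_end]
            # Grow the first copy leftwards and the second copy rightwards.
            while (a_start > 0 and b_end < len(seq)
                   and seq[a_start - 1].translate(COMPLEMENT) == seq[b_end]):
                a_start -= 1
                b_end += 1
            # Grow the first copy rightwards and the second copy leftwards.
            while (a_end < b_start
                   and seq[a_end].translate(COMPLEMENT) == seq[b_start - 1]):
                a_end += 1
                b_start -= 1
            if a_end - a_start > best[2]:
                best = (a_start, b_start, a_end - a_start)
    return best


def reorder_plastome(seq, k=8):
    """Rotate a circular draft so it reads LSC, IR, SSC, IR."""
    start1, start2, length = find_inverted_repeat(seq, k)
    if length == 0:
        raise ValueError("no inverted repeat found")
    ir_first = seq[start1:start1 + length]
    ir_second = seq[start2:start2 + length]
    between = seq[start1 + length:start2]           # region between the IR copies
    outside = seq[start2 + length:] + seq[:start1]  # region wrapping around the circle
    if len(outside) >= len(between):  # assumption: the longer region is the LSC
        return outside + ir_first + between + ir_second
    return between + ir_second + outside + ir_first


# Hypothetical toy plastome; the draft starts at an arbitrary point of the circle.
ir = "GATTACAGGCTTAACCGGTCATGC"
lsc = "ATGGCAACCTTAGCAGTACCGATTTGCAACGTTAGGCCATA"
ssc = "CTTGACCATGC"
circle = lsc + ir + ssc + revcomp(ir)
draft = circle[20:] + circle[:20]
start1, start2, length = find_inverted_repeat(draft)
reordered = reorder_plastome(draft)
```

Because the reordered sequence is only a rotation of the draft, no bases are gained or lost; a production tool would also need to tolerate mismatches inside the repeats.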
Affiliation(s)
- Angelo D Armijos Carrion
- Biodiversity Genomics Team, Plant Ecophysiology & Evolution Group, Guangxi Key Laboratory of Forest Ecology and Conservation, College of Forestry, Guangxi University, Nanning, Guangxi, PR China
- Damien D Hinsinger
- Biodiversity Genomics Team, Plant Ecophysiology & Evolution Group, Guangxi Key Laboratory of Forest Ecology and Conservation, College of Forestry, Guangxi University, Nanning, Guangxi, PR China
- Alliance for Conservation Tree Genomics, Pha Tad Ke Botanical Garden, Luang Prabang, Laos
- Joeri S Strijk
- Biodiversity Genomics Team, Plant Ecophysiology & Evolution Group, Guangxi Key Laboratory of Forest Ecology and Conservation, College of Forestry, Guangxi University, Nanning, Guangxi, PR China
- Alliance for Conservation Tree Genomics, Pha Tad Ke Botanical Garden, Luang Prabang, Laos
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi University, Nanning, Guangxi, PR China
11
Granados Mendoza C, Jost M, Hágsater E, Magallón S, van den Berg C, Lemmon EM, Lemmon AR, Salazar GA, Wanke S. Target Nuclear and Off-Target Plastid Hybrid Enrichment Data Inform a Range of Evolutionary Depths in the Orchid Genus Epidendrum. FRONTIERS IN PLANT SCIENCE 2020; 10:1761. [PMID: 32063915 PMCID: PMC7000662 DOI: 10.3389/fpls.2019.01761] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/02/2019] [Accepted: 12/16/2019] [Indexed: 05/12/2023]
Abstract
Universal angiosperm enrichment probe sets designed to enrich hundreds of putatively orthologous nuclear single-copy loci are increasingly being applied to infer phylogenetic relationships of different lineages of angiosperms at a range of evolutionary depths. Studies applying such probe sets have focused on testing the universality and performance of the target nuclear loci, but they have not taken advantage of off-target data from other genome compartments generated alongside the nuclear loci. Here we do so to infer phylogenetic relationships in the orchid genus Epidendrum and closely related genera of subtribe Laeliinae. Our aims are to: 1) test the technical viability of applying the plant anchored hybrid enrichment (AHE) method (Angiosperm v.1 probe kit) to our focal group, 2) mine plastid protein coding genes from off-target reads; and 3) evaluate the performance of the target nuclear and off-target plastid loci in resolving and supporting phylogenetic relationships along a range of taxonomical depths. Phylogenetic relationships were inferred from the nuclear data set through coalescent summary and site-based methods, whereas plastid loci were analyzed in a concatenated partitioned matrix under maximum likelihood. The usefulness of target and flanking non-target nuclear regions and plastid loci was assessed through the estimation of their phylogenetic informativeness. Our study successfully applied the plant AHE probe kit to Epidendrum, supporting the universality of this kit in angiosperms. Moreover, it demonstrated the feasibility of mining plastome loci from off-target reads generated with the Angiosperm v.1 probe kit to obtain additional, uniparentally inherited sequence data at no extra sequencing cost. Our analyses detected some strongly supported incongruences between nuclear and plastid data sets at shallow divergences, an indication of potential lineage sorting, hybridization, or introgression events in the group. 
Lastly, we found that the per site phylogenetic informativeness of the ycf1 plastid gene surpasses that of all other plastid genes and several nuclear loci, making it an excellent candidate for assessing phylogenetic relationships at medium to low taxonomic levels in orchids.
Affiliation(s)
- Carolina Granados Mendoza
- Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Matthias Jost
- Institut für Botanik, Technische Universität Dresden, Dresden, Germany
- Eric Hágsater
- Herbario AMO, Instituto Chinoin, A.C., Mexico City, Mexico
- Susana Magallón
- Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Cássio van den Berg
- Departamento de Ciências Biológicas, Universidade Estadual de Feira de Santana, Feira de Santana, Brazil
- Emily Moriarty Lemmon
- Department of Biological Science, Florida State University, Tallahassee, FL, United States
- Alan R. Lemmon
- Department of Scientific Computing, Florida State University, Tallahassee, FL, United States
- Gerardo A. Salazar
- Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Stefan Wanke
- Institut für Botanik, Technische Universität Dresden, Dresden, Germany
|
12
|
Lo YT, Shaw PC. Application of next-generation sequencing for the identification of herbal products. Biotechnol Adv 2019; 37:107450. [DOI: 10.1016/j.biotechadv.2019.107450] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/04/2019] [Revised: 09/10/2019] [Accepted: 09/10/2019] [Indexed: 12/17/2022]
|
13
|
Walker JF, Walker-Hale N, Vargas OM, Larson DA, Stull GW. Characterizing gene tree conflict in plastome-inferred phylogenies. PeerJ 2019; 7:e7747. [PMID: 31579615 PMCID: PMC6764362 DOI: 10.7717/peerj.7747] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 05/20/2019] [Accepted: 08/25/2019] [Indexed: 11/20/2022]
Abstract
Evolutionary relationships among plants have been inferred primarily using chloroplast data. To date, no study has comprehensively examined the plastome for gene tree conflict. Using a broad sampling of angiosperm plastomes, we characterize gene tree conflict among plastid genes at various time scales and explore correlates to conflict (e.g., evolutionary rate, gene length, molecule type). We uncover notable gene tree conflict against a backdrop of largely uninformative genes. We find alignment length and tree length are strong predictors of concordance, and that nucleotides outperform amino acids. Of the most commonly used markers, matK greatly outperforms rbcL; however, the rarely used gene rpoC2 is the top-performing gene in every analysis. We find that rpoC2 reconstructs angiosperm phylogeny as well as the entire concatenated set of protein-coding chloroplast genes. Our results suggest that longer genes are superior for phylogeny reconstruction. The alleviation of some conflict through the use of nucleotides suggests that stochastic and systematic error is likely the root of most of the observed conflict, but further research on biological conflict within the plastome is warranted given documented cases of heteroplasmic recombination. We suggest that researchers should filter genes for topological concordance when performing downstream comparative analyses on phylogenetic data, even when using chloroplast genomes.
Affiliation(s)
- Joseph F. Walker
- Sainsbury Laboratory (SLCU), University of Cambridge, Cambridge, United Kingdom
- Nathanael Walker-Hale
- Department of Plant Sciences, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom
- Oscar M. Vargas
- University of California, Santa Cruz, Santa Cruz, United States of America
- Drew A. Larson
- University of Michigan—Ann Arbor, Ann Arbor, MI, United States of America
- Gregory W. Stull
- Department of Botany, Smithsonian Institution, Washington, United States of America
|
14
|
Genotyping by Sequencing and Plastome Analysis Finds High Genetic Variability and Geographical Structure in Dactylis glomerata L. in Northwest Europe Despite Lack of Ploidy Variation. AGRONOMY-BASEL 2019. [DOI: 10.3390/agronomy9070342] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022]
Abstract
Large collections of the forage and bioenergy grass Dactylis glomerata were made in northwest (NW) Europe along east to west and north to south clines for genetic resource conservation and to inform breeding programmes of genetic diversity, genepools, and ploidy. Leaves were sampled for genetic analysis and seed and rhizome for ex-situ conservation. Genotyping by sequencing (GBS) was used to assay nuclear DNA diversity and plastome single nucleotide polymorphism (SNP) discovery was undertaken using a long-read PCR and MiSeq approach. Nuclear and plastid SNPs were analysed by principal component analysis (PCA) to compare genotypes. Flow cytometry revealed that all samples were tetraploid, but some genome size variation was recorded. GBS detected an average of approximately 10,000 to 15,000 SNPs per country sampled. The highest average number of private SNPs was recorded in Poland (median ca. 2000). Plastid DNA variation was also high (1466 SNPs, 17 SNPs/kbp). GBS data, and to a lesser extent plastome data, also show that genetic variation is structured geographically in NW Europe with loose clustering matching the country of plant origin. The results reveal extensive genetic diversity and genetic structuring in this versatile allogamous species despite lack of ploidy variation and high levels of human mediated geneflow via planting.
|
15
|
Plastid phylogenomic insights into the evolution of Caryophyllales. Mol Phylogenet Evol 2019; 134:74-86. [DOI: 10.1016/j.ympev.2018.12.023] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 04/08/2018] [Revised: 12/17/2018] [Accepted: 12/19/2018] [Indexed: 11/22/2022]
|
16
|
Bethune K, Mariac C, Couderc M, Scarcelli N, Santoni S, Ardisson M, Martin J, Montúfar R, Klein V, Sabot F, Vigouroux Y, Couvreur TLP. Long-fragment targeted capture for long-read sequencing of plastomes. APPLICATIONS IN PLANT SCIENCES 2019; 7:e1243. [PMID: 31139509 PMCID: PMC6526642 DOI: 10.1002/aps3.1243] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/14/2018] [Accepted: 03/21/2019] [Indexed: 05/09/2023]
Abstract
PREMISE Third-generation sequencing methods generate significantly longer reads than those produced using alternative sequencing methods. This provides increased possibilities for the study of biodiversity, phylogeography, and population genetics. We developed a protocol for in-solution enrichment hybridization capture of long DNA fragments applicable to complete plastid genomes. METHODS AND RESULTS The protocol uses cost-effective in-house probes developed via long-range PCR and was used in six non-model monocot species (Poaceae: African rice, pearl millet, fonio; and three palm species). DNA was extracted from fresh and silica gel-dried leaves. Our protocol successfully captured long-read plastome fragments (3151 bp median on average), with an enrichment rate ranging from 15% to 98%. DNA extracted from silica gel-dried leaves led to low-quality plastome assemblies when compared to DNA extracted from fresh tissue. CONCLUSIONS Our protocol could also be generalized to capture long sequences from specific nuclear fragments.
Affiliation(s)
- Sylvain Santoni
- UMR AGAP, Equipe Diversité et Adaptation de la Vigne et des Espèces Méditerranéennes, INRA, 2 Place Viala, 34060 Montpellier, France
- Morgane Ardisson
- UMR AGAP, Equipe Diversité et Adaptation de la Vigne et des Espèces Méditerranéennes, INRA, 2 Place Viala, 34060 Montpellier, France
- Rommel Montúfar
- Facultad de Ciencias Exactas y Naturales, Pontificia Universidad Católica del Ecuador, Quito, Ecuador
|
17
|
Bragina MK, Afonnikov DA, Salina EA. Progress in plant genome sequencing: research directions. Vavilovskii Zhurnal Genet Selektsii 2019. [DOI: 10.18699/vj19.459] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
Abstract
Since the first plant genome, that of Arabidopsis thaliana, was sequenced and published, genome sequencing technologies have undergone significant changes. New algorithms, sequencing technologies and bioinformatic approaches were adopted to obtain genome, transcriptome and exome sequences for model and crop species, which have permitted deep inferences into plant biology. As a result of improved genome assembly and analysis methods, genome sequencing costs plummeted and the number of high-quality plant genome sequences is constantly growing. Consequently, more than 300 plant genome sequences have been published over the past twenty years. Although many of the published genomes are considered incomplete, they have proved to be a valuable tool for identifying genes involved in the formation of economically valuable plant traits, for marker-assisted and genomic selection, and for comparative analysis of plant genomes in order to determine the basic patterns of origin of various plant species. Since high coverage and resolution of a genome sequence are not enough to detect all changes in complex samples, targeted sequencing, which consists of the isolation and sequencing of a specific region of the genome, has begun to develop. Targeted sequencing has a higher detection power (the ability to identify new differences/variants) and resolution (down to a single base). In addition, exome sequencing (the method of sequencing only protein-coding gene regions) is being actively developed, which allows for the sequencing of non-expressed alleles and genes that cannot be found with RNA-seq. In this review, we analyze the development of sequencing technologies and the construction of “reference” genomes of plants, and compare targeted sequencing methods that rely on a reference DNA sequence.
Affiliation(s)
- D. A. Afonnikov
- Institute of Cytology and Genetics, SB RAS; Novosibirsk State University
|
18
|
Couvreur TLP, Helmstetter AJ, Koenen EJM, Bethune K, Brandão RD, Little SA, Sauquet H, Erkens RHJ. Phylogenomics of the Major Tropical Plant Family Annonaceae Using Targeted Enrichment of Nuclear Genes. FRONTIERS IN PLANT SCIENCE 2019; 9:1941. [PMID: 30687347 PMCID: PMC6334231 DOI: 10.3389/fpls.2018.01941] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/19/2018] [Accepted: 12/13/2018] [Indexed: 05/19/2023]
Abstract
Targeted enrichment and sequencing of hundreds of nuclear loci for phylogenetic reconstruction is becoming an important tool for plant systematics and evolution. Annonaceae is a major pantropical plant family with 110 genera and ca. 2,450 species, occurring across all major and minor tropical forests of the world. Baits were designed by sequencing the transcriptomes of five species from two of the largest Annonaceae subfamilies. Orthologous loci were identified. The resulting baiting kit was used to reconstruct phylogenetic relationships at two different levels using concatenated and gene tree approaches: a family-wide Annonaceae analysis sampling 65 genera and a species-level analysis of tribe Piptostigmateae sampling 29 species with multiple individuals per species. DNA extraction was undertaken mainly on silica gel-dried leaves, with two samples from herbarium-dried leaves. Our kit targets 469 exons (364,653 bp of sequence data), successfully capturing sequences from across Annonaceae. Silica gel-dried and herbarium DNA worked equally well. We present for the first time a nuclear gene-based phylogenetic tree at the generic level based on 317 supercontigs. Results mainly confirm previous chloroplast-based studies. However, several new relationships are found and discussed. We show significant differences in branch lengths between the two large subfamilies Annonoideae and Malmeoideae. A new tribe, Annickieae, is erected containing a single African genus Annickia. We also reconstructed a well-resolved species-level phylogenetic tree of the Piptostigmateae tribe. Our baiting kit is useful for reconstructing well-supported phylogenetic relationships within Annonaceae at different taxonomic levels. The nuclear genome is mainly concordant with plastome information with a few exceptions.
Moreover, we find that substitution rate heterogeneity between the two subfamilies is also found within the nuclear compartment, and not just plastomes and ribosomal DNA as previously shown. Our results have implications for understanding the biogeography, molecular dating and evolution of Annonaceae.
Affiliation(s)
- Erik J. M. Koenen
- Institute of Systematic Botany, University of Zurich, Zurich, Switzerland
- Kevin Bethune
- IRD, UMR DIADE, Univ. Montpellier, Montpellier, France
- Rita D. Brandão
- Maastricht Science Programme, Maastricht University, Maastricht, Netherlands
- Stefan A. Little
- Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech, Université-Paris Saclay, Orsay, France
- Hervé Sauquet
- Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech, Université-Paris Saclay, Orsay, France
- National Herbarium of New South Wales (NSW), Royal Botanic Gardens and Domain Trust, Sydney, NSW, Australia
- Roy H. J. Erkens
- Maastricht Science Programme, Maastricht University, Maastricht, Netherlands
|
19
|
Walker JF, Walker-Hale N, Vargas OM, Larson DA, Stull GW. Characterizing gene tree conflict in plastome-inferred phylogenies. PeerJ 2019. [PMID: 31579615 DOI: 10.1101/512079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
|
20
|
Comparative Chloroplast Genome Analysis of Rhubarb Botanical Origins and the Development of Specific Identification Markers. Molecules 2018; 23:molecules23112811. [PMID: 30380708 PMCID: PMC6278470 DOI: 10.3390/molecules23112811] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/15/2018] [Revised: 10/21/2018] [Accepted: 10/27/2018] [Indexed: 11/18/2022]
Abstract
Rhubarb is an important ingredient in traditional Chinese medicine known as Rhei radix et rhizome. However, this common name refers to three different botanical species with different pharmacological effects. To facilitate the genetic identification of these three species for their more precise application in Chinese medicine, we aimed to provide chloroplast sequences with specific identification sites that are easy to amplify. We therefore sequenced the complete chloroplast genomes of all three species and then screened them for sequences suitable for distinguishing the three species. The length of the three chloroplast genomes ranged from 161,053 bp to 161,541 bp, with a total of 131 encoded genes including 31 tRNA, eight rRNA and 92 protein-coding sequences. Simple sequence repeat analysis indicated differences among these species; phylogenetic analyses showed that the chloroplast genome can be used as an ultra-barcode to distinguish the three botanical species of rhubarb; variation in the non-coding regions was higher than in the protein-coding regions; and variation in the single-copy regions was higher than in the inverted repeats. Twenty-one specific primer pairs were designed, and eight specific identification sites based on the highly variable regions were experimentally confirmed; these can be used as special DNA barcodes for identifying the three species. This study provides a molecular basis for precise medicinal plant selection and lays the groundwork for further comparison and correct identification of closely related, medicinally important Rheum species.
|
21
|
Hogers RCJ, de Ruiter M, Huvenaars KHJ, van der Poel H, Janssen A, van Eijk MJT, van Orsouw NJ. SNPSelect: A scalable and flexible targeted sequence-based genotyping solution. PLoS One 2018; 13:e0205577. [PMID: 30312324 PMCID: PMC6185863 DOI: 10.1371/journal.pone.0205577] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/14/2018] [Accepted: 09/27/2018] [Indexed: 11/22/2022]
Abstract
In plant breeding, the use of molecular markers has resulted in tremendous improvement of the speed with which new crop varieties are introduced into the market. Single Nucleotide Polymorphism (SNP) genotyping is routinely used for association studies, Linkage Disequilibrium (LD) and Quantitative Trait Locus (QTL) mapping studies, marker-assisted backcrosses and validation of large numbers of novel SNPs. Here we present the KeyGene SNPSelect technology, a scalable and flexible multiplexed, targeted sequence-based genotyping solution. The multiplex composition of SNPSelect assays can be easily changed between experiments by adding or removing loci, demonstrating their content flexibility. To demonstrate this versatility, we first designed a 1,056-plex maize assay and genotyped a total of 374 samples originating from an F2 and a Recombinant Inbred Line (RIL) population and a maize germplasm collection. Next, subsets of the most informative SNP loci were assembled in 384-plex and 768-plex assays for further genotyping. Indeed, selection of the most informative SNPs allows cost-efficient yet highly informative genotyping in a custom-made fashion, with average call rates between 88.1% (1,056-plex assay) and 99.4% (384-plex assay), and average reproducibility rates between duplicate samples ranging from 98.2% (1,056-plex assay) to 99.9% (384-plex assay). The SNPSelect workflow can be completed from a DNA sample to a genotype dataset in less than three days. We propose SNPSelect as an attractive and competitive genotyping solution to meet the targeted genotyping needs in fields such as plant breeding.
|
22
|
Gates DJ, Pilson D, Smith SD. Filtering of target sequence capture individuals facilitates species tree construction in the plant subtribe Iochrominae (Solanaceae). Mol Phylogenet Evol 2018; 123:26-34. [DOI: 10.1016/j.ympev.2018.02.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/13/2017] [Revised: 01/30/2018] [Accepted: 02/01/2018] [Indexed: 10/18/2022]
|
23
|
Barrett CF, Wicke S, Sass C. Dense infraspecific sampling reveals rapid and independent trajectories of plastome degradation in a heterotrophic orchid complex. THE NEW PHYTOLOGIST 2018; 218:1192-1204. [PMID: 29502351 PMCID: PMC5902423 DOI: 10.1111/nph.15072] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 12/07/2017] [Accepted: 01/23/2018] [Indexed: 05/08/2023]
Abstract
Heterotrophic plants provide excellent opportunities to study the effects of altered selective regimes on genome evolution. Plastid genome (plastome) studies in heterotrophic plants are often based on one or a few highly divergent species or sequences as representatives of an entire lineage, thus missing important evolutionary-transitory events. Here, we present the first infraspecific analysis of plastome evolution in any heterotrophic plant. By combining genome skimming and targeted sequence capture, we address hypotheses on the degree and rate of plastome degradation in a complex of leafless orchids (Corallorhiza striata) across its geographic range. Plastomes provide strong support for relationships and evidence of reciprocal monophyly between C. involuta and the endangered C. bentleyi. Plastome degradation is extensive, occurring rapidly over a few million years, with evidence of differing rates of genomic change between the two principal clades of the complex. Genome skimming and targeted sequence capture differ widely in coverage depth overall, with depth in targeted sequence capture datasets varying immensely across the plastome as a function of GC content. These findings will help to fill a knowledge gap in models of heterotrophic plastid genome evolution, and have implications for future studies in heterotrophs.
Affiliation(s)
- Craig F. Barrett
- Department of Biology, West Virginia University, 5218 Life Sciences Building, 53 Campus Drive, Morgantown, WV 26501, USA
- Susann Wicke
- Institute for Evolution and Biodiversity, University of Muenster, Huefferstr. 1, 48149 Muenster, Germany
- Chodon Sass
- Department of Plant and Microbial Biology, University of California, Berkeley, 431 Koshland Hall, Berkeley, California 94720, USA
|
24
|
Viljoen E, Odeny DA, Coetzee MPA, Berger DK, Rees DJG. Application of Chloroplast Phylogenomics to Resolve Species Relationships Within the Plant Genus Amaranthus. J Mol Evol 2018; 86:216-239. [PMID: 29556741 DOI: 10.1007/s00239-018-9837-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 07/29/2017] [Accepted: 03/16/2018] [Indexed: 02/06/2023]
Abstract
Amaranthus species are an emerging and promising nutritious traditional vegetable food source. Morphological plasticity and poorly resolved dendrograms have led to the need for well-resolved species phylogenies. We hypothesized that whole chloroplast phylogenomics would result in more reliable differentiation between closely related amaranth species. The aims of the study were therefore: to construct a fully assembled, annotated chloroplast genome sequence of Amaranthus tricolor; to characterize Amaranthus accessions phylogenetically by comparing barcoding genes (matK, rbcL, ITS) with whole chloroplast sequencing; and to use whole chloroplast phylogenomics to resolve deeper phylogenetic relationships. We generated a complete A. tricolor chloroplast sequence of 150,027 bp. The three barcoding genes revealed poor inter- and intra-species resolution with low bootstrap support. Whole chloroplast phylogenomics of 59 Amaranthus accessions increased the number of parsimoniously informative sites from 92 to 481 compared to the barcoding genes, allowing improved separation of amaranth species. Our results support previous findings that two geographically independent domestication events of Amaranthus hybridus likely gave rise to several species within the Hybridus complex, namely Amaranthus dubius, Amaranthus quitensis, Amaranthus caudatus, Amaranthus cruentus and Amaranthus hypochondriacus. Poor resolution of species within the Hybridus complex supports the recent and ongoing domestication within the complex, and highlights the limitation of chloroplast data for resolving recent evolution. The weedy Amaranthus retroflexus and Amaranthus powellii were found to share a common ancestor with the Hybridus complex. The leafy amaranths Amaranthus tricolor, Amaranthus blitum, Amaranthus viridis and Amaranthus graecizans formed a stable sister lineage to the aforementioned species across the phylogenetic trees.
This study demonstrates the power of next-generation sequencing data and reference-based assemblies to resolve phylogenies, and also facilitated the identification of unknown Amaranthus accessions from a local genebank. The informative phylogeny of the Amaranthus genus will aid in selecting accessions for breeding advanced genotypes to satisfy global food demand.
Affiliation(s)
- Erika Viljoen
- Biotechnology Platform, Agricultural Research Council, Onderstepoort, Pretoria, 0110, South Africa; Department of Plant and Soil Sciences, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Hatfield, 0083, South Africa
- Damaris A Odeny
- International Crops Research Institute for the Semi-Arid Tropics, Nairobi, Kenya
- Martin P A Coetzee
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Hatfield, 0083, South Africa
- Dave K Berger
- Department of Plant and Soil Sciences, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Hatfield, 0083, South Africa
- David J G Rees
- Biotechnology Platform, Agricultural Research Council, Onderstepoort, Pretoria, 0110, South Africa; Department of Life and Consumer Sciences, College of Agricultural and Environmental Sciences, University of South Africa, Florida, 1710, South Africa
|
25
|
McKain MR, Johnson MG, Uribe‐Convers S, Eaton D, Yang Y. Practical considerations for plant phylogenomics. APPLICATIONS IN PLANT SCIENCES 2018; 6:e1038. [PMID: 29732268 PMCID: PMC5895195 DOI: 10.1002/aps3.1038] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Received: 01/22/2018] [Accepted: 03/13/2018] [Indexed: 05/10/2023]
Abstract
The past decade has seen a major breakthrough in our ability to easily and inexpensively sequence genome-scale data from diverse lineages. The development of high-throughput sequencing and long-read technologies has ushered in the era of phylogenomics, where hundreds to thousands of nuclear genes and whole organellar genomes are routinely used to reconstruct evolutionary relationships. As a result, understanding which options are best suited for a particular set of questions can be difficult, especially for those just starting in the field. Here, we review the most recent advances in plant phylogenomic methods and make recommendations for project-dependent best practices and considerations. We focus on the costs and benefits of different approaches in regard to the information they provide researchers and the questions they can address. We also highlight unique challenges and opportunities in plant systems, such as polyploidy, reticulate evolution, and the use of herbarium materials, identifying optimal methodologies for each. Finally, we draw attention to lingering challenges in the field of plant phylogenomics, such as reusability of data sets, and look at some up-and-coming technologies that may help propel the field even further.
Affiliation(s)
- Michael R. McKain
- Department of Biological Sciences, The University of Alabama, Box 870344, Tuscaloosa, Alabama 35487, USA
- Matthew G. Johnson
- Department of Biological Sciences, Texas Tech University, 2901 Main Street, Box 43131, Lubbock, Texas 79409, USA
- Simon Uribe‐Convers
- Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University, Ann Arbor, Michigan 48109, USA
- Deren Eaton
- Department of Ecology, Evolution, and Environmental Biology, Columbia University, 1200 Amsterdam Avenue, New York, New York 10027, USA
- Ya Yang
- Department of Plant and Microbial Biology, University of Minnesota–Twin Cities, 1445 Gortner Avenue, St. Paul, Minnesota 55108, USA
|
26
|
A pilot study applying the plant Anchored Hybrid Enrichment method to New World sages (Salvia subgenus Calosphace; Lamiaceae). Mol Phylogenet Evol 2017; 117:124-134. [DOI: 10.1016/j.ympev.2017.02.006] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 11/07/2016] [Revised: 02/06/2017] [Accepted: 02/06/2017] [Indexed: 11/18/2022]
|
27
|
Kohrn BF, Persinger JM, Cruzan MB. An efficient pipeline to generate data for studies in plastid population genomics and phylogeography. APPLICATIONS IN PLANT SCIENCES 2017; 5:apps1700053. [PMID: 29188144 PMCID: PMC5703179 DOI: 10.3732/apps.1700053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/12/2017] [Accepted: 09/15/2017] [Indexed: 05/22/2023]
Abstract
PREMISE OF THE STUDY Seed dispersal contributes to gene flow and is responsible for colonization of new sites and range expansion. Sequencing chloroplast haplotypes offers a way to estimate contributions of seed dispersal to population genetic structure and enables studies of population history. Whole-genome sequencing is expensive, but resources can be conserved by pooling samples. Unfortunately, haplotype associations among single-nucleotide polymorphisms (SNPs) are lost in pooled samples, and treating SNP allele frequencies as independent markers provides biased estimates of genetic structure. METHODS We developed sampling methodologies and an application, CallHap, that uses a least-squares algorithm to evaluate the fit between observed and predicted SNP allele frequencies from pooled samples based on haplotype network phylogeny structure, thus enabling pooling for chloroplast sequencing for large-scale studies of chloroplast genomic variation. This method was tested using artificially constructed test networks and pools, and pooled samples of Lasthenia californica (California goldfields) from southern Oregon, USA. RESULTS CallHap reliably recovered network topologies and haplotype frequencies from pooled samples. DISCUSSION The CallHap pipeline allows for the efficient use of resources for estimation of genetic structure for studies using nonrecombining haplotypes such as intraspecific variation in chloroplast, mitochondrial, bacterial, or viral DNA.
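The least-squares idea in this abstract can be illustrated with a toy example: given candidate haplotypes and a trial set of haplotype frequencies for a pool, predict the pooled SNP allele frequencies and score the trial by its squared deviation from the observed frequencies. The sketch below is not CallHap's code and does not use the haplotype network structure the abstract mentions. The function names, the grid search, and the three-haplotype data are all assumptions made for illustration.

```python
from itertools import product

def predicted_freqs(haplotypes, hap_freqs):
    """Pooled alternate-allele frequency at each SNP, given haplotype frequencies."""
    n_snps = len(haplotypes[0])
    return [sum(f * h[s] for h, f in zip(haplotypes, hap_freqs))
            for s in range(n_snps)]

def rss(observed, predicted):
    """Residual sum of squares between observed and predicted frequencies."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def best_fit(haplotypes, observed, steps=10):
    """Brute-force least-squares fit over frequencies that are multiples of 1/steps."""
    best = None
    for counts in product(range(steps + 1), repeat=len(haplotypes)):
        if sum(counts) != steps:  # frequencies must sum to 1
            continue
        freqs = [c / steps for c in counts]
        score = rss(observed, predicted_freqs(haplotypes, freqs))
        if best is None or score < best[0]:
            best = (score, freqs)
    return best

# Hypothetical data: three haplotypes scored at three SNPs (1 = alternate allele),
# and alternate-allele frequencies observed in one pooled sample.
haplotypes = [(0, 0, 1), (1, 0, 0), (1, 1, 0)]
observed = [0.5, 0.2, 0.5]

score, freqs = best_fit(haplotypes, observed)
print(freqs, score)
```

A grid search keeps the example dependency-free; a real analysis would more plausibly use a constrained least-squares solver.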
Affiliation(s)
- Brendan F. Kohrn
- Department of Biology, Portland State University, 1719 SW 10th Avenue, Portland, Oregon 97201 USA
- Jessica M. Persinger
- Department of Biology, Portland State University, 1719 SW 10th Avenue, Portland, Oregon 97201 USA
- Mitchell B. Cruzan
- Department of Biology, Portland State University, 1719 SW 10th Avenue, Portland, Oregon 97201 USA
- Author for correspondence:
|
28
|
Twyford AD, Ness RW. Strategies for complete plastid genome sequencing. Mol Ecol Resour 2017; 17:858-868. [PMID: 27790830 PMCID: PMC6849563 DOI: 10.1111/1755-0998.12626] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 07/28/2016] [Revised: 10/14/2016] [Accepted: 10/21/2016] [Indexed: 12/01/2022]
Abstract
Plastid sequencing is an essential tool in the study of plant evolution. This high-copy organelle is one of the most technically accessible regions of the genome, and its sequence conservation makes it a valuable region for comparative genome evolution, phylogenetic analysis and population studies. Here, we discuss recent innovations and approaches for de novo plastid assembly that harness genomic tools. We focus on technical developments including low-cost sequence library preparation approaches for genome skimming, enrichment via hybrid baits and methylation-sensitive capture, sequence platforms with higher read outputs and longer read lengths, and automated tools for assembly. These developments allow for a much more streamlined assembly than via conventional short-range PCR. Although newer methods make complete plastid sequencing possible for any land plant or green alga, there are still challenges for producing finished plastomes particularly from herbarium material or from structurally divergent plastids such as those of parasitic plants.
Affiliation(s)
- Alex D. Twyford
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh EH9 3FL, UK
- Rob W. Ness
- Department of Biology, University of Toronto Mississauga, Mississauga, ON, Canada
|
29
|
Tonti-Filippini J, Nevill PG, Dixon K, Small I. What can we do with 1000 plastid genomes? THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2017; 90:808-818. [PMID: 28112435 DOI: 10.1111/tpj.13491] [Citation(s) in RCA: 132] [Impact Index Per Article: 18.9] [Received: 10/31/2016] [Revised: 01/17/2017] [Accepted: 01/17/2017] [Indexed: 05/21/2023]
Abstract
The plastid genome of plants is the smallest and most gene-rich of the three genomes in each cell and the one generally present in the highest copy number. As a result, obtaining plastid DNA sequence is a particularly cost-effective way of discovering genetic information about a plant. Until recently, the sequence information gathered in this way was generally limited to small portions of the genome amplified by polymerase chain reaction, but recent advances in sequencing technology have stimulated a substantial rate of increase in the sequencing of complete plastid genomes. Within the last year, the number of complete plastid genomes accessible in public sequence repositories has exceeded 1000. This sudden flood of data raises numerous challenges in data analysis and interpretation, but also offers the keys to potential insights across large swathes of plant biology. We examine what has been learnt so far, what more could be learnt if we look at the data in the right way, and what we might gain from the tens of thousands more genome sequences that will surely arrive in the next few years. The most exciting new discoveries are likely to be made at the interdisciplinary interfaces between molecular biology and ecology.
Affiliation(s)
- Julian Tonti-Filippini
- ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, 35 Stirling Highway, Crawley, WA, 6009, Australia
- Paul G Nevill
- Department of Environment and Agriculture, ARC Centre for Mine Site Restoration, Curtin University, Kent Street, Bentley, WA, 6102, Australia
- Kingsley Dixon
- Department of Environment and Agriculture, ARC Centre for Mine Site Restoration, Curtin University, Kent Street, Bentley, WA, 6102, Australia
- Ian Small
- ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, 35 Stirling Highway, Crawley, WA, 6009, Australia
|
30
|
Sakaguchi S, Ueno S, Tsumura Y, Setoguchi H, Ito M, Hattori C, Nozoe S, Takahashi D, Nakamasu R, Sakagami T, Lannuzel G, Fogliani B, Wulff AS, L’Huillier L, Isagi Y. Application of a simplified method of chloroplast enrichment to small amounts of tissue for chloroplast genome sequencing. APPLICATIONS IN PLANT SCIENCES 2017; 5:apps.1700002. [PMID: 28529832 PMCID: PMC5435405 DOI: 10.3732/apps.1700002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/19/2017] [Accepted: 04/09/2017] [Indexed: 05/30/2023]
Abstract
PREMISE OF THE STUDY High-throughput sequencing of genomic DNA can recover complete chloroplast genome sequences, but the sequence data are usually dominated by sequences from nuclear/mitochondrial genomes. To overcome this deficiency, a simple enrichment method for chloroplast DNA from small amounts of plant tissue was tested for eight plant species including a gymnosperm and various angiosperms. METHODS Chloroplasts were enriched using a high-salt isolation buffer without any step gradient procedures, and enriched chloroplast DNA was sequenced by multiplexed high-throughput sequencing. RESULTS Using this simple method, significant enrichment of chloroplast DNA-derived reads was attained, allowing deep sequencing of chloroplast genomes. As an example, the chloroplast genome of the conifer Callitris sulcata was assembled, from which polymorphic microsatellite loci were isolated successfully. DISCUSSION This chloroplast enrichment method from small amounts of plant tissue will be particularly useful for studies that use sequencers with relatively small throughput and that cannot use large amounts of tissue (e.g., for endangered species).
Affiliation(s)
- Shota Sakaguchi
- Graduate School of Human and Environmental Studies, Kyoto University, Yoshida-nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Saneyoshi Ueno
- Tree Genetics Laboratory, Department of Forest Genetics, Forestry and Forest Products Research Institute, 1 Matsunosato, Tsukuba, Ibaraki 305-8687, Japan
- Yoshihiko Tsumura
- Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki 3058572, Japan
- Hiroaki Setoguchi
- Graduate School of Human and Environmental Studies, Kyoto University, Yoshida-nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Motomi Ito
- Graduate School of Arts and Sciences, University of Tokyo, Tokyo 153-8902, Japan
- Chie Hattori
- Graduate School of Human and Environmental Studies, Kyoto University, Yoshida-nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Shogo Nozoe
- Graduate School of Human and Environmental Studies, Kyoto University, Yoshida-nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Daiki Takahashi
- Graduate School of Human and Environmental Studies, Kyoto University, Yoshida-nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Riku Nakamasu
- Faculty of Integrated Human Studies, Kyoto University, Yoshida-nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Taishi Sakagami
- Faculty of Integrated Human Studies, Kyoto University, Yoshida-nihonmatsu-cho, Sakyo-ku, Kyoto 606-8501, Japan
- Guillaume Lannuzel
- Agronomic Institute of New Caledonia (IAC), Diversités biologique et fonctionnelle des écosystèmes terrestres, BP 73, Port Laguerre, Païta 98890, New Caledonia
- Bruno Fogliani
- Agronomic Institute of New Caledonia (IAC), Diversités biologique et fonctionnelle des écosystèmes terrestres, BP 73, Port Laguerre, Païta 98890, New Caledonia
- Adrien S. Wulff
- Agronomic Institute of New Caledonia (IAC), Diversités biologique et fonctionnelle des écosystèmes terrestres, BP 73, Port Laguerre, Païta 98890, New Caledonia
- SoREco-NC, 57 Route de l’Anse Vata, 98800 Nouméa, New Caledonia
- Laurent L’Huillier
- Agronomic Institute of New Caledonia (IAC), Diversités biologique et fonctionnelle des écosystèmes terrestres, BP 73, Port Laguerre, Païta 98890, New Caledonia
- Yuji Isagi
- Division of Forest and Biomaterials Science, Graduate School of Agriculture, Kyoto University, Kyoto 6068502, Japan
|
31
|
Kamenova S, Bartley T, Bohan D, Boutain J, Colautti R, Domaizon I, Fontaine C, Lemainque A, Le Viol I, Mollot G, Perga ME, Ravigné V, Massol F. Invasions Toolkit. ADV ECOL RES 2017. [DOI: 10.1016/bs.aecr.2016.10.009] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 12/24/2022]
|
32
|
Fisher AE, Hasenstab KM, Bell HL, Blaine E, Ingram AL, Columbus JT. Evolutionary history of chloridoid grasses estimated from 122 nuclear loci. Mol Phylogenet Evol 2016; 105:1-14. [DOI: 10.1016/j.ympev.2016.08.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 04/16/2016] [Revised: 08/09/2016] [Accepted: 08/18/2016] [Indexed: 10/25/2022]
|
33
|
Hollingsworth PM, Li DZ, van der Bank M, Twyford AD. Telling plant species apart with DNA: from barcodes to genomes. Philos Trans R Soc Lond B Biol Sci 2016; 371:20150338. [PMID: 27481790 PMCID: PMC4971190 DOI: 10.1098/rstb.2015.0338] [Citation(s) in RCA: 141] [Impact Index Per Article: 17.6] [Accepted: 06/01/2016] [Indexed: 12/17/2022]
Abstract
Land plants underpin a multitude of ecosystem functions, support human livelihoods and represent a critically important component of terrestrial biodiversity, yet many tens of thousands of species await discovery, and plant identification remains a substantial challenge, especially where material is juvenile, fragmented or processed. In this opinion article, we tackle two main topics. Firstly, we provide a short summary of the strengths and limitations of plant DNA barcoding for addressing these issues. Secondly, we discuss options for enhancing current plant barcodes, focusing on increasing discriminatory power via either gene capture of nuclear markers or genome skimming. The former has the advantage of establishing a defined set of target loci maximizing efficiency of sequencing effort, data storage and analysis. The challenge is developing a probe set for large numbers of nuclear markers that works over sufficient phylogenetic breadth. Genome skimming has the advantage of using existing protocols and being backward compatible with existing barcodes; and the depth of sequence coverage can be increased as sequencing costs fall. Its non-targeted nature does, however, present a major informatics challenge for upscaling to large sample sets. This article is part of the themed issue 'From DNA barcodes to biomes'.
Affiliation(s)
- De-Zhu Li
- Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, 132 Lanhei Road, Heilongtan, Kunming, Yunnan 650201, People's Republic of China
- Michelle van der Bank
- Department of Botany and Plant Biotechnology, University of Johannesburg, Auckland Park, Johannesburg PO Box 524, South Africa
- Alex D Twyford
- Ashworth Laboratories, Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3FL, UK
|
34
|
Shetty SM, Md Shah MU, Makale K, Mohd-Yusuf Y, Khalid N, Othman RY. Complete Chloroplast Genome Sequence of Corroborates Structural Heterogeneity of Inverted Repeats in Wild Progenitors of Cultivated Bananas and Plantains. THE PLANT GENOME 2016; 9. [PMID: 27898825 DOI: 10.3835/plantgenome2015.09.0089] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 05/10/2023]
Abstract
Complete genome sequencing of cytoplasmically inherited chloroplast DNA provides novel insights into the origins of clonally propagated crops such as banana and plantain ( spp.). This study describes the structural organization of the chloroplast genome of Colla and its phylogenetic relationship with other wild progenitors of the domesticated banana cultivars. The chloroplast genome was sequenced using Illumina HiSeq 2000 platform, followed by a combination of de novo short-read assembly and reference-guided mapping of contigs to generate complete plastome sequence. The chloroplast genome is 169,503 bp in length, exhibits a typical quadripartite structural organization with a large single-copy (LSC; 87,828 bp) region and a small single-copy (SSC; 11,547 bp) region interspersed between inverted repeat (IRa/b; 35,064 bp) regions. Overall, its gene content, size, and gene order were identical to that of Colla with extensive expansion of the inverted repeat-small single-copy (IR-SSC) junctions. Comparative analyses revealed the conserved IRa-SSC expansion in three wild species and members of the order Zingiberales. In contrast, IRb-SSC expansion was conspicuously absent in the sister taxon Nee and related species of Zingiberales. Interestingly, phylogenomic assessment based on whole-plastome and protein-coding gene sets have provided robust support for the association of and as a sister group, despite the variation in IRb-SSC expansion. Although the current study substantiates the infrageneric IRb-SSC fluctuations in Musaceae, extensive taxon sampling is necessary to confirm whether the accessions of section have undergone independent IRb-SSC expansion relative to section .
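The region lengths reported in this abstract can be cross-checked against one another. A quadripartite plastome consists of one large single-copy region, one small single-copy region, and two copies of the inverted repeat, so the four pieces should add up to the stated total. A minimal Python sketch of that check follows; the function name and dictionary layout are illustrative assumptions, and only the numbers come from the abstract.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir

# Region lengths in bp, as reported in the abstract above.
reported = {"LSC": 87_828, "SSC": 11_547, "IR": 35_064, "total": 169_503}

computed = quadripartite_length(reported["LSC"], reported["SSC"], reported["IR"])
print(computed, computed == reported["total"])  # 169503 True
```

The same check is a quick sanity test for any annotated plastome before comparing IR-SSC junctions across taxa.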
|
35
|
Johnson MG, Gardner EM, Liu Y, Medina R, Goffinet B, Shaw AJ, Zerega NJC, Wickett NJ. HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment. APPLICATIONS IN PLANT SCIENCES 2016; 4:apps1600016. [PMID: 27437175 PMCID: PMC4948903 DOI: 10.3732/apps.1600016] [Citation(s) in RCA: 262] [Impact Index Per Article: 32.8] [Received: 02/10/2016] [Accepted: 06/01/2016] [Indexed: 05/18/2023]
Abstract
PREMISE OF THE STUDY Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). METHODS AND RESULTS HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. CONCLUSIONS HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.
Affiliation(s)
- Matthew G. Johnson
- Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, Illinois 60022 USA
- Author for correspondence:
- Elliot M. Gardner
- Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, Illinois 60022 USA
- Plant Biology and Conservation, Northwestern University, 2205 Tech Drive, Evanston, Illinois 60208 USA
- Yang Liu
- Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Storrs, Connecticut 06269 USA
- Rafael Medina
- Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Storrs, Connecticut 06269 USA
- Bernard Goffinet
- Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Storrs, Connecticut 06269 USA
- A. Jonathan Shaw
- Department of Biology, Duke University, Box 90338, Durham, North Carolina 27708 USA
- Nyree J. C. Zerega
- Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, Illinois 60022 USA
- Plant Biology and Conservation, Northwestern University, 2205 Tech Drive, Evanston, Illinois 60208 USA
- Norman J. Wickett
- Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, Illinois 60022 USA
- Plant Biology and Conservation, Northwestern University, 2205 Tech Drive, Evanston, Illinois 60208 USA
|
36
|
The quest to resolve recent radiations: Plastid phylogenomics of extinct and endangered Hawaiian endemic mints (Lamiaceae). Mol Phylogenet Evol 2016; 99:16-33. [DOI: 10.1016/j.ympev.2016.02.024] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 11/30/2015] [Revised: 02/26/2016] [Accepted: 02/28/2016] [Indexed: 11/17/2022]
|
37
|
Ivanova NV, Kuzmina ML, Braukmann TWA, Borisenko AV, Zakharov EV. Authentication of Herbal Supplements Using Next-Generation Sequencing. PLoS One 2016; 11:e0156426. [PMID: 27227830 PMCID: PMC4882080 DOI: 10.1371/journal.pone.0156426] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 01/12/2016] [Accepted: 05/14/2016] [Indexed: 11/18/2022]
Abstract
BACKGROUND DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. METHODS We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. RESULTS All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. CONCLUSION Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should involve an interdisciplinary approach taking into account the processes involved in production of herbal supplements, as well as biocomplexity of plant-plant and plant-fungal biological interactions.
Affiliation(s)
- Natalia V. Ivanova
- Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Maria L. Kuzmina
- Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Thomas W. A. Braukmann
- Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Alex V. Borisenko
- Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Evgeny V. Zakharov
- Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
|
38
|
Twyford AD. Will Benchtop Sequencers Resolve the Sequencing Trade-off in Plant Genetics? FRONTIERS IN PLANT SCIENCE 2016; 7:433. [PMID: 27092154 PMCID: PMC4822345 DOI: 10.3389/fpls.2016.00433] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/01/2016] [Accepted: 03/21/2016] [Indexed: 06/05/2023]
|
39
|
Coissac E, Hollingsworth PM, Lavergne S, Taberlet P. From barcodes to genomes: extending the concept of DNA barcoding. Mol Ecol 2016; 25:1423-8. [DOI: 10.1111/mec.13549] [Citation(s) in RCA: 233] [Impact Index Per Article: 29.1] [Received: 08/27/2015] [Revised: 12/28/2015] [Accepted: 01/19/2016] [Indexed: 12/20/2022]
Affiliation(s)
- Eric Coissac
- CNRS; LECA; F-38000 Grenoble France
- Univ. Grenoble Alpes; LECA; F-38000 Grenoble France
- Sébastien Lavergne
- CNRS; LECA; F-38000 Grenoble France
- Univ. Grenoble Alpes; LECA; F-38000 Grenoble France
- Pierre Taberlet
- CNRS; LECA; F-38000 Grenoble France
- Univ. Grenoble Alpes; LECA; F-38000 Grenoble France
|
40
|
Uribe-Convers S, Settles ML, Tank DC. A Phylogenomic Approach Based on PCR Target Enrichment and High Throughput Sequencing: Resolving the Diversity within the South American Species of Bartsia L. (Orobanchaceae). PLoS One 2016; 11:e0148203. [PMID: 26828929 PMCID: PMC4734709 DOI: 10.1371/journal.pone.0148203] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 06/19/2015] [Accepted: 01/14/2016] [Indexed: 11/30/2022]
Abstract
Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses the Fluidigm Access Array System (Fluidigm, San Francisco, CA, USA) and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per Access Array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or, more importantly, recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our approach and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300 bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500 bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500 bp for 349 samples.
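The amplicon count quoted in this abstract follows from the array layout: every one of 48 samples is paired with every one of 48 primer pairs, and the sample-specific barcode makes each pairing distinguishable after pooling. The Python sketch below reproduces that count and estimates how many arrays the 576-sample, 96-target experiment would need. The assumption that every array is fully loaded, and the sample and target labels, are hypothetical and not taken from the paper.

```python
from itertools import product
from math import ceil

ARRAY_SAMPLES = 48  # samples per Access Array, from the abstract
ARRAY_TARGETS = 48  # primer pairs per Access Array, from the abstract

def amplicons_per_array(samples: int = ARRAY_SAMPLES, targets: int = ARRAY_TARGETS) -> int:
    """Each sample is amplified with each primer pair exactly once."""
    return samples * targets

def arrays_needed(n_samples: int, n_targets: int) -> int:
    """Arrays required, assuming every array is fully loaded (an assumption)."""
    return ceil(n_samples / ARRAY_SAMPLES) * ceil(n_targets / ARRAY_TARGETS)

# Hypothetical labels: one (sample barcode, target) pair per amplicon.
samples = [f"sample{i:02d}" for i in range(1, ARRAY_SAMPLES + 1)]
targets = [f"target{j:02d}" for j in range(1, ARRAY_TARGETS + 1)]
labels = list(product(samples, targets))

print(amplicons_per_array(), len(set(labels)))  # 2304 2304
print(arrays_needed(576, 96))                   # 24 under the full-loading assumption
```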
Affiliation(s)
- Simon Uribe-Convers
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, United States of America
- Stillinger Herbarium, University of Idaho, Moscow, Idaho, United States of America
- * E-mail:
- Matthew L. Settles
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, United States of America
- David C. Tank
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, United States of America
- Stillinger Herbarium, University of Idaho, Moscow, Idaho, United States of America
|
41
|
Stull GW, Duno de Stefano R, Soltis DE, Soltis PS. Resolving basal lamiid phylogeny and the circumscription of Icacinaceae with a plastome-scale data set. AMERICAN JOURNAL OF BOTANY 2015; 102:1794-813. [PMID: 26507112 DOI: 10.3732/ajb.1500298] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 06/23/2015] [Accepted: 09/16/2015] [Indexed: 05/08/2023]
Abstract
PREMISE OF THE STUDY Major relationships within Lamiidae, an asterid clade with ∼40000 species, have largely eluded resolution despite two decades of intensive study. The phylogenetic positions of Icacinaceae and other early-diverging lamiid clades (Garryales, Metteniusaceae, and Oncothecaceae) have been particularly problematic, hindering classification and impeding our understanding of early lamiid (and euasterid) character evolution. METHODS To resolve basal lamiid phylogeny, we sequenced 50 plastid genomes using the Illumina sequencing platform and combined these with available asterid plastome sequence data for more comprehensive phylogenetic analyses. KEY RESULTS Our analyses resolved basal lamiid relationships with strong support, including the circumscription and phylogenetic position of the enigmatic Icacinaceae. This greatly improved basal lamiid phylogeny offers insight into character evolution and facilitates an updated classification for this clade, which we present here, including phylogenetic definitions for 10 new or converted clade names. We also offer recommendations for applying this classification to the Angiosperm Phylogeny Group (APG) system, including the recognition of a reduced Icacinaceae, an expanded Metteniusaceae, and two orders new to APG: Icacinales (Icacinaceae + Oncothecaceae) and Metteniusales (Metteniusaceae). CONCLUSIONS The lamiids possibly radiated from an ancestry of tropical trees with inconspicuous flowers and large, drupaceous fruits, given that these morphological characters are distributed across a grade of lineages (Icacinaceae, Oncothecaceae, Metteniusaceae) subtending the core lamiid clade (Boraginales, Gentianales, Lamiales, Solanales, Vahlia). Furthermore, the presence of similar morphological features among members of Aquifoliales suggests these characters might be ancestral for the Gentianidae (euasterids) as a whole.
Affiliation(s)
- Gregory W Stull
- Department of Biology, University of Florida, Gainesville, Florida 32611-8525 USA
- Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611-7800 USA
- Rodrigo Duno de Stefano
- Herbario CICY, Centro de Investigación Científicas de Yucatán A. C., Mérida, Yucatán 97200 Mexico
- Douglas E Soltis
- Department of Biology, University of Florida, Gainesville, Florida 32611-8525 USA
- Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611-7800 USA
- Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611-7800 USA
|
42
|
Garaycochea S, Speranza P, Alvarez-Valin F. A strategy to recover a high-quality, complete plastid sequence from low-coverage whole-genome sequencing. APPLICATIONS IN PLANT SCIENCES 2015; 3:apps1500022. [PMID: 26504677 PMCID: PMC4610308 DOI: 10.3732/apps.1500022] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/05/2015] [Accepted: 08/28/2015] [Indexed: 06/05/2023]
Abstract
PREMISE OF THE STUDY We developed a bioinformatic strategy to recover and assemble a chloroplast genome using data derived from low-coverage 454 GS FLX/Roche whole-genome sequencing. METHODS A comparative genomics approach was applied to obtain the complete chloroplast genome from a weedy biotype of rice from Uruguay. We also applied appropriate filters to discriminate reads representing novel DNA transfer events between the chloroplast and nuclear genomes. RESULTS From a set of 295,159 reads (96 Mb data), we assembled the chloroplast genome into two contigs. This weedy rice was classified based on 23 polymorphic regions identified by comparison with reference chloroplast genomes. We detected recent and past events of genetic material transfer between the chloroplast and nuclear genomes and estimated their occurrence frequency. DISCUSSION We obtained a high-quality complete chloroplast genome sequence from low-coverage sequencing data. Intergenome DNA transfer appears to be more frequent than previously thought.
Affiliation(s)
- Silvia Garaycochea
- Unidad de Biotecnología, Instituto Nacional de Investigación Agropecuaria (INIA), Rincón del Colorado, Canelones, Uruguay
- Pablo Speranza
- Departamento de Biología Vegetal, Facultad de Agronomía, Universidad de la República, Montevideo, Uruguay
- Fernando Alvarez-Valin
- Sección Biomatemática, Instituto de Biología, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
43
Nicholls JA, Pennington RT, Koenen EJM, Hughes CE, Hearn J, Bunnefeld L, Dexter KG, Stone GN, Kidner CA. Using targeted enrichment of nuclear genes to increase phylogenetic resolution in the neotropical rain forest genus Inga (Leguminosae: Mimosoideae). FRONTIERS IN PLANT SCIENCE 2015; 6:710. [PMID: 26442024 PMCID: PMC4584976 DOI: 10.3389/fpls.2015.00710] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Received: 05/22/2015] [Accepted: 08/25/2015] [Indexed: 05/20/2023]
Abstract
Evolutionary radiations are prominent and pervasive across many plant lineages in diverse geographical and ecological settings; in neotropical rainforests there is growing evidence suggesting that a significant fraction of species richness is the result of recent radiations. Understanding the evolutionary trajectories and mechanisms underlying these radiations demands much greater phylogenetic resolution than is currently available for these groups. The neotropical tree genus Inga (Leguminosae) is a good example, with ~300 extant species and a crown age of 2-10 MY, yet over 6 kb of plastid and nuclear DNA sequence data gives only poor phylogenetic resolution among species. Here we explore the use of larger-scale nuclear gene data obtained through targeted enrichment to increase phylogenetic resolution within Inga. Transcriptome data from three Inga species were used to select 264 nuclear loci for targeted enrichment and sequencing. Following quality control to remove probable paralogs from these sequence data, the final dataset comprised 259,313 bases from 194 loci for 24 accessions representing 22 Inga species and an outgroup (Zygia). Bayesian phylogenies reconstructed using either all loci concatenated or a gene-tree/species-tree approach yielded highly resolved phylogenies. We used coalescent approaches to show that the same targeted enrichment data also have significant power to discriminate among alternative within-species population histories within the widespread species I. umbellifera. In either application, targeted enrichment simplifies the informatics challenge of identifying orthologous loci associated with de novo genome sequencing. We conclude that targeted enrichment provides the large volumes of phylogenetically-informative sequence data required to resolve relationships within recent plant species radiations, both at the species level and for within-species phylogeographic studies.
Affiliation(s)
- James A. Nicholls
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Royal Botanic Garden Edinburgh, Edinburgh, UK
- Erik J. M. Koenen
- Institute of Systematic Botany, University of Zurich, Zürich, Switzerland
- Colin E. Hughes
- Institute of Systematic Botany, University of Zurich, Zürich, Switzerland
- Jack Hearn
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Lynsey Bunnefeld
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Kyle G. Dexter
- School of Geosciences, University of Edinburgh, Edinburgh, UK
- Graham N. Stone
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Catherine A. Kidner
- Royal Botanic Garden Edinburgh, Edinburgh, UK
- Institute of Molecular Plant Sciences, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
44
Beck JB, Semple JC. Next-generation sampling: Pairing genomics with herbarium specimens provides species-level signal in Solidago (Asteraceae). APPLICATIONS IN PLANT SCIENCES 2015. [PMID: 26082877 DOI: 10.5061/dryad.16pj5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/12/2023]
Abstract
PREMISE OF THE STUDY The ability to conduct species delimitation and phylogeny reconstruction with genomic data sets obtained exclusively from herbarium specimens would rapidly enhance our knowledge of large, taxonomically contentious plant genera. In this study, the utility of genotyping by sequencing is assessed in the notoriously difficult genus Solidago (Asteraceae) by attempting to obtain an informative single-nucleotide polymorphism data set from a set of specimens collected between 1970 and 2010. METHODS Reduced representation libraries were prepared and Illumina-sequenced from 95 Solidago herbarium specimen DNAs, and resulting reads were processed with the nonreference Universal Network-Enabled Analysis Kit (UNEAK) pipeline. Multidimensional clustering was used to assess the correspondence between genetic groups and morphologically defined species. RESULTS Library construction and sequencing were successful in 93 of 95 samples. The UNEAK pipeline identified 8470 single-nucleotide polymorphisms, and a filtered data set was analyzed for each of three Solidago subsections. Although results varied, clustering identified genomic groups that often corresponded to currently recognized species or groups of closely related species. DISCUSSION These results suggest that genotyping by sequencing is broadly applicable to DNAs obtained from herbarium specimens. The data obtained and their biological signal suggest that pairing genomics with large-scale herbarium sampling is a promising strategy in species-rich plant groups.
Affiliation(s)
- James B Beck
- Department of Biological Sciences, Wichita State University, 537 Hubbard Hall, Wichita, Kansas 67260 USA; Botanical Research Institute of Texas, 1700 University Drive, Fort Worth, Texas 76107 USA
- John C Semple
- Department of Biology, University of Waterloo, Waterloo, Ontario N2L 3G1 Canada
45
Beck JB, Semple JC. Next-generation sampling: Pairing genomics with herbarium specimens provides species-level signal in Solidago (Asteraceae). APPLICATIONS IN PLANT SCIENCES 2015; 3:apps1500014. [PMID: 26082877 PMCID: PMC4467758 DOI: 10.3732/apps.1500014] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 02/12/2015] [Accepted: 04/28/2015] [Indexed: 05/11/2023]
Abstract
PREMISE OF THE STUDY The ability to conduct species delimitation and phylogeny reconstruction with genomic data sets obtained exclusively from herbarium specimens would rapidly enhance our knowledge of large, taxonomically contentious plant genera. In this study, the utility of genotyping by sequencing is assessed in the notoriously difficult genus Solidago (Asteraceae) by attempting to obtain an informative single-nucleotide polymorphism data set from a set of specimens collected between 1970 and 2010. METHODS Reduced representation libraries were prepared and Illumina-sequenced from 95 Solidago herbarium specimen DNAs, and resulting reads were processed with the nonreference Universal Network-Enabled Analysis Kit (UNEAK) pipeline. Multidimensional clustering was used to assess the correspondence between genetic groups and morphologically defined species. RESULTS Library construction and sequencing were successful in 93 of 95 samples. The UNEAK pipeline identified 8470 single-nucleotide polymorphisms, and a filtered data set was analyzed for each of three Solidago subsections. Although results varied, clustering identified genomic groups that often corresponded to currently recognized species or groups of closely related species. DISCUSSION These results suggest that genotyping by sequencing is broadly applicable to DNAs obtained from herbarium specimens. The data obtained and their biological signal suggest that pairing genomics with large-scale herbarium sampling is a promising strategy in species-rich plant groups.
Affiliation(s)
- James B. Beck
- Department of Biological Sciences, Wichita State University, 537 Hubbard Hall, Wichita, Kansas 67260 USA
- Botanical Research Institute of Texas, 1700 University Drive, Fort Worth, Texas 76107 USA
- John C. Semple
- Department of Biology, University of Waterloo, Waterloo, Ontario N2L 3G1 Canada
46
Nicholls JA, Pennington RT, Koenen EJM, Hughes CE, Hearn J, Bunnefeld L, Dexter KG, Stone GN, Kidner CA. Using targeted enrichment of nuclear genes to increase phylogenetic resolution in the neotropical rain forest genus Inga (Leguminosae: Mimosoideae). FRONTIERS IN PLANT SCIENCE 2015. [PMID: 26442024 DOI: 10.5061/dryad.r9c12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2023]
Abstract
Evolutionary radiations are prominent and pervasive across many plant lineages in diverse geographical and ecological settings; in neotropical rainforests there is growing evidence suggesting that a significant fraction of species richness is the result of recent radiations. Understanding the evolutionary trajectories and mechanisms underlying these radiations demands much greater phylogenetic resolution than is currently available for these groups. The neotropical tree genus Inga (Leguminosae) is a good example, with ~300 extant species and a crown age of 2-10 MY, yet over 6 kb of plastid and nuclear DNA sequence data gives only poor phylogenetic resolution among species. Here we explore the use of larger-scale nuclear gene data obtained through targeted enrichment to increase phylogenetic resolution within Inga. Transcriptome data from three Inga species were used to select 264 nuclear loci for targeted enrichment and sequencing. Following quality control to remove probable paralogs from these sequence data, the final dataset comprised 259,313 bases from 194 loci for 24 accessions representing 22 Inga species and an outgroup (Zygia). Bayesian phylogenies reconstructed using either all loci concatenated or a gene-tree/species-tree approach yielded highly resolved phylogenies. We used coalescent approaches to show that the same targeted enrichment data also have significant power to discriminate among alternative within-species population histories within the widespread species I. umbellifera. In either application, targeted enrichment simplifies the informatics challenge of identifying orthologous loci associated with de novo genome sequencing. We conclude that targeted enrichment provides the large volumes of phylogenetically-informative sequence data required to resolve relationships within recent plant species radiations, both at the species level and for within-species phylogeographic studies.
Affiliation(s)
- James A Nicholls
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK; Royal Botanic Garden Edinburgh, Edinburgh, UK
- Erik J M Koenen
- Institute of Systematic Botany, University of Zurich, Zürich, Switzerland
- Colin E Hughes
- Institute of Systematic Botany, University of Zurich, Zürich, Switzerland
- Jack Hearn
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Lynsey Bunnefeld
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Kyle G Dexter
- School of Geosciences, University of Edinburgh, Edinburgh, UK
- Graham N Stone
- Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Catherine A Kidner
- Royal Botanic Garden Edinburgh, Edinburgh, UK; Institute of Molecular Plant Sciences, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
47
Nock CJ, Baten A, King GJ. Complete chloroplast genome of Macadamia integrifolia confirms the position of the Gondwanan early-diverging eudicot family Proteaceae. BMC Genomics 2014; 15 Suppl 9:S13. [PMID: 25522147 PMCID: PMC4290595 DOI: 10.1186/1471-2164-15-s9-s13] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 12/15/2022]
Abstract
BACKGROUND Sequence data from the chloroplast genome have played a central role in elucidating the evolutionary history of flowering plants, Angiospermae. In the past decade, the number of complete chloroplast genomes has burgeoned, leading to well-supported angiosperm phylogenies. However, some relationships, particularly among early-diverging lineages, remain unresolved. The diverse Southern Hemisphere plant family Proteaceae arose on the ancient supercontinent Gondwana early in angiosperm history and is a model group for adaptive radiation in response to changing climatic conditions. Genomic resources for the family are limited, and until now it is one of the few early-diverging 'basal eudicot' lineages not represented in chloroplast phylogenomic analyses. RESULTS The chloroplast genome of the Australian nut crop tree Macadamia integrifolia was assembled de novo from Illumina paired-end sequence reads. Three contigs, corresponding to a collapsed inverted repeat, a large and a small single copy region were identified, and used for genome reconstruction. The complete genome is 159,714 bp in length and was assembled at deep coverage (3.29 million reads; ~2000×). Phylogenetic analyses based on 83-gene and inverted repeat region alignments, the largest sequence-rich datasets to include the basal eudicot family Proteaceae, provide strong support for a Proteales clade that includes Macadamia, Platanus and Nelumbo. Genome structure and content followed the ancestral angiosperm pattern and were highly conserved in the Proteales, whilst size differences were largely explained by the relative contraction of the single copy regions and expansion of the inverted repeats in Macadamia. CONCLUSIONS The Macadamia chloroplast genome presented here is the first in the Proteaceae, and confirms the placement of this family with the morphologically divergent Platanaceae (plane tree family) and Nelumbonaceae (sacred lotus family) in the basal eudicot order Proteales. It provides a high-quality reference genome for future evolutionary studies and will be of benefit for taxon-rich phylogenomic analyses aimed at resolving relationships among early-diverging angiosperms, and more broadly across the plant tree of life.
48
Li Q, Li Y, Song J, Xu H, Xu J, Zhu Y, Li X, Gao H, Dong L, Qian J, Sun C, Chen S. High-accuracy de novo assembly and SNP detection of chloroplast genomes using a SMRT circular consensus sequencing strategy. THE NEW PHYTOLOGIST 2014; 204:1041-9. [PMID: 25103547 DOI: 10.1111/nph.12966] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 02/14/2014] [Accepted: 06/29/2014] [Indexed: 05/21/2023]
Abstract
A circular consensus sequencing (CCS) strategy involving single molecule, real-time (SMRT) DNA sequencing technology was applied to de novo assembly and single nucleotide polymorphism (SNP) detection of chloroplast genomes. Chloroplast DNA was purified from enriched chloroplasts of pooled individuals to construct a shotgun library for each species. The sequencing reactions were performed on a PacBio RS platform. CCS sub-reads were generated from polymerase reads that passed the native dumbbell-shaped DNA templates multiple times. The complete chloroplast genome sequence was generated by mapping all reads to the draft sequence constructed in a step-by-step manner. The full-chain, PCR-free approach eliminates the possible context-specific biases in library construction and sequencing reaction. The chloroplast genome was easily and completely assembled using the data generated from one SMRT Cell without requiring a reference genome. Comparisons of the three assembled Fritillaria genomes to 34.1 kb of validation Sanger sequences revealed 100% concordance, and the detected intraspecies SNPs at a minimum variant frequency of 15% were all confirmed. This simple approach with potential for parallel sequencing yields high-quality chloroplast genomes for sensitive SNP detection and comparative analyses. We recommend this approach for its powerful applicability for evolutionary genetics and genomics studies in plants based on the sequences of chloroplast genomes.
Affiliation(s)
- Qiushi Li
- The National Engineering Laboratory for Breeding of Endangered Medicinal Materials, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100193, China
49
Ripma LA, Simpson MG, Hasenstab-Lehman K. Geneious! Simplified genome skimming methods for phylogenetic systematic studies: A case study in Oreocarya (Boraginaceae). APPLICATIONS IN PLANT SCIENCES 2014. [PMID: 25506521 DOI: 10.5061/dryad.50536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
PREMISE OF THE STUDY As systematists grapple with how to best harness the power of next-generation sequencing (NGS), a deluge of review papers, methods, and analytical tools makes choosing the right method difficult. Oreocarya (Boraginaceae), a genus of 63 species, is a good example of a group lacking both species-level resolution and genomic resources. The use of Geneious removes bioinformatic barriers and makes NGS genome skimming accessible to even the least tech-savvy systematists. METHODS A combination of de novo and reference-guided assemblies was used to process 100-bp single-end Illumina HiSeq 2000 reads. A subset of 25 taxa was used to test the suitability of genome skimming for future systematic studies in recalcitrant lineages like Oreocarya. RESULTS The nuclear ribosomal cistron, the plastome, and 12 mitochondrial genes were recovered from all 25 taxa. All data processing and phylogenomic analyses were performed in Geneious. We report possible future multiplexing levels and published low-copy nuclear genes represented within de novo contigs. DISCUSSION Genome skimming represents a much-improved primary data collection over PCR+Sanger sequencing when chloroplast DNA (cpDNA), nuclear ribosomal DNA (nrDNA), and mitochondrial DNA (mtDNA) are the target sequences. This study details methods that plant systematists can employ to study their own taxa of interest.
Affiliation(s)
- Lee A Ripma
- Department of Biology, San Diego State University, San Diego, California 92182-4614 USA
- Michael G Simpson
- Department of Biology, San Diego State University, San Diego, California 92182-4614 USA
50
Ripma LA, Simpson MG, Hasenstab-Lehman K. Geneious! Simplified genome skimming methods for phylogenetic systematic studies: A case study in Oreocarya (Boraginaceae). APPLICATIONS IN PLANT SCIENCES 2014; 2:apps1400062. [PMID: 25506521 PMCID: PMC4259456 DOI: 10.3732/apps.1400062] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 07/22/2014] [Accepted: 11/07/2014] [Indexed: 05/24/2023]
Abstract
PREMISE OF THE STUDY As systematists grapple with how to best harness the power of next-generation sequencing (NGS), a deluge of review papers, methods, and analytical tools makes choosing the right method difficult. Oreocarya (Boraginaceae), a genus of 63 species, is a good example of a group lacking both species-level resolution and genomic resources. The use of Geneious removes bioinformatic barriers and makes NGS genome skimming accessible to even the least tech-savvy systematists. METHODS A combination of de novo and reference-guided assemblies was used to process 100-bp single-end Illumina HiSeq 2000 reads. A subset of 25 taxa was used to test the suitability of genome skimming for future systematic studies in recalcitrant lineages like Oreocarya. RESULTS The nuclear ribosomal cistron, the plastome, and 12 mitochondrial genes were recovered from all 25 taxa. All data processing and phylogenomic analyses were performed in Geneious. We report possible future multiplexing levels and published low-copy nuclear genes represented within de novo contigs. DISCUSSION Genome skimming represents a much-improved primary data collection over PCR+Sanger sequencing when chloroplast DNA (cpDNA), nuclear ribosomal DNA (nrDNA), and mitochondrial DNA (mtDNA) are the target sequences. This study details methods that plant systematists can employ to study their own taxa of interest.
Affiliation(s)
- Lee A. Ripma
- Department of Biology, San Diego State University, San Diego, California 92182-4614 USA
- Michael G. Simpson
- Department of Biology, San Diego State University, San Diego, California 92182-4614 USA