1
Bossert S, Pauly A, Danforth BN, Orr MC, Murray EA. Lessons from assembling UCEs: A comparison of common methods and the case of Clavinomia (Halictidae). Mol Ecol Resour 2024; 24:e13925. [PMID: 38183389 DOI: 10.1111/1755-0998.13925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 12/08/2023] [Accepted: 12/21/2023] [Indexed: 01/08/2024]
Abstract
Sequence data assembly is a foundational step in high-throughput sequencing, with untold consequences for downstream analyses. Despite this, few studies have interrogated the many methods for assembling phylogenomic UCE data for their comparative efficacy, or for how outputs may be impacted. We study this by comparing the most commonly used assembly methods for UCEs in the under-studied bee lineage Nomiinae and a representative sampling of relatives. Data for 63 UCE-only and 75 mixed taxa were assembled with five methods, including ABySS, HybPiper, SPAdes, Trinity and Velvet, and then benchmarked for their relative performance in terms of locus capture parameters and phylogenetic reconstruction. Unexpectedly, Trinity and Velvet trailed the other methods in terms of locus capture and DNA matrix density, whereas SPAdes performed favourably in most assessed metrics. In comparison with SPAdes, the guided-assembly approach HybPiper generally recovered the highest quality loci but in lower numbers. Based on our results, we formally move Clavinomia to Dieunomiini and render Epinomia once more a subgenus of Dieunomia. We strongly advise that future studies more closely examine the influence of assembly approach on their results, or, minimally, use better-performing assembly methods such as SPAdes or HybPiper. In this way, we can move forward with phylogenomic studies in a more standardized, comparable manner.
Affiliation(s)
- Silas Bossert
  - Department of Entomology, Washington State University, Pullman, Washington, USA
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Alain Pauly
  - Royal Belgian Institute of Natural Sciences, O.D. Taxonomy and Phylogeny, Brussels, Belgium
- Bryan N Danforth
  - Department of Entomology, Cornell University, Ithaca, New York, USA
- Michael C Orr
  - Entomologie, Staatliches Museum für Naturkunde Stuttgart, Stuttgart, Germany
- Elizabeth A Murray
  - Department of Entomology, Washington State University, Pullman, Washington, USA
2
Raza M, Ortiz EM, Schwung L, Shigita G, Schaefer H. Resolving the phylogeny of Thladiantha (Cucurbitaceae) with three different target capture pipelines. BMC Ecol Evol 2023; 23:75. [PMID: 38087247 PMCID: PMC10714463 DOI: 10.1186/s12862-023-02185-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 12/05/2023] [Indexed: 12/18/2023] [Open Access]
Abstract
BACKGROUND Despite recent advances, reliable tools to simultaneously handle different types of sequencing data (e.g., target capture, genome skimming) for phylogenomics are still scarce. Here, we evaluate the performance of the recently developed pipeline Captus in comparison with the well-known target capture pipelines HybPiper and SECAPR. As test data, we analyzed newly generated sequences for the genus Thladiantha (Cucurbitaceae) for which no well-resolved phylogeny estimate has been available so far, as well as simulated reads derived from the genome of Arabidopsis thaliana. RESULTS Our pipeline comparisons are based on (1) the time needed for data assembly and locus extraction, (2) locus recovery per sample, (3) the number of informative sites in nucleotide alignments, and (4) the topology of the nuclear and plastid phylogenies. Additionally, the simulated reads derived from the genome of Arabidopsis thaliana were used to evaluate the accuracy and completeness of the recovered loci. In terms of computation time, locus recovery per sample, and informative sites, Captus outperforms HybPiper and SECAPR. The resulting topologies of Captus and SECAPR are identical for coalescent trees but differ when trees are inferred from concatenated alignments. The HybPiper phylogeny is similar to Captus in both methods. The nuclear genes recover a deep split of Thladiantha in two clades, but this is not supported by the plastid data. CONCLUSIONS Captus is the best choice among the three pipelines in terms of computation time and locus recovery. Even though there is no significant topological difference between the Thladiantha species trees produced by the three pipelines, Captus yields a higher number of gene trees in agreement with the topology of the species tree (i.e., fewer genes in conflict with the species tree topology).
Affiliation(s)
- Mustafa Raza
  - Plant Biodiversity Research, Dept. Life Science Systems, Technical University of Munich (TUM), Emil-Ramann-Str. 2, D-85354, Freising, Germany
- Edgardo M Ortiz
  - Plant Biodiversity Research, Dept. Life Science Systems, Technical University of Munich (TUM), Emil-Ramann-Str. 2, D-85354, Freising, Germany
- Lea Schwung
  - Plant Biodiversity Research, Dept. Life Science Systems, Technical University of Munich (TUM), Emil-Ramann-Str. 2, D-85354, Freising, Germany
- Gentaro Shigita
  - Plant Biodiversity Research, Dept. Life Science Systems, Technical University of Munich (TUM), Emil-Ramann-Str. 2, D-85354, Freising, Germany
- Hanno Schaefer
  - Plant Biodiversity Research, Dept. Life Science Systems, Technical University of Munich (TUM), Emil-Ramann-Str. 2, D-85354, Freising, Germany
3
Jackson C, McLay T, Schmidt‐Lebuhn AN. hybpiper-nf and paragone-nf: Containerization and additional options for target capture assembly and paralog resolution. Appl Plant Sci 2023; 11:e11532. [PMID: 37601313 PMCID: PMC10439820 DOI: 10.1002/aps3.11532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 01/25/2023] [Accepted: 01/30/2023] [Indexed: 08/22/2023]
Abstract
Premise The HybPiper pipeline has become one of the most widely used tools for the assembly of target capture data for phylogenomic analysis. After the production of locus sequences and before phylogenetic analysis, the identification of paralogs is a critical step for ensuring the accurate inference of evolutionary relationships. Algorithmic approaches using gene tree topologies for the inference of ortholog groups are computationally efficient and broadly applicable to non-model organisms, especially in the absence of a known species tree. Methods and Results We containerized and expanded the functionality of both HybPiper and a pipeline for the inference of ortholog groups, providing novel options for the treatment of target capture sequence data, and allowing seamless use of the outputs of the former as inputs for the latter. The Singularity container presented here includes all dependencies, and the corresponding pipelines (hybpiper-nf and paragone-nf, respectively) are implemented via two Nextflow scripts for easier deployment and to vastly reduce the number of commands required for their use. Conclusions The hybpiper-nf and paragone-nf pipelines are easily installed and provide a user-friendly experience and robust results to the phylogenetic community. They are used by the Australian Angiosperm Tree of Life project. The pipelines are available at https://github.com/chrisjackson-pellicle/hybpiper-nf and https://github.com/chrisjackson-pellicle/paragone-nf.
Affiliation(s)
- Chris Jackson
  - Royal Botanic Gardens Victoria, Birdwood Avenue, Melbourne, Victoria 3004, Australia
- Todd McLay
  - Royal Botanic Gardens Victoria, Birdwood Avenue, Melbourne, Victoria 3004, Australia
  - Centre for Australian National Biodiversity Research, CSIRO, Clunies Ross Street, Canberra 2601, Australian Capital Territory, Australia
  - School of Biosciences, The University of Melbourne, Parkville, Melbourne, Victoria 3010, Australia
- Alexander N. Schmidt‐Lebuhn
  - Centre for Australian National Biodiversity Research, CSIRO, Clunies Ross Street, Canberra 2601, Australian Capital Territory, Australia
4
Fonseca LHM, Carlsen MM, Fine PVA, Lohmann LG. A nuclear target sequence capture probe set for phylogeny reconstruction of the charismatic plant family Bignoniaceae. Front Genet 2023; 13:1085692. [PMID: 36699458 PMCID: PMC9869424 DOI: 10.3389/fgene.2022.1085692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 12/12/2022] [Indexed: 01/11/2023] [Open Access]
Abstract
The plant family Bignoniaceae is a conspicuous and charismatic element of the tropical flora. The family has a complex taxonomic history, with substantial changes in the classification of the group during the past two centuries. Recent re-classifications at the tribal and generic levels have been made possible largely by the availability of molecular phylogenies reconstructed using Sanger sequencing data. However, our understanding of the systematics, evolution, and biogeography of the family remains incomplete, especially due to the low resolution and support of different portions of the Bignoniaceae phylogeny. To overcome these limitations and increase the amount of molecular data available for phylogeny reconstruction within this plant family, we developed a bait kit targeting 762 nuclear genes, including 329 genes selected specifically for the Bignoniaceae; 348 genes obtained from the Angiosperms353 with baits designed specifically for the family; and 85 low-copy genes of known function. On average, 77.4% of the reads mapped to the targets, and 755 genes were obtained per species. After removing genes with putative paralogs, 677 loci were used for phylogenetic analyses. On-target genes were compared and combined in the Exon-Only dataset, and on-target + off-target regions were combined in the Supercontig dataset. We tested the performance of the bait kit at different taxonomic levels, from family to species level, using 38 specimens of 36 different species of Bignoniaceae, representing: 1) six (out of eight) tribal-level clades (i.e., Bignonieae, Oroxyleae, Tabebuia Alliance, Paleotropical Clade, Tecomeae, and Jacarandeae), with only Tourrettieae and Catalpeae not sampled; 2) all 20 genera of Bignonieae; 3) seven (out of nine) species of Dolichandra (i.e., D. chodatii, D. cynanchoides, D. dentata, D. hispida, D. quadrivalvis, D. uncata, and D. unguis-cati), with only D. steyermarkii and D. unguiculata not sampled; and 4) three individuals of Dolichandra unguis-cati. Our data reconstructed a well-supported phylogeny of the Bignoniaceae at different taxonomic scales, opening new perspectives for a comprehensive phylogenetic framework for the family as a whole.
Affiliation(s)
- Luiz Henrique M. Fonseca (corresponding author)
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
  - Systematic and Evolutionary Botany Laboratory, Department of Biology, Ghent University, Ghent, Belgium
- Paul V. A. Fine
  - University and Jepson Herbaria, and Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, United States
- Lúcia G. Lohmann (corresponding author)
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
  - University and Jepson Herbaria, and Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, United States
5
Wang Y, Ruhsam M, Milne R, Graham SW, Li J, Tao T, Zhang Y, Mao K. Incomplete lineage sorting and local extinction shaped the complex evolutionary history of the Paleogene relict conifer genus, Chamaecyparis (Cupressaceae). Mol Phylogenet Evol 2022; 172:107485. [PMID: 35452840 DOI: 10.1016/j.ympev.2022.107485] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/01/2021] [Revised: 03/26/2022] [Accepted: 04/05/2022] [Indexed: 11/24/2022]
Abstract
Inferring an accurate biogeographic history of plant taxa with an East Asia (EA)-North America (NA) disjunction is usually hindered by conflicting phylogenies and a poor fossil record. The current distribution of Chamaecyparis (false cypress; Cupressaceae), with four species in EA and one each in western and eastern NA, together with its relatively rich fossil record, makes it an excellent model for studying the EA-NA disjunction. Here we reconstruct phylogenomic relationships within Chamaecyparis using > 1400 homologous nuclear and 61 plastid genes. Our phylogenomic analyses using concatenated and coalescent approaches revealed strong cytonuclear discordance and conflicting topologies between nuclear gene trees. Incomplete lineage sorting (ILS) and hybridization are possible explanations of conflict; however, our coalescent analyses and simulations suggest that ILS is the major contributor to the observed phylogenetic discrepancies. Based on a well-resolved species tree and four fossil calibrations, the crown lineage of Chamaecyparis is estimated to have originated in the Upper Cretaceous, followed by diversification events in the early and middle Paleogene. Ancestral area reconstructions suggest that Chamaecyparis had an ancestral range spanning both EA and NA. Fossil records further indicate that this genus is a relict of the "boreotropical" flora, and that local extinctions of European species were caused by global cooling. Overall, our results unravel a complex evolutionary history of a Paleogene relict conifer genus, which may have involved ILS, hybridization and the extinction of local species.
Affiliation(s)
- Yi Wang
  - Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu 610065, Sichuan, China
- Markus Ruhsam
  - Royal Botanic Garden Edinburgh, 20A Inverleith Row, Edinburgh EH3 5LR, UK
- Richard Milne
  - Institute of Molecular Plant Science, School of Biological Science, University of Edinburgh, Edinburgh EH9 3BF, UK
- Sean W Graham
  - Department of Botany, University of British Columbia, Vancouver, V6T 1Z4, Canada
- Jialiang Li
  - Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu 610065, Sichuan, China
- Tongzhou Tao
  - Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu 610065, Sichuan, China
- Yujiao Zhang
  - Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu 610065, Sichuan, China
- Kangshan Mao
  - Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu 610065, Sichuan, China
  - College of Science, Tibet University, Lhasa 850000, Xizang Autonomous Region, PR China
6
McLay TGB, Birch JL, Gunn BF, Ning W, Tate JA, Nauheimer L, Joyce EM, Simpson L, Schmidt‐Lebuhn AN, Baker WJ, Forest F, Jackson CJ. New targets acquired: Improving locus recovery from the Angiosperms353 probe set. Appl Plant Sci 2021; 9:APS311420. [PMID: 34336399 PMCID: PMC8312740 DOI: 10.1002/aps3.11420] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 10/06/2020] [Accepted: 03/15/2021] [Indexed: 05/10/2023]
Abstract
PREMISE Universal target enrichment kits maximize utility across wide evolutionary breadth while minimizing the number of baits required to create a cost-efficient kit. The Angiosperms353 kit has been successfully used to capture loci throughout the angiosperms, but the default target reference file includes sequence information from only 6-18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to references, resulting in fewer on-target reads for assembly, and reducing locus recovery. METHODS We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a 'mega353' target file, with each locus represented by 17-373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene data sets. RESULTS Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 32%, increased locus recovery at 75% length by 49%, and increased the total length of the concatenated loci by 29%. DISCUSSION Increasing the phylogenetic density of the target reference file results in improved recovery of target capture loci. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets.
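The "locus recovery at 75% length" metric mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a locus counts as recovered when the assembled sequence reaches at least 75% of the target length; the function name `recovery_at_threshold`, the locus names, and all lengths are hypothetical toy values, not data or code from the paper or from HybPiper.

```python
def recovery_at_threshold(recovered, targets, threshold=0.75):
    """Return the fraction of target loci recovered at >= threshold of target length.

    recovered: dict mapping locus name -> assembled length (missing loci count as 0)
    targets:   dict mapping locus name -> reference target length
    """
    hits = sum(
        1
        for locus, target_len in targets.items()
        if recovered.get(locus, 0) >= threshold * target_len
    )
    return hits / len(targets)


# Hypothetical toy data: four target loci and two assembly runs of the same sample.
target_lengths = {"locusA": 1000, "locusB": 800, "locusC": 1200, "locusD": 600}
default_run = {"locusA": 900, "locusB": 300, "locusC": 500}
expanded_run = {"locusA": 950, "locusB": 640, "locusC": 700, "locusD": 450}

default_recovery = recovery_at_threshold(default_run, target_lengths)
expanded_recovery = recovery_at_threshold(expanded_run, target_lengths)
print(default_recovery, expanded_recovery)
```

In this toy case only one of four loci passes the threshold in the first run and three of four in the second, which is the kind of per-sample comparison the abstract summarizes as a percentage change.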
Affiliation(s)
- Todd G. B. McLay
  - National Herbarium of Victoria, Royal Botanic Gardens Victoria, Melbourne, Australia
  - School of Biosciences, University of Melbourne, Melbourne, Australia
  - Centre for Australian National Biodiversity Research, CSIRO, Canberra, Australia
- Joanne L. Birch
  - School of Biosciences, University of Melbourne, Melbourne, Australia
- Bee F. Gunn
  - National Herbarium of Victoria, Royal Botanic Gardens Victoria, Melbourne, Australia
  - School of Biosciences, University of Melbourne, Melbourne, Australia
- Weixuan Ning
  - School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Jennifer A. Tate
  - School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Lars Nauheimer
  - James Cook University, Cairns, Australia
  - Australian Tropical Herbarium, James Cook University, Cairns, Australia
- Elizabeth M. Joyce
  - James Cook University, Cairns, Australia
  - Australian Tropical Herbarium, James Cook University, Cairns, Australia
- Lalita Simpson
  - James Cook University, Cairns, Australia
  - Australian Tropical Herbarium, James Cook University, Cairns, Australia
- Félix Forest
  - Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AE, United Kingdom
- Chris J. Jackson
  - National Herbarium of Victoria, Royal Botanic Gardens Victoria, Melbourne, Australia
7
Nauheimer L, Weigner N, Joyce E, Crayn D, Clarke C, Nargar K. HybPhaser: A workflow for the detection and phasing of hybrids in target capture data sets. Appl Plant Sci 2021; 9:APS311441. [PMID: 34336402 PMCID: PMC8312746 DOI: 10.1002/aps3.11441] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/16/2020] [Accepted: 04/28/2021] [Indexed: 05/24/2023]
Abstract
PREMISE Hybrids contain divergent alleles that can confound phylogenetic analyses but can provide insights into reticulated evolution when identified and phased. We developed a workflow to detect hybrids in target capture data sets and phase reads into parental lineages using a similarity and phylogenetic framework. METHODS We used Angiosperms353 target capture data for Nepenthes, including known hybrids to test the novel workflow. Reference mapping was used to assess heterozygous sites across the data set and to detect hybrid accessions and paralogous genes. Hybrid samples were phased by mapping reads to multiple references and sorting reads according to similarity. Phased accessions were included in the phylogenetic framework. RESULTS All known Nepenthes hybrids and nine additional samples had high levels of heterozygous sites, had reads associated with multiple divergent clades, and were phased into accessions resembling divergent haplotypes. Phylogenetic analysis including phased accessions increased clade support and confirmed parental lineages of hybrids. DISCUSSION HybPhaser provides a novel approach to detect and phase hybrids in target capture data sets, which can provide insights into reticulations by revealing origins of hybrids and reduce conflicting signal, leading to more robust phylogenetic analyses.
Affiliation(s)
- Lars Nauheimer
  - Australian Tropical Herbarium, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
  - Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
  - Centre for Tropical Environmental Sustainability Science, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
- Nicholas Weigner
  - Australian Tropical Herbarium, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
- Elizabeth Joyce
  - Australian Tropical Herbarium, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
  - Centre for Tropical Environmental Sustainability Science, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
- Darren Crayn
  - Australian Tropical Herbarium, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
  - Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
  - Centre for Tropical Environmental Sustainability Science, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
- Charles Clarke
  - Australian Tropical Herbarium, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
  - Cairns Botanic Gardens, Collins Avenue, Edge Hill, Queensland 4870, Australia
- Katharina Nargar
  - Australian Tropical Herbarium, James Cook University, McGregor Road, Smithfield, Queensland 4878, Australia
  - National Research Collections Australia, Commonwealth Scientific and Industrial Research Organisation (CSIRO), GPO Box 1700, Canberra, Australian Capital Territory 2601, Australia
8
Grewe F, Ametrano C, Widhelm TJ, Leavitt S, Distefano I, Polyiam W, Pizarro D, Wedin M, Crespo A, Divakar PK, Lumbsch HT. Using target enrichment sequencing to study the higher-level phylogeny of the largest lichen-forming fungi family: Parmeliaceae (Ascomycota). IMA Fungus 2020; 11:27. [PMID: 33317627 PMCID: PMC7734834 DOI: 10.1186/s43008-020-00051-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/24/2020] [Accepted: 11/29/2020] [Indexed: 11/10/2022] [Open Access]
Abstract
Parmeliaceae is the largest family of lichen-forming fungi with a worldwide distribution. We used a target enrichment data set and a qualitative selection method for 250 out of 350 genes to infer the phylogeny of the major clades in this family including 81 taxa, with both subfamilies and all seven major clades previously recognized in the subfamily Parmelioideae. The reduced genome-scale data set was analyzed using concatenated-based Bayesian inference and two different Maximum Likelihood analyses, and a coalescent-based species tree method. The resulting topology was strongly supported with the majority of nodes being fully supported in all three concatenated-based analyses. The two subfamilies and each of the seven major clades in Parmelioideae were strongly supported as monophyletic. In addition, most backbone relationships in the topology were recovered with high nodal support. The genus Parmotrema was found to be polyphyletic and consequently, it is suggested to accept the genus Crespoa to accommodate the species previously placed in Parmotrema subgen. Crespoa. This study demonstrates the power of reduced genome-scale data sets to resolve phylogenetic relationships with high support. Due to lower costs, target enrichment methods provide a promising avenue for phylogenetic studies including larger taxonomic/specimen sampling than whole genome data would allow.
Affiliation(s)
- Felix Grewe
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Claudio Ametrano
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Todd J Widhelm
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Steven Leavitt
  - Department of Biology and M. L. Bean Life Science Museum, Brigham Young University, Provo, UT, USA
- Isabel Distefano
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Wetchasart Polyiam
  - Lichen Research Unit, Biology Department, Faculty of Science, Ramkhamhaeng University, Ramkhamhaeng 24 Road, Bangkok, 10240, Thailand
- David Pizarro
  - Departamento de Farmacología, Farmacognosia y Botánica, Facultad de Farmacia, Universidad Complutense de Madrid, 28040, Madrid, Spain
- Mats Wedin
  - Department of Botany, Swedish Museum of Natural History, PO Box 50007, SE-104 05, Stockholm, Sweden
- Ana Crespo
  - Departamento de Farmacología, Farmacognosia y Botánica, Facultad de Farmacia, Universidad Complutense de Madrid, 28040, Madrid, Spain
- Pradeep K Divakar
  - Departamento de Farmacología, Farmacognosia y Botánica, Facultad de Farmacia, Universidad Complutense de Madrid, 28040, Madrid, Spain
- H Thorsten Lumbsch
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
9
Jantzen JR, Amarasinghe P, Folk RA, Reginato M, Michelangeli FA, Soltis DE, Cellinese N, Soltis PS. A two-tier bioinformatic pipeline to develop probes for target capture of nuclear loci with applications in Melastomataceae. Appl Plant Sci 2020; 8:e11345. [PMID: 32477841 PMCID: PMC7249273 DOI: 10.1002/aps3.11345] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/09/2019] [Accepted: 12/20/2019] [Indexed: 05/21/2023]
Abstract
PREMISE Putatively single-copy nuclear (SCN) loci, which are identified using genomic resources of closely related species, are ideal for phylogenomic inference. However, suitable genomic resources are not available for many clades, including Melastomataceae. We introduce a versatile approach to identify SCN loci for clades with few genomic resources and use it to develop probes for target enrichment in the distantly related Memecylon and Tibouchina (Melastomataceae). METHODS We present a two-tiered pipeline. First, we identified putatively SCN loci using MarkerMiner and transcriptomes from distantly related species in Melastomataceae. Published loci and genes of functional significance were then added (384 total loci). Second, using HybPiper, we retrieved 689 homologous template sequences for these loci using genome-skimming data from within the focal clades. RESULTS We sequenced 193 loci common to Memecylon and Tibouchina. Probes designed from 56 template sequences successfully targeted sequences in both clades. Probes designed from genome-skimming data within a focal clade were more successful than probes designed from other sources. DISCUSSION Our pipeline successfully identified and targeted SCN loci in Memecylon and Tibouchina, enabling phylogenomic studies in both clades and potentially across Melastomataceae. This pipeline could be easily applied to other clades with few genomic resources.
Affiliation(s)
- Johanna R. Jantzen
  - Department of Biology, University of Florida, Gainesville, Florida 32611, USA
  - Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611, USA
- Prabha Amarasinghe
  - Department of Biology, University of Florida, Gainesville, Florida 32611, USA
  - Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611, USA
- Ryan A. Folk
  - Department of Biological Sciences, Mississippi State University, Starkville, Mississippi 39762, USA
- Marcelo Reginato
  - Institute of Systematic Botany, The New York Botanical Garden, Bronx, New York 10458, USA
  - Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul 90040‐060, Brazil
- Douglas E. Soltis
  - Department of Biology, University of Florida, Gainesville, Florida 32611, USA
  - Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611, USA
- Nico Cellinese
  - Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611, USA
- Pamela S. Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611, USA
10
Herrando-Moraira S. Exploring data processing strategies in NGS target enrichment to disentangle radiations in the tribe Cardueae (Compositae). Mol Phylogenet Evol 2018; 128:69-87. [PMID: 30036700 DOI: 10.1016/j.ympev.2018.07.012] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 04/14/2018] [Revised: 07/13/2018] [Accepted: 07/14/2018] [Indexed: 12/17/2022]
Abstract
Target enrichment is a cost-effective sequencing technique that holds promise for elucidating evolutionary relationships in fast-evolving lineages. However, potential biases and the impact of bioinformatic sequence treatments on phylogenetic inference have not yet been thoroughly explored. Here, we investigate this issue with the ultimate goal of shedding light on a highly diversified group of Compositae (Asteraceae) comprising four main genera: Arctium, Cousinia, Saussurea, and Jurinea. Specifically, we compared sequence data extraction methods implemented in two easy-to-use workflows, PHYLUCE and HybPiper, and assessed the impact of two filtering practices intended to reduce phylogenetic noise. In addition, we compared two phylogenetic inference methods: (1) the concatenation approach, in which all loci were concatenated in a supermatrix; and (2) the coalescence approach, in which gene trees were produced independently and then used to construct a species tree under coalescence assumptions. Here we confirm the usefulness of the set of 1061 COS targets (a nuclear conserved orthology loci set developed for the Compositae) across a variety of taxonomic levels. Intergeneric relationships were completely resolved: there are two sister groups, Arctium-Cousinia and Saussurea-Jurinea, which are in agreement with a morphological hypothesis. Intrageneric relationships among species of Arctium, Cousinia, and Saussurea are also well defined. Conversely, conflicting species relationships remain for Jurinea. Methodological choices significantly affected phylogenies in terms of topology, branch length, and support. Across all analyses, the phylogeny obtained using HybPiper and the strictest scheme of removing fast-evolving sites was estimated to be optimal. Regarding methodological choices, we conclude that: (1) trees obtained under the coalescence approach are topologically more congruent with one another than those inferred using the concatenation approach; (2) refining treatments only improved support values under the concatenation approach; and (3) branch support values are maximized when fast-evolving sites are removed in the concatenation approach, and when a higher number of loci is analyzed in the coalescence approach.
Affiliation(s)
- Sonia Herrando-Moraira
- Botanic Institute of Barcelona (IBB, CSIC-ICUB), Pg. del Migdia, s.n., 08038 Barcelona, Spain.
|
11
|
Wolf PG, Robison TA, Johnson MG, Sundue MA, Testo WL, Rothfels CJ. Target sequence capture of nuclear-encoded genes for phylogenetic analysis in ferns. Appl Plant Sci 2018; 6:e01148. [PMID: 30131890 PMCID: PMC5991577 DOI: 10.1002/aps3.1148] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/31/2017] [Accepted: 03/04/2018] [Indexed: 05/07/2023]
Abstract
PREMISE OF THE STUDY: Until recently, most phylogenetic studies of ferns were based on chloroplast genes. Evolutionary inferences based on these data can be incomplete because the characters are from a single linkage group and are uniparentally inherited. These limitations are particularly acute in studies of hybridization, which is prevalent in ferns; fern hybrids are common and ferns are able to hybridize across highly diverged lineages, up to 60 million years since divergence in one documented case. However, it is not yet clear what effect such hybridization has on fern evolution, in part due to a paucity of available biparentally inherited (nuclear-encoded) markers. METHODS: We designed oligonucleotide baits to capture 25 targeted, low-copy nuclear markers from a sample of 24 species spanning extant fern diversity. RESULTS: Most loci were successfully sequenced from most accessions. Although the baits were designed from exon (transcript) data, we successfully captured intron sequences that should be useful for more focused phylogenetic studies. We present phylogenetic analyses of the new target sequence capture data and integrate these into a previous transcript-based data set. DISCUSSION: We make our bait sequences available to the community as a resource for further studies of fern phylogeny.
Affiliation(s)
- Paul G. Wolf
- Ecology Center and Department of Biology, Utah State University, Logan, Utah 84322, USA
- Tanner A. Robison
- Ecology Center and Department of Biology, Utah State University, Logan, Utah 84322, USA
- Matthew G. Johnson
- Department of Biological Sciences, Texas Tech University, Lubbock, Texas 79409, USA
- Michael A. Sundue
- Pringle Herbarium, Department of Plant Biology, University of Vermont, Burlington, Vermont 05405, USA
- Weston L. Testo
- Pringle Herbarium, Department of Plant Biology, University of Vermont, Burlington, Vermont 05405, USA
- Carl J. Rothfels
- University Herbarium and Department of Integrative Biology, University of California, Berkeley, California 94720, USA
|