1
Höhna S, Lower SE, Duchen P, Catalán A. Robustness of divergence time estimation despite gene tree estimation error: a case study of fireflies (Coleoptera: Lampyridae). Syst Biol 2025; 74:335-348. [PMID: 39534920 DOI: 10.1093/sysbio/syae065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Revised: 08/28/2024] [Accepted: 11/04/2024] [Indexed: 11/16/2024]
Abstract
Genomic data have become ubiquitous in phylogenomic studies, including divergence time estimation, but provide new challenges. These challenges include, among others, biological gene tree discordance, methodological gene tree estimation error, and computational limitations on performing full Bayesian inference under complex models. In this study, we use a recently published firefly (Coleoptera: Lampyridae) anchored hybrid enrichment data set (AHE; 436 loci for 88 Lampyridae species and 10 outgroup species) as a case study to explore gene tree estimation error and the robustness of divergence time estimation. First, we explored the amount of model violation using posterior predictive simulations because model violations are likely to bias phylogenetic inferences and produce gene tree estimation error. We specifically focused on missing data (either uniformly distributed or systematically) and the distribution of highly variable and conserved sites (either uniformly distributed or clustered). Our assessment of model adequacy showed that standard phylogenetic substitution models are not adequate for any of the 436 AHE loci. We tested if the model violations and alignment errors resulted indeed in gene tree estimation error by comparing the observed gene tree discordance to simulated gene tree discordance under the multispecies coalescent model. Thus, we show that the inferred gene tree discordance is not only due to biological mechanism but primarily due to inference errors. Lastly, we explored if divergence time estimation is robust despite the observed gene tree estimation error. We selected four subsets of the full AHE data set, concatenated each subset and performed a Bayesian relaxed clock divergence estimation in RevBayes. The estimated divergence times overlapped for all nodes that are shared between the topologies. Thus, divergence time estimation is robust using any well selected data subset as long as the topology inference is robust.
Affiliation(s)
- Sebastian Höhna
  - GeoBio-Center, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
  - Department of Earth and Environmental Sciences, Paleontology & Geobiology, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
- Sarah E Lower
  - Department of Biology, Bucknell University, Lewisburg, PA 17837, United States
- Pablo Duchen
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg Universität Mainz, 55128 Mainz, Germany
- Ana Catalán
  - GeoBio-Center, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
  - Division of Evolutionary Biology, Ludwig-Maximilians-Universität München, 82152 Planegg-Martinsried, Germany
2
Collienne L, Barker M, Suchard MA, Matsen FA. Phylogenetic Tree Instability After Taxon Addition: Empirical Frequency, Predictability, and Consequences For Online Inference. Syst Biol 2025; 74:101-111. [PMID: 39453463 PMCID: PMC11809580 DOI: 10.1093/sysbio/syae059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 09/30/2024] [Accepted: 10/22/2024] [Indexed: 10/26/2024]
Abstract
Online phylogenetic inference methods add sequentially arriving sequences to an inferred phylogeny without the need to recompute the entire tree from scratch. Some online method implementations exist already, but there remains concern that additional sequences may change the topological relationship among the original set of taxa. We call such a change in tree topology a lack of stability for the inferred tree. In this article, we analyze the stability of single taxon addition in a Maximum Likelihood framework across 1000 empirical datasets. We find that instability occurs in almost 90% of our examples, although observed topological differences do not always reach significance under the approximately unbiased (AU) test. Changes in tree topology after addition of a taxon rarely occur close to its attachment location, and are more frequently observed in more distant tree locations carrying low bootstrap support. To investigate whether instability is predictable, we hypothesize sources of instability and design summary statistics addressing these hypotheses. Using these summary statistics as input features for machine learning under random forests, we are able to predict instability and can identify the most influential features. In summary, it does not appear that a strict insertion-only online inference method will deliver globally optimal trees, although relaxing insertion strictness by allowing for a small number of final tree rearrangements or accepting slightly suboptimal solutions appears feasible.
Affiliation(s)
- Lena Collienne
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
- Mary Barker
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
- Marc A Suchard
  - Department of Human Genetics, University of California, 885 Tiverton Drive, Los Angeles, CA 90095, USA
  - Department of Computational Medicine, University of California, 885 Tiverton Drive, Los Angeles, CA 90095, USA
  - Department of Biostatistics, University of California, 650 Charles E. Young Dr. South, Los Angeles, CA 90095, USA
- Frederick A Matsen
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
  - Department of Statistics, University of Washington, Padelford Hall, Northeast Stevens Way, Seattle, WA 98195, USA
  - Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, WA 98195, USA
3
Toups BS, Thomson RC, Brown JM. Complex Models of Sequence Evolution Improve Fit, But Not Gene Tree Discordance, for Tetrapod Mitogenomes. Syst Biol 2025; 74:86-100. [PMID: 39392926 DOI: 10.1093/sysbio/syae056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 08/22/2024] [Accepted: 10/09/2024] [Indexed: 10/13/2024]
Abstract
Variation in gene tree estimates is widely observed in empirical phylogenomic data and is often assumed to be the result of biological processes. However, a recent study using tetrapod mitochondrial genomes to control for biological sources of variation due to their haploid, uniparentally inherited, and non-recombining nature found that levels of discordance among mitochondrial gene trees were comparable to those found in studies that assume only biological sources of variation. Additionally, they found that several of the models of sequence evolution chosen to infer gene trees were doing an inadequate job of fitting the sequence data. These results indicated that significant amounts of gene tree discordance in empirical data may be due to poor fit of sequence evolution models and that more complex and biologically realistic models may be needed. To test how the fit of sequence evolution models relates to gene tree discordance, we analyzed the same mitochondrial data sets as the previous study using 2 additional, more complex models of sequence evolution that each include a different biologically realistic aspect of the evolutionary process: A covarion model to incorporate site-specific rate variation across lineages (heterotachy), and a partitioned model to incorporate variable evolutionary patterns by codon position. Our results show that both additional models fit the data better than the models used in the previous study, with the covarion being consistently and strongly preferred as tree size increases. However, even these more preferred models still inferred highly discordant mitochondrial gene trees, thus deepening the mystery around what we label the "Mito-Phylo Paradox" and leading us to ask whether the observed variation could, in fact, be biological in nature after all.
Affiliation(s)
- Benjamin S Toups
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
- Robert C Thomson
  - School of Life Sciences, University of Hawai'i, 3190 Maile Way, St. John 101, Honolulu, HI 96822, USA
- Jeremy M Brown
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, 202 Life Sciences Bldg, Baton Rouge, LA 70803, USA
4
Jin Y, Du X, Jiang C, Ji W, Yang P. Disentangling sources of gene tree discordance for Hordeum species via target-enriched sequencing assays. Mol Phylogenet Evol 2024; 199:108160. [PMID: 39019201 DOI: 10.1016/j.ympev.2024.108160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2024] [Revised: 07/04/2024] [Accepted: 07/14/2024] [Indexed: 07/19/2024]
Abstract
Hordeum is an economically and evolutionarily important genus within the Triticeae tribe of the family Poaceae, and contains 33 widely distributed and diverse species which cytologically represent four subgenomes (H, Xa, Xu and I). These wild species (except Hordeum spontaneum, which is the primary gene pool of barley) are secondary or tertiary gene-pool germplasms for barley and wheat improvement, and uncovering their complicated evolutionary relationships would benefit future breeding programs. Here, we developed a complexity-reduced pipeline that captures genome-wide distributed fragments via two novel target-enriched assays (HorCap v1.0 and BarPlex v1.0) in conjunction with high-throughput sequencing of the enrichments. Both assays were tested for genotyping 40 species from three genera (Hordeum, Triticum, and Aegilops) comprising 82 samples from 67 accessions. Either assay worked efficiently in genotyping, while integration of both assays can significantly improve the robustness and resolution of the Hordeum phylogenetic trees. Interestingly, incomplete lineage sorting (ILS) was inferred for the first time as the major factor causing phylogenetic discordance among the four subgenomes, whereas in New World species (carrying the I genome) post-speciation introgression events were revealed. Through revising the evolutionary relationships of the Hordeum species based on an ancestral state reconstruction for the diploids and parental donor inference for the polyploids, our results raised new queries about the Hordeum phylogeny. Moreover, both newly developed assays are applicable in genotyping and phylogenetic analysis of Hordeum and other Triticeae wild species.
Affiliation(s)
- Yanlong Jin
  - State Key Laboratory of Crop Gene Resources and Breeding, Key Laboratory of Grain Crop Genetic Resources Evaluation and Utilization (MARA), Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; State Key Laboratory of Crop Stress Biology for Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Xin Du
  - State Key Laboratory of Crop Stress Biology for Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Congcong Jiang
  - State Key Laboratory of Crop Gene Resources and Breeding, Key Laboratory of Grain Crop Genetic Resources Evaluation and Utilization (MARA), Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Wanquan Ji
  - State Key Laboratory of Crop Stress Biology for Arid Areas, College of Agronomy, Northwest A&F University, Yangling 712100, China
- Ping Yang
  - State Key Laboratory of Crop Gene Resources and Breeding, Key Laboratory of Grain Crop Genetic Resources Evaluation and Utilization (MARA), Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
5
Fabreti LG, Coghill LM, Thomson RC, Höhna S, Brown JM. The Expected Behaviors of Posterior Predictive Tests and Their Unexpected Interpretation. Mol Biol Evol 2024; 41:msae051. [PMID: 38437512 PMCID: PMC10946647 DOI: 10.1093/molbev/msae051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 01/09/2024] [Indexed: 03/06/2024]
Abstract
Poor fit between models of sequence or trait evolution and empirical data is known to cause biases and lead to spurious conclusions about evolutionary patterns and processes. Bayesian posterior prediction is a flexible and intuitive approach for detecting such cases of poor fit. However, the expected behavior of posterior predictive tests has never been characterized for evolutionary models, which is critical for their proper interpretation. Here, we show that the expected distribution of posterior predictive P-values is generally not uniform, in contrast to frequentist P-values used for hypothesis testing, and extreme posterior predictive P-values often provide more evidence of poor fit than typically appreciated. Posterior prediction assesses model adequacy under highly favorable circumstances, because the model is fitted to the data, which leads to expected distributions that are often concentrated around intermediate values. Nonuniform expected distributions of P-values do not pose a problem for the application of these tests, however, and posterior predictive P-values can be interpreted as the posterior probability that the fitted model would predict a dataset with a test statistic value as extreme as the value calculated from the observed data.
Affiliation(s)
- Luiza Guimarães Fabreti
  - GeoBio-Center, Ludwig-Maximilians-Universität München, Richard-Wagner-Str. 10, Munich 80333, Germany
  - Department of Earth and Environmental Sciences, Paleontology & Geobiology, Ludwig-Maximilians-Universität München, Richard-Wagner-Str. 10, Munich 80333, Germany
- Lyndon M Coghill
  - Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, USA
  - Present address: Division of Research, Innovation, and Impact & Department of Veterinary Pathobiology, University of Missouri, Columbia, MO 65211, USA
- Robert C Thomson
  - School of Life Sciences, University of Hawai‘i at Mānoa, Honolulu, HI 96822, USA
- Sebastian Höhna
  - GeoBio-Center, Ludwig-Maximilians-Universität München, Richard-Wagner-Str. 10, Munich 80333, Germany
  - Department of Earth and Environmental Sciences, Paleontology & Geobiology, Ludwig-Maximilians-Universität München, Richard-Wagner-Str. 10, Munich 80333, Germany
- Jeremy M Brown
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
6
Guimarães Fabreti L, Höhna S. Nucleotide Substitution Model Selection Is Not Necessary for Bayesian Inference of Phylogeny With Well-Behaved Priors. Syst Biol 2023; 72:1418-1432. [PMID: 37455495 DOI: 10.1093/sysbio/syad041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 06/02/2023] [Accepted: 07/02/2023] [Indexed: 07/18/2023]
Abstract
Model selection aims to choose the most adequate model for the statistical analysis at hand. The model must be complex enough to capture the complexity of the data but should be simple enough not to overfit. In phylogenetics, the most common model selection scenario concerns selecting an adequate substitution and partition model for sequence evolution to infer a phylogenetic tree. Previously, several studies showed that substitution model under-parameterization can bias phylogenetic studies. Here, we explored the impact of substitution model over-parameterization in a Bayesian statistical framework. We performed simulations under the simplest substitution model, the Jukes-Cantor model, and compared posterior estimates of phylogenetic tree topologies and tree length under the true model to those under the most complex model, the $\text{GTR}+\Gamma+\text{I}$ substitution model, including over-splitting the data into additional subsets (i.e., applying partitioned models). We explored 4 choices of prior distributions: the default substitution model priors of MrBayes, BEAST2, and RevBayes and a newly devised prior choice (Tame). Our results show that Bayesian inference of phylogeny is robust to substitution model over-parameterization and over-partitioning but only under our new prior settings. All 3 current default priors introduced biases for the estimated tree length. We conclude that substitution and partition model selection are superfluous steps in Bayesian phylogenetic inference pipelines if well-behaved prior distributions are applied and more effort should focus on more complex and biologically realistic substitution models.
Affiliation(s)
- Luiza Guimarães Fabreti
  - GeoBio-Center, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
  - Department of Earth and Environmental Sciences, Paleontology & Geobiology, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
- Sebastian Höhna
  - GeoBio-Center, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
  - Department of Earth and Environmental Sciences, Paleontology & Geobiology, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
7
Pardo-De la Hoz CJ, Magain N, Piatkowski B, Cornet L, Dal Forno M, Carbone I, Miadlikowska J, Lutzoni F. Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-Supported Phylogenomic Trees of Nostocalean Cyanobacteria. Syst Biol 2023; 72:694-712. [PMID: 36827095 DOI: 10.1093/sysbio/syad008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 02/12/2023] [Accepted: 02/22/2023] [Indexed: 02/25/2023]
Abstract
Prokaryotic genomes are often considered to be mosaics of genes that do not necessarily share the same evolutionary history due to widespread horizontal gene transfers (HGTs). Consequently, representing evolutionary relationships of prokaryotes as bifurcating trees has long been controversial. However, studies reporting conflicts among gene trees derived from phylogenomic data sets have shown that these conflicts can be the result of artifacts or evolutionary processes other than HGT, such as incomplete lineage sorting, low phylogenetic signal, and systematic errors due to substitution model misspecification. Here, we present the results of an extensive exploration of phylogenetic conflicts in the cyanobacterial order Nostocales, for which previous studies have inferred strongly supported conflicting relationships when using different concatenated phylogenomic data sets. We found that most of these conflicts are concentrated in deep clusters of short internodes of the Nostocales phylogeny, where the great majority of individual genes have low resolving power. We then inferred phylogenetic networks to detect HGT events while also accounting for incomplete lineage sorting. Our results indicate that most conflicts among gene trees are likely due to incomplete lineage sorting linked to an ancient rapid radiation, rather than to HGTs. Moreover, the short internodes of this radiation fit the expectations of the anomaly zone, i.e., a region of the tree parameter space where a species tree is discordant with its most likely gene tree. We demonstrated that concatenation of different sets of loci can recover up to 17 distinct and well-supported relationships within the putative anomaly zone of Nostocales, corresponding to the observed conflicts among well-supported trees based on concatenated data sets from previous studies. 
Our findings highlight the important role of rapid radiations as a potential cause of strongly conflicting phylogenetic relationships when using phylogenomic data sets of bacteria. We propose that polytomies may be the most appropriate phylogenetic representation of these rapid radiations that are part of anomaly zones, especially when all possible genomic markers have been considered to infer these phylogenies. [Anomaly zone; bacteria; horizontal gene transfer; incomplete lineage sorting; Nostocales; phylogenomic conflict; rapid radiation; Rhizonema.].
Affiliation(s)
- Nicolas Magain
  - Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
- Bryan Piatkowski
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- Luc Cornet
  - Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
  - BCCM/IHEM, Mycology and Aerobiology, Sciensano, Brussels, Belgium
- Ignazio Carbone
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27606, USA
8
Zhao M, Kurtis SM, White ND, Moncrieff AE, Leite RN, Brumfield RT, Braun EL, Kimball RT. Exploring Conflicts in Whole Genome Phylogenetics: A Case Study Within Manakins (Aves: Pipridae). Syst Biol 2023; 72:161-178. [PMID: 36130303 PMCID: PMC10452962 DOI: 10.1093/sysbio/syac062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 09/03/2022] [Accepted: 09/06/2022] [Indexed: 11/13/2022]
Abstract
Some phylogenetic problems remain unresolved even when large amounts of sequence data are analyzed and methods that accommodate processes such as incomplete lineage sorting are employed. In addition to investigating biological sources of phylogenetic incongruence, it is also important to reduce noise in the phylogenomic dataset by using an appropriate filtering approach that addresses gene tree estimation errors. We present the results of a case study in manakins, focusing on the very difficult clade comprising the genera Antilophia and Chiroxiphia. Previous studies suggest that Antilophia is nested within Chiroxiphia, though relationships among Antilophia+Chiroxiphia species have been highly unstable. We extracted more than 11,000 loci (ultra-conserved elements and introns) from whole genomes and conducted analyses using concatenation and multispecies coalescent methods. Topologies resulting from analyses using all loci differed depending on the data type and analytical method, with 2 clades (Antilophia+Chiroxiphia and Manacus+Pipra+Machaeropterus) in the manakin tree showing incongruent results. We hypothesized that gene trees that conflicted with a long coalescent branch (e.g., the branch uniting Antilophia+Chiroxiphia) might be enriched for cases of gene tree estimation error, so we conducted analyses that either constrained those gene trees to include monophyly of Antilophia+Chiroxiphia or excluded these loci. While constraining trees reduced some incongruence, excluding the trees led to completely congruent species trees, regardless of the data type or model of sequence evolution used. We found that a suite of gene metrics (most importantly the number of informative sites and likelihood of intralocus recombination) collectively explained the loci that resulted in non-monophyly of Antilophia+Chiroxiphia.
We also found evidence for introgression that may have contributed to the discordant topologies we observe in Antilophia+Chiroxiphia and led to deviations from expectations given the multispecies coalescent model. Our study highlights the importance of identifying factors that can obscure phylogenetic signal when dealing with recalcitrant phylogenetic problems, such as gene tree estimation error, incomplete lineage sorting, and reticulation events. [Birds; c-gene; data type; gene estimation error; model fit; multispecies coalescent; phylogenomics; reticulation].
Affiliation(s)
- Min Zhao
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Sarah M Kurtis
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Noor D White
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, Bethesda, MD 20892, USA
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
- Andre E Moncrieff
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
- Rafael N Leite
  - Graduate Program in Ecology, National Institute of Amazonian Research, Manaus, AM, Brazil
- Robb T Brumfield
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
- Edward L Braun
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Rebecca T Kimball
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
9
Tyszka AS, Bretz EC, Robertson HM, Woodcock-Girard MD, Ramanauskas K, Larson DA, Stull GW, Walker JF. Characterizing conflict and congruence of molecular evolution across organellar genome sequences for phylogenetics in land plants. Frontiers in Plant Science 2023; 14:1125107. [PMID: 37063179 PMCID: PMC10098128 DOI: 10.3389/fpls.2023.1125107] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/15/2022] [Accepted: 03/13/2023] [Indexed: 06/19/2023]
Abstract
Chloroplasts and mitochondria each contain their own genomes, which have historically been and continue to be important sources of information for inferring the phylogenetic relationships among land plants. The organelles are predominantly inherited from the same parent, and therefore should exhibit phylogenetic concordance. In this study, we examine the mitochondrion and chloroplast genomes of 226 land plants to infer the degree of similarity between the organelles' evolutionary histories. Our results show largely concordant topologies are inferred between the organelles, aside from four well-supported conflicting relationships that warrant further investigation. Despite broad patterns of topological concordance, our findings suggest that the chloroplast and mitochondrial genomes evolved with significant differences in molecular evolution. The differences result in the genes from the chloroplast and the mitochondrion preferentially clustering with other genes from their respective organelles by a program that automates selection of evolutionary model partitions for sequence alignments. Further investigation showed that changes in compositional heterogeneity are not always uniform across divergences in the land plant tree of life. These results indicate that although the chloroplast and mitochondrial genomes have coexisted for over 1 billion years, phylogenetically, they are still evolving sufficiently independently to warrant separate models of evolution. As genome sequencing becomes more accessible, research into these organelles' evolution will continue revealing insight into the ancient cellular events that shaped not only their history, but the history of plants as a whole.
Affiliation(s)
- Alexa S. Tyszka
  - Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL, United States
- Eric C. Bretz
  - Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL, United States
- Holly M. Robertson
  - Sainsbury Laboratory, School of Biological Sciences, University of Cambridge, Cambridge, England, United Kingdom
- Miles D. Woodcock-Girard
  - Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL, United States
- Karolis Ramanauskas
  - Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL, United States
- Drew A. Larson
  - Department of Biology, Indiana University, Bloomington, IN, United States
- Gregory W. Stull
  - Germplasm Bank of Wild Species in Southwest China, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, China
  - Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington, DC, United States
- Joseph F. Walker
  - Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL, United States
10
Xie P, Tang L, Luo Y, Liu C, Yan H. Plastid Phylogenomic Insights into the Inter-Tribal Relationships of Plantaginaceae. Biology 2023; 12:biology12020263. [PMID: 36829541 PMCID: PMC9953724 DOI: 10.3390/biology12020263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 01/21/2023] [Accepted: 01/26/2023] [Indexed: 02/10/2023]
Abstract
Plantaginaceae, consisting of 12 tribes, is a diverse, cosmopolitan family. To date, the inter-tribal relationships of this family have been unresolved, and the plastome structure and composition within Plantaginaceae have seldom been comprehensively investigated. In this study, we compared the plastomes from 41 Plantaginaceae species (including 6 newly sequenced samples and 35 publicly available representative species) representing 11 tribes. To clarify the inter-tribal relationships of Plantaginaceae, we inferred phylogenetic relationships based on the concatenated and coalescent analyses of 68 plastid protein-coding genes. PhyParts analysis was performed to assess the level of concordance and conflict among gene trees across the species tree. The results indicate that most plastomes of Plantaginaceae are largely conserved in terms of genome structure and gene content. In contrast to most previous studies, a robust phylogeny was recovered using plastome data, providing new insights for better understanding the inter-tribal relationships of Plantaginaceae. Both concatenated and coalescent phylogenies favored the sister relationship between Plantagineae and Digitalideae, as well as between Veroniceae and Hemiphragmeae. Sibthorpieae diverged into a separate branch which was sister to a clade comprising the four tribes mentioned above. Furthermore, the sister relationship between Russelieae and Cheloneae is strongly supported. The results of PhyParts showed gene tree congruence and conflict to varying degrees, but most plastid genes were uninformative for phylogenetic nodes, revealing the defects of previous studies using single or multiple plastid DNA sequences to infer the phylogeny of Plantaginaceae.
Affiliation(s)
- Pingxuan Xie
  - College of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Lilei Tang
  - College of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Yanzhen Luo
  - College of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou 510006, China
- Changkun Liu
  - Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Hanjing Yan
  - College of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou 510006, China
  - Key Laboratory of State Administration of Traditional Chinese Medicine for Production and Development of Cantonese Medicinal Materials, Guangzhou 510006, China
11
|
Duan Y, Fu S, Ye Z, Bu W. Phylogeny of Urostylididae (Heteroptera: Pentatomoidea) reveals rapid radiation and challenges traditional classification. ZOOL SCR 2023. [DOI: 10.1111/zsc.12582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Affiliation(s)
- Yujie Duan
  - Institute of Entomology, College of Life Sciences, Nankai University, Tianjin, China
- Siying Fu
  - Institute of Entomology, College of Life Sciences, Nankai University, Tianjin, China
- Zhen Ye
  - Institute of Entomology, College of Life Sciences, Nankai University, Tianjin, China
- Wenjun Bu
  - Institute of Entomology, College of Life Sciences, Nankai University, Tianjin, China
|
12
|
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
  - Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
  - Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Spain
|
13
|
Hutter CR, Cobb KA, Portik DM, Travers SL, Wood PL, Brown RM. FrogCap: A modular sequence capture probe-set for phylogenomics and population genetics for all frogs, assessed across multiple phylogenetic scales. Mol Ecol Resour 2022; 22:1100-1119. [PMID: 34569723 DOI: 10.1111/1755-0998.13517] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/09/2021] [Revised: 09/08/2021] [Accepted: 09/14/2021] [Indexed: 12/01/2022]
Abstract
Despite the prevalence of high-throughput sequencing in phylogenetics, many relationships remain difficult to resolve because of conflicting signal among genomic regions. Selection of different types of molecular markers from different genomic regions is required to overcome these challenges. For evolutionary studies in frogs, we introduce the publicly available FrogCap suite of genomic resources, which is a large collection of ~15,000 markers that unifies previous genetic sequencing efforts. FrogCap is designed to be modular, such that subsets of markers and SNPs can be selected based on the desired phylogenetic scale. FrogCap uses a variety of marker types that include exons and introns, ultraconserved elements, and previously sequenced Sanger markers, which span up to 10,000 bp in alignment lengths; in addition, we demonstrate potential for SNP-based analyses. We tested FrogCap using 121 samples distributed across five phylogenetic scales, comparing probes designed using a consensus- or exemplar genome-based approach. Using the consensus design is more resilient to issues with sensitivity, specificity, and missing data than picking an exemplar genome sequence. We also tested the impact of different bait kit sizes (20,020 vs. 40,040) on depth of coverage and found triple the depth for the 20,020 bait kit. We observed sequence capture success (i.e., missing data, sequenced markers/bases, marker length, and informative sites) across phylogenetic scales. The incorporation of different marker types is effective for deep phylogenetic relationships and shallow population genetics studies. Having demonstrated FrogCap's utility and modularity, we conclude that these new resources are efficacious for high-throughput sequencing projects across variable timescales.
Affiliation(s)
- Carl R Hutter
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, USA
- Kerry A Cobb
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, USA
- Daniel M Portik
  - California Academy of Sciences, San Francisco, California, USA
- Scott L Travers
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, USA
  - Department of Biological Sciences, Rutgers University-Newark, Newark, New Jersey, USA
- Perry L Wood
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, USA
- Rafe M Brown
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, USA
|
14
|
Bourke BP, Justi SA, Caicedo-Quiroga L, Pecor DB, Wilkerson RC, Linton YM. Phylogenetic analysis of the Neotropical Albitarsis Complex based on mitogenome data. Parasit Vectors 2021; 14:589. [PMID: 34838107 PMCID: PMC8627034 DOI: 10.1186/s13071-021-05090-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/03/2021] [Accepted: 11/08/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Some of the most important malaria vectors in South America belong to the Albitarsis Complex (Culicidae; Anophelinae; Anopheles). Understanding the origin, nature, and geographical distribution of species diversity in this important complex has important implications for vector incrimination, control, and management, and for modelling future responses to climate change, deforestation, and human population expansion. This study attempts to further explore species diversity and evolutionary history in the Albitarsis Complex by undertaking a characterization and phylogenetic analysis of the mitogenome of all 10 putative taxa in the Albitarsis Complex.
METHODS: Mitogenome assembly and annotation allowed for feature comparison among Albitarsis Complex and Anopheles species. Selection analysis was conducted across all 13 protein-coding genes. Maximum likelihood and Bayesian inference methods were used to construct gene and species trees, respectively. Bayesian methods were also used to jointly estimate species delimitation and species trees.
RESULTS: Gene composition and order were conserved across species within the complex. Unique signatures of positive selection were detected in two species (Anopheles janconnae and An. albitarsis G), which may have played a role in the recent and rapid diversification of the complex. The COI gene phylogeny does not fully recover the mitogenome phylogeny, and a multispecies coalescent-based phylogeny shows that considerable uncertainty exists through much of the mitogenome species tree. The origin of divergence in the complex dates to the Pliocene/Pleistocene boundary, and divergence within the distinct northern South American clade is estimated at approximately 1 million years ago. Neither the phylogenetic trees nor the delimitation approach rejected the 10-species hypothesis, although the analyses could not exclude the possibility that four putative species with scant a priori support (An. albitarsis G, An. albitarsis H, An. albitarsis I, and An. albitarsis J) represent population-level, rather than species-level, splits.
CONCLUSION: The lack of resolution in much of the species tree and the limitations of the delimitation analysis warrant future studies on the complex using genome-wide data and the inclusion of additional specimens, particularly from two putative species, An. albitarsis I and An. albitarsis J.
Affiliation(s)
- Brian P Bourke
  - Walter Reed Biosystematics Unit, Smithsonian Institution Museum Support Center, MRC-534, 4210 Silver Hill Rd., Suitland, MD, 20746, USA
  - Walter Reed Army Institute of Research, 503 Robert Grant Avenue, Silver Spring, MD, 20910, USA
  - Department of Entomology, Smithsonian Institution-National Museum of Natural History, 10th St NE & Constitution Ave NE, Washington, DC, 20002, USA
- Silvia A Justi
  - Walter Reed Biosystematics Unit, Smithsonian Institution Museum Support Center, MRC-534, 4210 Silver Hill Rd., Suitland, MD, 20746, USA
  - Walter Reed Army Institute of Research, 503 Robert Grant Avenue, Silver Spring, MD, 20910, USA
  - Department of Entomology, Smithsonian Institution-National Museum of Natural History, 10th St NE & Constitution Ave NE, Washington, DC, 20002, USA
- Laura Caicedo-Quiroga
  - Walter Reed Biosystematics Unit, Smithsonian Institution Museum Support Center, MRC-534, 4210 Silver Hill Rd., Suitland, MD, 20746, USA
  - Walter Reed Army Institute of Research, 503 Robert Grant Avenue, Silver Spring, MD, 20910, USA
  - Department of Entomology, Smithsonian Institution-National Museum of Natural History, 10th St NE & Constitution Ave NE, Washington, DC, 20002, USA
- David B Pecor
  - Walter Reed Biosystematics Unit, Smithsonian Institution Museum Support Center, MRC-534, 4210 Silver Hill Rd., Suitland, MD, 20746, USA
  - Walter Reed Army Institute of Research, 503 Robert Grant Avenue, Silver Spring, MD, 20910, USA
  - Department of Entomology, Smithsonian Institution-National Museum of Natural History, 10th St NE & Constitution Ave NE, Washington, DC, 20002, USA
- Richard C Wilkerson
  - Walter Reed Biosystematics Unit, Smithsonian Institution Museum Support Center, MRC-534, 4210 Silver Hill Rd., Suitland, MD, 20746, USA
  - Walter Reed Army Institute of Research, 503 Robert Grant Avenue, Silver Spring, MD, 20910, USA
  - Department of Entomology, Smithsonian Institution-National Museum of Natural History, 10th St NE & Constitution Ave NE, Washington, DC, 20002, USA
- Yvonne-Marie Linton
  - Walter Reed Biosystematics Unit, Smithsonian Institution Museum Support Center, MRC-534, 4210 Silver Hill Rd., Suitland, MD, 20746, USA
  - Walter Reed Army Institute of Research, 503 Robert Grant Avenue, Silver Spring, MD, 20910, USA
  - Department of Entomology, Smithsonian Institution-National Museum of Natural History, 10th St NE & Constitution Ave NE, Washington, DC, 20002, USA
|
15
|
Liston A, Weitemier KA, Letelier L, Podani J, Zong Y, Liu L, Dickinson TA. Phylogeny of Crataegus (Rosaceae) based on 257 nuclear loci and chloroplast genomes: evaluating the impact of hybridization. PeerJ 2021; 9:e12418. [PMID: 34754629 PMCID: PMC8555502 DOI: 10.7717/peerj.12418] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/04/2021] [Accepted: 10/10/2021] [Indexed: 11/20/2022] Open
Abstract
Background: Hawthorn species (Crataegus L.; Rosaceae tribe Maleae) form a well-defined clade comprising five subgeneric groups readily distinguished using either molecular or morphological data. While multiple subsidiary groups (taxonomic sections, series) are recognized within some subgenera, the number of and relationships among species in these groups are subject to disagreement. Gametophytic apomixis and polyploidy are prevalent in the genus, and disagreement concerns whether and how apomictic genotypes should be recognized taxonomically. Recent studies suggest that many polyploids arise from hybridization between members of different infrageneric groups.
Methods: We used target capture and high throughput sequencing to obtain nucleotide sequences for 257 nuclear loci and nearly complete chloroplast genomes from a sample of hawthorns representing all five currently recognized subgenera. Our sample is structured to include two examples of intersubgeneric hybrids and their putative diploid and tetraploid parents. We queried the alignment of nuclear loci directly for evidence of hybridization, and compared individual gene trees with each other, and with both the maximum likelihood plastome tree and the nuclear concatenated and multilocus coalescent-based trees. Tree comparisons provided a promising, if challenging (because of the number of comparisons involved) method for visualizing variation in tree topology. We found it useful to deploy comparisons based not only on tree-tree distances but also on a metric of tree-tree concordance that uses extrinsic information about the relatedness of the terminals in comparing tree topologies.
Results: We obtained well-supported phylogenies from plastome sequences and from a minimum of 244 low copy-number nuclear loci. These are consistent with a previous morphology-based subgeneric classification of the genus. Despite the high heterogeneity of individual gene trees, we corroborate earlier evidence for the importance of hybridization in the evolution of Crataegus. Hybridization between subgenus Americanae and subgenus Sanguineae was documented for the origin of Sanguineae tetraploids, but not for a tetraploid Americanae species. This is also the first application of target capture probes designed with apple genome sequence. We successfully assembled 95% of 257 loci in Crataegus, indicating their potential utility across the genera of the apple tribe.
Affiliation(s)
- Aaron Liston
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States of America
- Kevin A Weitemier
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States of America
  - Department of Fisheries and Wildlife, Oregon State University, Corvallis, OR, United States of America
- Lucas Letelier
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States of America
- János Podani
  - Department of Plant Systematics, Ecology and Theoretical Biology, Eötvös Lorand University, Budapest, Hungary
- Yu Zong
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States of America
  - College of Chemistry & Life Sciences, Zhejiang Normal University, Jinhua, Zhejiang, China
- Lang Liu
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Timothy A Dickinson
  - Department of Natural History, Royal Ontario Museum, Toronto, Ontario, Canada
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
|
16
|
Arato J, Fitch WT. Phylogenetic signal in the vocalizations of vocal learning and vocal non-learning birds. Philos Trans R Soc Lond B Biol Sci 2021; 376:20200241. [PMID: 34482730 PMCID: PMC8419570 DOI: 10.1098/rstb.2020.0241] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 02/09/2021] [Indexed: 11/16/2022] Open
Abstract
Some animal vocalizations develop reliably in the absence of relevant experience, but an intriguing subset of animal vocalizations is learned: they require acoustic models during ontogeny in order to develop, and the learner's vocal output reflects those models. To what extent do such learned vocalizations reflect phylogeny? We compared the degree to which phylogenetic signal is present in vocal signals from a wide taxonomic range of birds, including both vocal learners (songbirds) and vocal non-learners. We used publicly available molecular phylogenies and developed methods to analyse spectral and temporal features in a carefully curated collection of high-quality recordings of bird songs and bird calls, to yield acoustic distance measures. Our methods were initially developed using pairs of closely related North American and European bird species, and then applied to a non-overlapping random stratified sample of European birds. We found strong similarity in acoustic and genetic distances, which manifested itself as a significant phylogenetic signal, in both samples. In songbirds, both learned song and (mostly) unlearned calls allowed reconstruction of phylogenetic trees nearly isomorphic to the phylogenetic trees derived from genetic analysis. We conclude that phylogeny and inheritance constrain vocal structure to a surprising degree, even in learned birdsong. This article is part of the theme issue 'Vocal learning in animals and humans'.
Affiliation(s)
- Jozsef Arato
  - Vienna Cognitive Science Hub, University of Vienna, Vienna, Austria
- W. Tecumseh Fitch
  - Vienna Cognitive Science Hub, University of Vienna, Vienna, Austria
  - Department of Cognitive Biology, University of Vienna, Vienna, Austria
|
17
|
Duchêne DA, Mather N, Van Der Wal C, Ho SYW. Excluding loci with substitution saturation improves inferences from phylogenomic data. Syst Biol 2021; 71:676-689. [PMID: 34508605 PMCID: PMC9016599 DOI: 10.1093/sysbio/syab075] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/19/2020] [Accepted: 09/07/2021] [Indexed: 11/21/2022] Open
Abstract
The historical signal in nucleotide sequences becomes eroded over time by substitutions occurring repeatedly at the same sites. This phenomenon, known as substitution saturation, is recognized as one of the primary obstacles to deep-time phylogenetic inference using genome-scale data sets. We present a new test of substitution saturation and demonstrate its performance in simulated and empirical data. For some of the 36 empirical phylogenomic data sets that we examined, we detect substitution saturation in around 50% of loci. We found that saturation tends to be flagged as problematic in loci with highly discordant phylogenetic signals across sites. Within each data set, the loci with smaller numbers of informative sites are more likely to be flagged as containing problematic levels of saturation. The entropy saturation test proposed here is sensitive to high evolutionary rates relative to the evolutionary timeframe, while also being sensitive to several factors known to mislead phylogenetic inference, including short internal branches relative to external branches, short nucleotide sequences, and tree imbalance. Our study demonstrates that excluding loci with substitution saturation can be an effective means of mitigating the negative impact of multiple substitutions on phylogenetic inferences. [Phylogenetic model performance; phylogenomics; substitution model; substitution saturation; test statistics.]
Affiliation(s)
- David A Duchêne
  - Centre for Evolutionary Hologenomics, University of Copenhagen, 1352 Copenhagen, Denmark
- Niklas Mather
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
- Cara Van Der Wal
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
|
18
|
Chafin TK, Douglas MR, Bangs MR, Martin BT, Mussmann SM, Douglas ME. Taxonomic Uncertainty and the Anomaly Zone: Phylogenomics Disentangle a Rapid Radiation to Resolve Contentious Species (Gila robusta Complex) in the Colorado River. Genome Biol Evol 2021; 13:evab200. [PMID: 34432005 PMCID: PMC8449829 DOI: 10.1093/gbe/evab200] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 08/19/2021] [Indexed: 12/18/2022] Open
Abstract
Species are indisputable units for biodiversity conservation, yet their delimitation is fraught with both conceptual and methodological difficulties. A classic example is the taxonomic controversy surrounding the Gila robusta complex in the lower Colorado River of southwestern North America. Nominal species designations were originally defined according to weakly diagnostic morphological differences, but these conflicted with subsequent genetic analyses. Given this ambiguity, the complex was re-defined as a single polytypic unit, with the proposed "threatened" status under the U.S. Endangered Species Act of two elements being withdrawn. Here we re-evaluated the status of the complex by utilizing dense spatial and genomic sampling (n = 387 and >22 k loci), coupled with SNP-based coalescent and polymorphism-aware phylogenetic models. In doing so, we found that all three species were indeed supported as evolutionarily independent lineages, despite widespread phylogenetic discordance. To juxtapose this discrepancy with previous studies, we first categorized those evolutionary mechanisms driving discordance, then tested (and subsequently rejected) prior hypotheses which argued phylogenetic discord in the complex was driven by the hybrid origin of Gila nigra. The inconsistent patterns of diversity we found within G. robusta were instead associated with rapid Plio-Pleistocene drainage evolution, with subsequent divergence within the "anomaly zone" of tree space producing ambiguities that served to confound prior studies. Our results not only support the resurrection of the three species as distinct entities but also offer an empirical example of how phylogenetic discordance can be categorized within other recalcitrant taxa, particularly when variation is primarily partitioned at the species level.
Affiliation(s)
- Tyler K Chafin
  - Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, USA
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, USA
- Marlis R Douglas
  - Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, USA
- Max R Bangs
  - Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, USA
  - Department of Biological Science, Florida State University, Tallahassee, Florida, USA
- Bradley T Martin
  - Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, USA
  - Global Campus, University of Arkansas, Fayetteville, Arkansas, USA
- Steven M Mussmann
  - Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, USA
  - Southwestern Native Aquatic Resources and Recovery Center, U.S. Fish & Wildlife Service, Dexter, New Mexico, USA
- Michael E Douglas
  - Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, USA
|
19
|
Harrington SM, Wishingrad V, Thomson RC. Properties of Markov Chain Monte Carlo Performance across Many Empirical Alignments. Mol Biol Evol 2021; 38:1627-1640. [PMID: 33185685 PMCID: PMC8042746 DOI: 10.1093/molbev/msaa295] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022] Open
Abstract
Nearly all current Bayesian phylogenetic applications rely on Markov chain Monte Carlo (MCMC) methods to approximate the posterior distribution for trees and other parameters of the model. These approximations are only reliable if Markov chains adequately converge and sample from the joint posterior distribution. Although several studies of phylogenetic MCMC convergence exist, these have focused on simulated data sets or select empirical examples. Therefore, much that is considered common knowledge about MCMC in empirical systems derives from a relatively small family of analyses under ideal conditions. To address this, we present an overview of commonly applied phylogenetic MCMC diagnostics and an assessment of patterns of these diagnostics across more than 18,000 empirical analyses. Many analyses appeared to perform well and failures in convergence were most likely to be detected using the average standard deviation of split frequencies, a diagnostic that compares topologies among independent chains. Different diagnostics yielded different information about failed convergence, demonstrating that multiple diagnostics must be employed to reliably detect problems. The number of taxa and average branch lengths in analyses have clear impacts on MCMC performance, with more taxa and shorter branches leading to more difficult convergence. We show that the usage of models that include both Γ-distributed among-site rate variation and a proportion of invariable sites is not broadly problematic for MCMC convergence but is also unnecessary. Changes to heating and the usage of model-averaged substitution models can both offer improved convergence in some cases, but neither is a panacea.
Affiliation(s)
- Van Wishingrad
  - School of Life Sciences, University of Hawai'i, Honolulu, HI
|
20
|
Vankan M, Ho SYW, Duchêne DA. Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference. Syst Biol 2021; 71:490-500. [PMID: 34255084 PMCID: PMC8830059 DOI: 10.1093/sysbio/syab051] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/03/2021] [Revised: 06/18/2021] [Indexed: 11/12/2022] Open
Abstract
Phylogenetic analyses of genomic data provide a powerful means of reconstructing the evolutionary relationships among organisms, yet such analyses are often hindered by conflicting phylogenetic signals among loci. Identifying the signals that are most influential to species-tree estimation can help to inform the choice of data for phylogenomic analysis. We investigated this in an analysis of 30 phylogenomic data sets. For each data set, we examined the association between several branch-length characteristics of gene trees and the distance between these gene trees and the corresponding species trees. We found that the distance of each gene tree to the species tree inferred from the full data set was positively associated with variation in root-to-tip distances and negatively associated with mean branch support. However, no such associations were found for gene-tree length, a measure of the overall substitution rate at each locus. We further explored the usefulness of the best-performing branch-based characteristics for selecting loci for phylogenomic analyses. We found that loci that yield gene trees with high variation in root-to-tip distances have a disproportionately distant signal of tree topology compared with the complete data sets. These results suggest that rate variation across lineages should be taken into consideration when exploring and even selecting loci for phylogenomic analysis. [Branch support; data filtering; nucleotide substitution model; phylogenomics; substitution rate; summary coalescent methods.]
Affiliation(s)
- Mezzalina Vankan
  - School of Life and Environmental Sciences, University of Sydney, NSW 2006, Australia
  - Research School of Biology, Australian National University, ACT 2601, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, NSW 2006, Australia
- David A Duchêne
  - Research School of Biology, Australian National University, ACT 2601, Australia
  - Centre for Evolutionary Hologenomics, University of Copenhagen, Copenhagen 1352, Denmark
|
21
|
Freitas FV, Branstetter MG, Griswold T, Almeida EAB. Partitioned Gene-Tree Analyses and Gene-Based Topology Testing Help Resolve Incongruence in a Phylogenomic Study of Host-Specialist Bees (Apidae: Eucerinae). Mol Biol Evol 2021; 38:1090-1100. [PMID: 33179746 PMCID: PMC7947843 DOI: 10.1093/molbev/msaa277] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/05/2023] Open
Abstract
Incongruence among phylogenetic results has become a common occurrence in analyses of genome-scale data sets. Incongruence originates from uncertainty in underlying evolutionary processes (e.g., incomplete lineage sorting) and from difficulties in determining the best analytical approaches for each situation. To overcome these difficulties, more studies are needed that identify incongruences and demonstrate practical ways to confidently resolve them. Here, we present results of a phylogenomic study based on the analysis of 197 taxa and 2,526 ultraconserved element (UCE) loci. We investigate evolutionary relationships of Eucerinae, a diverse subfamily of apid bees (relatives of honey bees and bumble bees) with >1,200 species. We sampled representatives of all tribes within the group and >80% of genera, including two mysterious South American genera, Chilimalopsis and Teratognatha. Initial analysis of the UCE data revealed two conflicting hypotheses for relationships among tribes. To resolve the incongruence, we tested concatenation and species tree approaches and used a variety of additional strategies including locus filtering, partitioned gene-tree searches, and gene-based topological tests. We show that within-locus partitioning improves gene tree and subsequent species-tree estimation, and that this approach confidently resolves the incongruence observed in our data set. After exploring our proposed analytical strategy on eucerine bees, we validated its efficacy to resolve hard phylogenetic problems by implementing it on a published UCE data set of Adephaga (Insecta: Coleoptera). Our results provide a robust phylogenetic hypothesis for Eucerinae and demonstrate a practical strategy for resolving incongruence in other phylogenomic data sets.
Affiliation(s)
- Felipe V Freitas
- Laboratório de Biologia Comparada e Abelhas (LBCA), Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
- U.S. Department of Agriculture, Agricultural Research Service (USDA-ARS), Pollinating Insects Research Unit, Utah State University, Logan, UT
| | - Michael G Branstetter
- U.S. Department of Agriculture, Agricultural Research Service (USDA-ARS), Pollinating Insects Research Unit, Utah State University, Logan, UT
| | - Terry Griswold
- U.S. Department of Agriculture, Agricultural Research Service (USDA-ARS), Pollinating Insects Research Unit, Utah State University, Logan, UT
| | - Eduardo A B Almeida
- Laboratório de Biologia Comparada e Abelhas (LBCA), Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
| |
|
22
|
Jiang X, Edwards SV, Liu L. The Multispecies Coalescent Model Outperforms Concatenation Across Diverse Phylogenomic Data Sets. Syst Biol 2020; 69:795-812. [PMID: 32011711 PMCID: PMC7302055 DOI: 10.1093/sysbio/syaa008] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 04/01/2019] [Revised: 12/24/2019] [Accepted: 01/02/2020] [Indexed: 11/30/2022] Open
Abstract
A statistical framework of model comparison and model validation is essential to resolving the debates over concatenation and coalescent models in phylogenomic data analysis. A set of statistical tests is here applied and developed to evaluate and compare the adequacy of substitution, concatenation, and multispecies coalescent (MSC) models across 47 phylogenomic data sets collected across the tree of life. Tests for substitution models and the concatenation assumption of topologically congruent gene trees suggest that a poor fit of substitution models, rejected by 44% of loci, and concatenation models, rejected by 38% of loci, is widespread. Logistic regression shows that the proportions of GC content and informative sites are both negatively correlated with the fit of substitution models across loci. Moreover, a substantial violation of the concatenation assumption of congruent gene trees is consistently observed across six major groups (birds, mammals, fish, insects, reptiles, and others, including other invertebrates). In contrast, among those loci adequately described by a given substitution model, the proportion of loci rejecting the MSC model is 11%, significantly lower than those rejecting the substitution and concatenation models. Although conducted on reduced data sets due to computational constraints, Bayesian model validation and comparison both strongly favor the MSC over concatenation across all data sets; the concatenation assumption of congruent gene trees rarely holds for phylogenomic data sets with more than 10 loci. Thus, for large phylogenomic data sets, model comparisons are expected to consistently and more strongly favor the coalescent model over the concatenation model. We also found that loci rejecting the MSC have little effect on species tree estimation. Our study reveals the value of model validation and comparison in phylogenomic data analysis, as well as the need for further improvements of multilocus models and computational tools for phylogenetic inference. [Bayes factor; Bayesian model validation; coalescent prior; congruent gene trees; independent prior; Metazoa; posterior predictive simulation.]
Affiliation(s)
- Xiaodong Jiang
- Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30602, USA
| | - Scott V Edwards
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard, 26 Oxford Street, Cambridge, MA 02138, USA
| | - Liang Liu
- Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30602, USA; Institute of Bioinformatics, University of Georgia, 120 Green Street, Athens, GA 30602, USA
| |
|
23
|
Koenen EJ, Kidner C, de Souza ÉR, Simon MF, Iganci JR, Nicholls JA, Brown GK, de Queiroz LP, Luckow M, Lewis GP, Pennington RT, Hughes CE. Hybrid capture of 964 nuclear genes resolves evolutionary relationships in the mimosoid legumes and reveals the polytomous origins of a large pantropical radiation. AMERICAN JOURNAL OF BOTANY 2020; 107:1710-1735. [PMID: 33253423 PMCID: PMC7839790 DOI: 10.1002/ajb2.1568] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/15/2020] [Accepted: 08/07/2020] [Indexed: 05/05/2023]
Abstract
PREMISE Targeted enrichment methods facilitate sequencing of hundreds of nuclear loci to enhance phylogenetic resolution and elucidate why some parts of the "tree of life" are difficult (if not impossible) to resolve. The mimosoid legumes are a prominent pantropical clade of ~3300 species of woody angiosperms for which previous phylogenies have shown extensive lack of resolution, especially among the species-rich and taxonomically challenging ingoids. METHODS We generated transcriptomes to select low-copy nuclear genes, enrich these via hybrid capture for representative species of most mimosoid genera, and analyze the resulting data using de novo assembly and various phylogenomic tools for species tree inference. We also evaluate gene tree support and conflict for key internodes and use phylogenetic network analysis to investigate phylogenetic signal across the ingoids. RESULTS Our selection of 964 nuclear genes greatly improves phylogenetic resolution across the mimosoid phylogeny and shows that the ingoid clade can be resolved into several well-supported clades. However, nearly all loci show lack of phylogenetic signal for some of the deeper internodes within the ingoids. CONCLUSIONS Lack of resolution in the ingoid clade is most likely the result of hyperfast diversification, potentially causing a hard polytomy of six or seven lineages. The gene set for targeted sequencing presented here offers great potential to further enhance the phylogeny of mimosoids and the wider Caesalpinioideae with denser taxon sampling, to provide a framework for taxonomic reclassification, and to study the ingoid radiation.
Affiliation(s)
- Erik J. M. Koenen
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich, CH-8008, Switzerland
| | - Catherine Kidner
- School of Biological Sciences, University of Edinburgh, King’s Buildings, Mayfield Road, Edinburgh, UK
- Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK
| | - Élvia R. de Souza
- Departamento Ciências Biológicas, Universidade Estadual de Feira de Santana, Avenida Transnordestina s/n, Novo Horizonte, 44036-900, Feira de Santana, Brazil
| | - Marcelo F. Simon
- Embrapa Recursos Genéticos e Biotecnologia, Parque Estação Biológica (PqEB), Avenida W5 norte, 70770-917, Brasília, Brazil
| | - João R. Iganci
- Instituto de Biologia, Universidade Federal de Pelotas, Campus Universitário Capão do Leão, Travessa André Dreyfus s/n, Capão do Leão, 96010-900, Rio Grande do Sul, Brazil
| | - James A. Nicholls
- School of Biological Sciences, University of Edinburgh, King’s Buildings, Mayfield Road, Edinburgh, UK
- Australian National Insect Collection, CSIRO, Clunies Ross St, Acton, ACT 2601, Australia
| | - Gillian K. Brown
- Queensland Herbarium, Brisbane Botanic Gardens, Mount Coot-tha, Mt Coot-tha Road, Toowong, 4066, Queensland, Australia
| | - Luciano P. de Queiroz
- Departamento Ciências Biológicas, Universidade Estadual de Feira de Santana, Avenida Transnordestina s/n, Novo Horizonte, 44036-900, Feira de Santana, Brazil
| | - Melissa Luckow
- L.H. Bailey Hortorium, Department of Plant Biology, Cornell University, 412 Mann Library Building, Ithaca, New York, 14853, USA
| | - Gwilym P. Lewis
- Comparative Plant and Fungal Biology Department, Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
| | - R. Toby Pennington
- Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK
- Geography, University of Exeter, Amory Building, Rennes Drive, Exeter, EX4 4RJ, UK
| | - Colin E. Hughes
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, Zurich, CH-8008, Switzerland
| |
|
24
|
Wang E, Zhang D, Braun MS, Hotz-Wagenblatt A, Pärt T, Arlt D, Schmaljohann H, Bairlein F, Lei F, Wink M. Can Mitogenomes of the Northern Wheatear (Oenanthe oenanthe) Reconstruct Its Phylogeography and Reveal the Origin of Migrant Birds? Sci Rep 2020; 10:9290. [PMID: 32518318 PMCID: PMC7283232 DOI: 10.1038/s41598-020-66287-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/09/2019] [Accepted: 05/15/2020] [Indexed: 11/09/2022] Open
Abstract
The Northern Wheatear (Oenanthe oenanthe, including the nominate and the two subspecies O. o. leucorhoa and O. o. libanotica) and the Seebohm’s Wheatear (Oenanthe seebohmi) are today regarded as two distinct species. Previously, all four taxa were regarded as subspecies of the Northern Wheatear. Their classification has exclusively been based on ecological and morphological traits, while their molecular characterization is still missing. In this study, we used next-generation sequencing to assemble 117 complete mitochondrial genomes covering O. o. oenanthe, O. o. leucorhoa and O. seebohmi. We compared the resolution power of each individual mitochondrial marker and concatenated marker sets to reconstruct the phylogeny and estimate speciation times of the three taxa. Moreover, we tried to identify the origin of migratory wheatears caught on Helgoland (Germany) and on Crete (Greece). Mitogenome analysis revealed two different ancient lineages that separated around 400,000 years ago. Both lineages consisted of a mix of subspecies and species. The phylogenetic trees, as well as haplotype networks, are incongruent with the present morphology-based classification. Mitogenomes could not distinguish these presumed species. The genetic panmixia among present populations and taxa might be the consequence of mitochondrial introgression between ancient wheatear populations.
Affiliation(s)
- Erjia Wang
- Institute of Pharmacy and Molecular Biotechnology, Heidelberg University, Heidelberg, Germany.
| | - Dezhi Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Markus Santhosh Braun
- Institute of Pharmacy and Molecular Biotechnology, Heidelberg University, Heidelberg, Germany
| | - Agnes Hotz-Wagenblatt
- Omics IT and Data Management Core Facility, German Cancer Research Center, Heidelberg University, Heidelberg, Germany
| | - Tomas Pärt
- Department of Ecology, Swedish University of Agricultural Science, Uppsala, Sweden
| | - Debora Arlt
- Department of Ecology, Swedish University of Agricultural Science, Uppsala, Sweden
| | - Heiko Schmaljohann
- Institute of Avian Research "Vogelwarte Helgoland", Wilhelmshaven, Germany; Institute for Biology and Environmental Sciences (IBU), Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
| | - Franz Bairlein
- Institute of Avian Research "Vogelwarte Helgoland", Wilhelmshaven, Germany
| | - Fumin Lei
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Michael Wink
- Institute of Pharmacy and Molecular Biotechnology, Heidelberg University, Heidelberg, Germany.
| |
|
25
|
Larson DA, Walker JF, Vargas OM, Smith SA. A consensus phylogenomic approach highlights paleopolyploid and rapid radiation in the history of Ericales. AMERICAN JOURNAL OF BOTANY 2020; 107:773-789. [PMID: 32350864 DOI: 10.1002/ajb2.1469] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 12/08/2019] [Accepted: 02/12/2020] [Indexed: 05/27/2023]
Abstract
PREMISE Large genomic data sets offer the promise of resolving historically recalcitrant species relationships. However, different methodologies can yield conflicting results, especially when clades have experienced ancient, rapid diversification. Here, we analyzed the ancient radiation of Ericales and explored sources of uncertainty related to species tree inference, conflicting gene tree signal, and the inferred placement of gene and genome duplications. METHODS We used a hierarchical clustering approach, with tree-based homology and orthology detection, to generate six filtered phylogenomic matrices consisting of data from 97 transcriptomes and genomes. Support for species relationships was inferred from multiple lines of evidence including shared gene duplications, gene tree conflict, gene-wise edge-based analyses, concatenation, and coalescent-based methods, and is summarized in a consensus framework. RESULTS Our consensus approach supported a topology largely concordant with previous studies, but suggests that the data are not capable of resolving several ancient relationships because of lack of informative characters, sensitivity to methodology, and extensive gene tree conflict correlated with paleopolyploidy. We found evidence of a whole-genome duplication before the radiation of all or most ericalean families, and demonstrate that tree topology and heterogeneous evolutionary rates affect the inferred placement of genome duplications. CONCLUSIONS We provide several hypotheses regarding the history of Ericales, and confidently resolve most nodes, but demonstrate that a series of ancient divergences are unresolvable with these data. Whether paleopolyploidy is a major source of the observed phylogenetic conflict warrants further investigation.
Affiliation(s)
- Drew A Larson
- Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Joseph F Walker
- Sainsbury Laboratory (SLCU), University of Cambridge, Cambridge, CB2 1LR, UK
| | - Oscar M Vargas
- Department of Ecology & Evolutionary Biology, University of California, Santa Cruz, CA, 95060, USA
| | - Stephen A Smith
- Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
26
|
Musher LJ, Ferreira M, Auerbach AL, McKay J, Cracraft J. Why is Amazonia a 'source' of biodiversity? Climate-mediated dispersal and synchronous speciation across the Andes in an avian group (Tityrinae). Proc Biol Sci 2019; 286:20182343. [PMID: 30940057 DOI: 10.1098/rspb.2018.2343] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/16/2022] Open
Abstract
Amazonia is a 'source' of biodiversity for other Neotropical ecosystems, but which conditions trigger in situ speciation and emigration is contentious. Three hypotheses for how communities have assembled include (1) a stochastic model wherein chance dispersal events lead to gradual emigration and species accumulation, (2) diversity-dependence wherein successful dispersal events decline through time due to ecological limits, and (3) barrier displacement wherein environmental change facilitates dispersal to other biomes via transient habitat corridors. We sequenced thousands of molecular markers for the Neotropical Tityrinae (Aves) and applied a novel filtering protocol to identify loci with high utility for dated phylogenomics. We used these loci to estimate divergence times and model Tityrinae's evolutionary history. We detected a prominent role for speciation driven by barriers including synchronous speciation across the Andes and found that dispersal increased toward the present. Because diversification was continuous but dispersal was non-random over time, we show that barrier displacement better explains Tityrinae's history than stochasticity or diversity-dependence. We propose that Amazonia is a source of biodiversity because (1) it is a relic of a biome that was once more extensive, (2) environmentally mediated corridors facilitated emigration and (3) constant diversification is attributed to a spatially heterogeneous landscape that is perpetually dynamic through time.
Affiliation(s)
- Lukas J Musher
- Department of Ornithology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024, USA; The Richard Gilder Graduate School, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024, USA
| | - Mateus Ferreira
- Programa Pós-Graduação em Genética, Conservação e Biologia Evolutiva, INPA, Manaus, AM, Brazil
| | - Anya L Auerbach
- Department of Biological Sciences Collegiate Division, University of Chicago, 1101 East 57th Street, Chicago, IL 60637, USA
| | - Jessica McKay
- Department of Ornithology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024, USA
| | - Joel Cracraft
- Department of Ornithology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024, USA
| |
|
27
|
Koenen EJM, Ojeda DI, Steeves R, Migliore J, Bakker FT, Wieringa JJ, Kidner C, Hardy OJ, Pennington RT, Bruneau A, Hughes CE. Large-scale genomic sequence data resolve the deepest divergences in the legume phylogeny and support a near-simultaneous evolutionary origin of all six subfamilies. THE NEW PHYTOLOGIST 2020; 225:1355-1369. [PMID: 31665814 PMCID: PMC6972672 DOI: 10.1111/nph.16290] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 06/28/2019] [Accepted: 09/14/2019] [Indexed: 05/02/2023]
Abstract
Phylogenomics is increasingly used to infer deep-branching relationships while revealing the complexity of evolutionary processes such as incomplete lineage sorting, hybridization/introgression and polyploidization. We investigate the deep-branching relationships among subfamilies of the Leguminosae (or Fabaceae), the third largest angiosperm family. Despite their ecological and economic importance, a robust phylogenetic framework for legumes based on genome-scale sequence data is lacking. We generated alignments of 72 chloroplast genes and 7621 homologous nuclear-encoded proteins, for 157 and 76 taxa, respectively. We analysed these with maximum likelihood, Bayesian inference, and a multispecies coalescent summary method, and evaluated support for alternative topologies across gene trees. We resolve the deepest divergences in the legume phylogeny despite lack of phylogenetic signal across all chloroplast genes and the majority of nuclear genes. Strongly supported conflict in the remainder of nuclear genes is suggestive of incomplete lineage sorting. All six subfamilies originated nearly simultaneously, suggesting that the prevailing view of some subfamilies as 'basal' or 'early-diverging' with respect to others should be abandoned, which has important implications for understanding the evolution of legume diversity and traits. Our study highlights the limits of phylogenetic resolution in relation to rapid successive speciation.
Affiliation(s)
- Erik J. M. Koenen
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, CH-8008, Zurich, Switzerland
| | - Dario I. Ojeda
- Service Évolution Biologique et Écologie, Faculté des Sciences, Université Libre de Bruxelles, Avenue Franklin Roosevelt 50, 1050, Brussels, Belgium
- Norwegian Institute of Bioeconomy Research, Høgskoleveien 8, 1433, Ås, Norway
| | - Royce Steeves
- Institut de Recherche en Biologie Végétale and Département de Sciences Biologiques, Université de Montréal, 4101 Sherbrooke St E, Montreal, QC, H1X 2B2, Canada
- Fisheries & Oceans Canada, Gulf Fisheries Center, 343 Université Ave, Moncton, NB, E1C 5K4, Canada
| | - Jérémy Migliore
- Service Évolution Biologique et Écologie, Faculté des Sciences, Université Libre de Bruxelles, Avenue Franklin Roosevelt 50, 1050, Brussels, Belgium
| | - Freek T. Bakker
- Biosystematics Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
| | - Jan J. Wieringa
- Naturalis Biodiversity Center, Leiden, Darwinweg 2, 2333 CR, Leiden, the Netherlands
| | - Catherine Kidner
- Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK
- School of Biological Sciences, University of Edinburgh, King's Buildings, Mayfield Rd, Edinburgh, EH9 3JU, UK
| | - Olivier J. Hardy
- Service Évolution Biologique et Écologie, Faculté des Sciences, Université Libre de Bruxelles, Avenue Franklin Roosevelt 50, 1050, Brussels, Belgium
| | - R. Toby Pennington
- Royal Botanic Gardens Edinburgh, 20a Inverleith Row, Edinburgh, EH3 5LR, UK
- Geography, University of Exeter, Amory Building, Rennes Drive, Exeter, EX4 4RJ, UK
| | - Anne Bruneau
- Institut de Recherche en Biologie Végétale and Département de Sciences Biologiques, Université de Montréal, 4101 Sherbrooke St E, Montreal, QC, H1X 2B2, Canada
| | - Colin E. Hughes
- Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, CH-8008, Zurich, Switzerland
| |
|
28
|
Duckett DJ, Pelletier TA, Carstens BC. Identifying model violations under the multispecies coalescent model using P2C2M.SNAPP. PeerJ 2020; 8:e8271. [PMID: 31949994 PMCID: PMC6956792 DOI: 10.7717/peerj.8271] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/27/2019] [Accepted: 11/22/2019] [Indexed: 11/20/2022] Open
Abstract
Phylogenetic estimation under the multispecies coalescent model (MSCM) assumes all incongruence among loci is caused by incomplete lineage sorting. Therefore, applying the MSCM to datasets that contain incongruence that is caused by other processes, such as gene flow, can lead to biased phylogeny estimates. To identify possible bias when using the MSCM, we present P2C2M.SNAPP. P2C2M.SNAPP is an R package that identifies model violations using posterior predictive simulation. P2C2M.SNAPP uses the posterior distribution of species trees output by the software package SNAPP to simulate posterior predictive datasets under the MSCM, and then uses summary statistics to compare either the empirical data or the posterior distribution to the posterior predictive distribution to identify model violations. In simulation testing, P2C2M.SNAPP correctly classified up to 83% of datasets (depending on the summary statistic used) as to whether or not they violated the MSCM model. P2C2M.SNAPP represents a user-friendly way for researchers to perform posterior predictive model checks when using the popular SNAPP phylogenetic estimation program. It is freely available as an R package, along with additional program details and tutorials.
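The comparison step described in this abstract (an empirical summary statistic set against its posterior predictive distribution) can be illustrated with a minimal Python sketch. This is not the P2C2M.SNAPP interface, which is an R package; the function names `posterior_predictive_p_value` and `violates_model`, the two-sided tail rule, and the 0.05 threshold are all assumptions chosen for illustration, and the simulation of data sets under the MSCM is taken as already done.

```python
"""Illustrative posterior predictive check; all names here are hypothetical."""
from typing import Sequence


def posterior_predictive_p_value(observed: float, simulated: Sequence[float]) -> float:
    """Two-sided tail probability of the observed summary statistic
    relative to statistics computed on posterior predictive data sets."""
    if not simulated:
        raise ValueError("need at least one simulated statistic")
    n = len(simulated)
    upper = sum(s >= observed for s in simulated) / n  # share at or above observed
    lower = sum(s <= observed for s in simulated) / n  # share at or below observed
    return min(1.0, 2.0 * min(upper, lower))


def violates_model(observed: float, simulated: Sequence[float], alpha: float = 0.05) -> bool:
    """Flag a model violation when the observed statistic sits in the tails.
    The threshold alpha is an assumed convention, not taken from the paper."""
    return posterior_predictive_p_value(observed, simulated) < alpha


if __name__ == "__main__":
    import random

    rng = random.Random(1)
    # Stand-in for statistics computed on data simulated under the model.
    simulated = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    print(violates_model(0.1, simulated))  # observed value near the bulk
    print(violates_model(6.0, simulated))  # observed value far in the tail
```

In practice the choice of summary statistic matters a great deal; the abstract notes that classification accuracy depended on the statistic used.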
Affiliation(s)
- Drew J Duckett
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
| | | | - Bryan C Carstens
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
| |
|
29
|
Li G, Figueiró HV, Eizirik E, Murphy WJ. Recombination-Aware Phylogenomics Reveals the Structured Genomic Landscape of Hybridizing Cat Species. Mol Biol Evol 2019; 36:2111-2126. [PMID: 31198971 PMCID: PMC6759079 DOI: 10.1093/molbev/msz139] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Indexed: 12/16/2022] Open
Abstract
Current phylogenomic approaches implicitly assume that the predominant phylogenetic signal within a genome reflects the true evolutionary history of organisms, without assessing the confounding effects of postspeciation gene flow that can produce a mosaic of phylogenetic signals that interact with recombinational variation. Here, we tested the validity of this assumption with a phylogenomic analysis of 27 species of the cat family, assessing local effects of recombination rate on species tree inference and divergence time estimation across their genomes. We found that the prevailing phylogenetic signal within the autosomes is not always representative of the most probable speciation history, due to ancient hybridization throughout felid evolution. Instead, phylogenetic signal was concentrated within regions of low recombination, and notably enriched within large X chromosome recombination cold spots that exhibited recurrent patterns of strong genetic differentiation and selective sweeps across mammalian orders. By contrast, regions of high recombination were enriched for signatures of ancient gene flow, and these sequences inflated crown-lineage divergence times by ∼40%. We conclude that existing phylogenomic approaches to infer the Tree of Life may be highly misleading without considering the genomic architecture of phylogenetic signal relative to recombination rate and its interplay with historical hybridization.
Affiliation(s)
- Gang Li
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX
| | - Henrique V Figueiró
- PUCRS, Escola de Ciências, Laboratory of Genomics and Molecular Biology, Porto Alegre, Brazil; INCT-EECBio, Brazil
| | - Eduardo Eizirik
- PUCRS, Escola de Ciências, Laboratory of Genomics and Molecular Biology, Porto Alegre, Brazil; INCT-EECBio, Brazil
| | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX
| |
|
30
|
Walker JF, Walker-Hale N, Vargas OM, Larson DA, Stull GW. Characterizing gene tree conflict in plastome-inferred phylogenies. PeerJ 2019; 7:e7747. [PMID: 31579615 PMCID: PMC6764362 DOI: 10.7717/peerj.7747] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 05/20/2019] [Accepted: 08/25/2019] [Indexed: 11/20/2022] Open
Abstract
Evolutionary relationships among plants have been inferred primarily using chloroplast data. To date, no study has comprehensively examined the plastome for gene tree conflict. Using a broad sampling of angiosperm plastomes, we characterize gene tree conflict among plastid genes at various time scales and explore correlates to conflict (e.g., evolutionary rate, gene length, molecule type). We uncover notable gene tree conflict against a backdrop of largely uninformative genes. We find alignment length and tree length are strong predictors of concordance, and that nucleotides outperform amino acids. Of the most commonly used markers, matK greatly outperforms rbcL; however, the rarely used gene rpoC2 is the top-performing gene in every analysis. We find that rpoC2 reconstructs angiosperm phylogeny as well as the entire concatenated set of protein-coding chloroplast genes. Our results suggest that longer genes are superior for phylogeny reconstruction. The alleviation of some conflict through the use of nucleotides suggests that stochastic and systematic error is likely the root of most of the observed conflict, but further research on biological conflict within the plastome is warranted given documented cases of heteroplasmic recombination. We suggest that researchers should filter genes for topological concordance when performing downstream comparative analyses on phylogenetic data, even when using chloroplast genomes.
Affiliation(s)
- Joseph F. Walker
- Sainsbury Laboratory (SLCU), University of Cambridge, Cambridge, United Kingdom
| | - Nathanael Walker-Hale
- Department of Plant Sciences, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom
| | - Oscar M. Vargas
- University of California, Santa Cruz, Santa Cruz, United States of America
| | - Drew A. Larson
- University of Michigan—Ann Arbor, Ann Arbor, MI, United States of America
| | - Gregory W. Stull
- Department of Botany, Smithsonian Institution, Washington, United States of America
| |
|
31
|
Zhang D, Zou H, Hua CJ, Li WX, Mahboob S, Al-Ghanim KA, Al-Misned F, Jakovlić I, Wang GT. Mitochondrial Architecture Rearrangements Produce Asymmetrical Nonadaptive Mutational Pressures That Subvert the Phylogenetic Reconstruction in Isopoda. Genome Biol Evol 2019; 11:1797-1812. [PMID: 31192351 PMCID: PMC6601869 DOI: 10.1093/gbe/evz121] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Accepted: 06/08/2019] [Indexed: 01/04/2023] Open
Abstract
The phylogeny of Isopoda, a speciose order of crustaceans, remains unresolved, with different data sets (morphological, nuclear, mitochondrial) often producing starkly incongruent phylogenetic hypotheses. We hypothesized that extreme diversity in their life histories might be causing compositional heterogeneity/heterotachy in their mitochondrial genomes, and compromising the phylogenetic reconstruction. We tested the effects of different data sets (mitochondrial, nuclear, nucleotides, amino acids, concatenated genes, individual genes, gene orders), phylogenetic algorithms (assuming data homogeneity, heterogeneity, and heterotachy), and partitioning; and found that almost all of them produced unique topologies. As we also found that mitogenomes of Asellota and two Cymothoida families (Cymothoidae and Corallanidae) possess inverted base (GC) skew patterns in comparison to other isopods, we concluded that inverted skews cause long-branch attraction phylogenetic artifacts between these taxa. These asymmetrical skews are most likely driven by multiple independent inversions of origin of replication (i.e., nonadaptive mutational pressures). Although the PhyloBayes CAT-GTR algorithm managed to attenuate some of these artifacts (and outperform partitioning), mitochondrial data have limited applicability for reconstructing the phylogeny of Isopoda. Regardless of this, our analyses allowed us to propose solutions to some unresolved phylogenetic debates, and support Asellota as the most likely candidate for the basal isopod branch. As our findings show that architectural rearrangements might produce major compositional biases even on relatively short evolutionary timescales, the implications are that proving the suitability of data via composition skew analyses should be a prerequisite for every study that aims to use mitochondrial data for phylogenetic reconstruction, even among closely related taxa.
Affiliation(s)
- Dong Zhang
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
  - University of Chinese Academy of Sciences, Beijing, P.R. China
- Hong Zou
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
- Cong-Jie Hua
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
- Wen-Xiang Li
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
- Shahid Mahboob
  - Department of Zoology, College of Science, King Saud University, Riyadh, Saudi Arabia
  - Department of Zoology, GC University, Faisalabad, Pakistan
- Fahad Al-Misned
  - Department of Zoology, College of Science, King Saud University, Riyadh, Saudi Arabia
- Gui-Tang Wang
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, P.R. China
|
32
|
Mendes FK, Livera AP, Hahn MW. The perils of intralocus recombination for inferences of molecular convergence. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180244. [PMID: 31154973 DOI: 10.1098/rstb.2018.0244] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 01/10/2023] Open
Abstract
Accurate inferences of convergence require that the appropriate tree topology be used. If there is a mismatch between the tree a trait has evolved along and the tree used for analysis, then false inferences of convergence ('hemiplasy') can occur. To avoid problems of hemiplasy when there are high levels of gene tree discordance with the species tree, researchers have begun to construct tree topologies from individual loci. However, due to intralocus recombination, even locus-specific trees may contain multiple topologies within them. This implies that the use of individual tree topologies discordant with the species tree can still lead to incorrect inferences about molecular convergence. Here, we examine the frequency with which single exons and single protein-coding genes contain multiple underlying tree topologies, in primates and Drosophila, and quantify the effects of hemiplasy when using trees inferred from individual loci. In both clades, we find that there are most often multiple diagnosable topologies within single exons and whole genes, with 91% of Drosophila protein-coding genes containing multiple topologies. Because of this underlying topological heterogeneity, even using trees inferred from individual protein-coding genes results in 25% and 38% of substitutions falsely labelled as convergent in primates and Drosophila, respectively. While constructing local trees can reduce the problem of hemiplasy, our results suggest that it will be difficult to completely avoid false inferences of convergence. We conclude by suggesting several ways forward in the analysis of convergent evolution, for both molecular and morphological characters. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Fábio K Mendes
  - Department of Computer Science, The University of Auckland, Auckland 1010, New Zealand
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Andrew P Livera
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
33
|
Gonçalves DJP, Simpson BB, Ortiz EM, Shimizu GH, Jansen RK. Incongruence between gene trees and species trees and phylogenetic signal variation in plastid genes. Mol Phylogenet Evol 2019; 138:219-232. [PMID: 31146023 DOI: 10.1016/j.ympev.2019.05.022] [Citation(s) in RCA: 83] [Impact Index Per Article: 13.8] [Received: 10/30/2018] [Revised: 05/20/2019] [Accepted: 05/21/2019] [Indexed: 10/26/2022]
Abstract
The current classification of angiosperms is based primarily on concatenated plastid markers and maximum likelihood (ML) inference. This approach has been justified by the assumption that plastid DNA (ptDNA) is inherited as a single locus and that its individual genes produce congruent trees. However, structural and functional characteristics of ptDNA suggest that plastid genes may not evolve as a single locus and are experiencing different evolutionary forces. To examine this idea, we produced new complete plastid genome (plastome) sequences of 27 species and combined these data with publicly available sequences to produce a final dataset that includes 78 plastid genes for 89 species of rosids and five outgroups. We used four data matrices (i.e., gene, exon, codon-aligned, and amino acid) to infer species and gene trees using ML and multispecies coalescent (MSC) methods. Rosids include about one third of all angiosperms and their two major clades, fabids and malvids, were recovered in almost all analyses. However, we detected incongruence between species trees inferred with different matrices and methods and previously published plastid and nuclear phylogenies. We visualized and tested the significance of incongruence between gene trees and species trees. We then measured the distribution of phylogenetic signal across sites and genes supporting alternative placements of five controversial nodes at different taxonomic levels. Gene trees inferred with plastid data often disagree with species trees inferred using both ML (with unpartitioned or partitioned data) and MSC. Species trees inferred with both methods produced alternative topologies for a few taxa. Our results show that, in a phylogenetic context, plastid protein-coding genes may not be fully linked and may not behave as a single locus. Furthermore, concatenated matrices may produce highly supported phylogenies that are discordant with individual gene trees. We also show that phylogenies inferred with MSC are accurate. We therefore emphasize the importance of considering variation in phylogenetic signal across plastid genes and the exploration of plastome data to increase accuracy of estimating relationships. We also support the use of MSC with plastome matrices in future phylogenomic investigations.
Affiliation(s)
- Deise J P Gonçalves
  - Department of Integrative Biology, The University of Texas at Austin, 2415 Speedway #C0930, Austin, TX 78713, USA
- Beryl B Simpson
  - Department of Integrative Biology, The University of Texas at Austin, 2415 Speedway #C0930, Austin, TX 78713, USA
- Edgardo M Ortiz
  - Department of Integrative Biology, The University of Texas at Austin, 2415 Speedway #C0930, Austin, TX 78713, USA
  - Department of Ecology & Ecosystem Management, Plant Biodiversity Research, Technical University of Munich, Emil-Ramann Strasse 2, Freising D-85354, Germany
- Gustavo H Shimizu
  - Department of Plant Biology, University of Campinas, 13083-970 Campinas, SP, Brazil
- Robert K Jansen
  - Department of Integrative Biology, The University of Texas at Austin, 2415 Speedway #C0930, Austin, TX 78713, USA
  - Genomics and Biotechnology Research Group, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
|
34
|
Simmons MP, Sloan DB, Springer MS, Gatesy J. Gene-wise resampling outperforms site-wise resampling in phylogenetic coalescence analyses. Mol Phylogenet Evol 2019; 131:80-92. [DOI: 10.1016/j.ympev.2018.10.001] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 03/08/2018] [Accepted: 10/01/2018] [Indexed: 01/15/2023]
|
35
|
Walker JF, Walker-Hale N, Vargas OM, Larson DA, Stull GW. Characterizing gene tree conflict in plastome-inferred phylogenies. PeerJ 2019. [PMID: 31579615 DOI: 10.1101/512079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023] Open
Abstract
Evolutionary relationships among plants have been inferred primarily using chloroplast data. To date, no study has comprehensively examined the plastome for gene tree conflict. Using a broad sampling of angiosperm plastomes, we characterize gene tree conflict among plastid genes at various time scales and explore correlates to conflict (e.g., evolutionary rate, gene length, molecule type). We uncover notable gene tree conflict against a backdrop of largely uninformative genes. We find alignment length and tree length are strong predictors of concordance, and that nucleotides outperform amino acids. Of the most commonly used markers, matK greatly outperforms rbcL; however, the rarely used gene rpoC2 is the top-performing gene in every analysis. We find that rpoC2 reconstructs angiosperm phylogeny as well as the entire concatenated set of protein-coding chloroplast genes. Our results suggest that longer genes are superior for phylogeny reconstruction. The alleviation of some conflict through the use of nucleotides suggests that stochastic and systematic error is likely the root of most of the observed conflict, but further research on biological conflict within the plastome is warranted given documented cases of heteroplasmic recombination. We suggest that researchers should filter genes for topological concordance when performing downstream comparative analyses on phylogenetic data, even when using chloroplast genomes.
Affiliation(s)
- Joseph F Walker
  - Sainsbury Laboratory (SLCU), University of Cambridge, Cambridge, United Kingdom
- Nathanael Walker-Hale
  - Department of Plant Sciences, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom
- Oscar M Vargas
  - University of California, Santa Cruz, Santa Cruz, United States of America
- Drew A Larson
  - University of Michigan-Ann Arbor, Ann Arbor, MI, United States of America
- Gregory W Stull
  - Department of Botany, Smithsonian Institution, Washington, United States of America
|
36
|
What are the roles of taxon sampling and model fit in tests of cyto-nuclear discordance using avian mitogenomic data? Mol Phylogenet Evol 2019; 130:132-142. [DOI: 10.1016/j.ympev.2018.10.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 04/02/2018] [Revised: 09/11/2018] [Accepted: 10/09/2018] [Indexed: 11/23/2022]
|
37
|
Brown JM, Thomson RC. Evaluating Model Performance in Evolutionary Biology. Annual Review of Ecology, Evolution, and Systematics 2018. [DOI: 10.1146/annurev-ecolsys-110617-062249] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 11/09/2022]
Abstract
Many fields of evolutionary biology now depend on stochastic mathematical models. These models are valuable for their ability to formalize predictions in the face of uncertainty and provide a quantitative framework for testing hypotheses. However, no mathematical model will fully capture biological complexity. Instead, these models attempt to capture the important features of biological systems using relatively simple mathematical principles. These simplifications can allow us to focus on differences that are meaningful, while ignoring those that are not. However, simplification also requires assumptions, and to the extent that these are wrong, so is our ability to predict or compare. Here, we discuss approaches for evaluating the performance of evolutionary models in light of their assumptions by comparing them against reality. We highlight general approaches, how they are applied, and remaining opportunities. Absolute tests of fit, even when not explicitly framed as such, are fundamental to progress in understanding evolution.
Affiliation(s)
- Jeremy M. Brown
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana 70803, USA
- Robert C. Thomson
  - Department of Biology, University of Hawai'i, Honolulu, Hawai'i 96822, USA
|
38
|
Vinnikov KA, Thomson RC, Munroe TA. Revised classification of the righteye flounders (Teleostei: Pleuronectidae) based on multilocus phylogeny with complete taxon sampling. Mol Phylogenet Evol 2018. [PMID: 29535031 DOI: 10.1016/j.ympev.2018.03.014] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/29/2022]
Abstract
Members of the family Pleuronectidae are common representatives of the marine benthic fauna inhabiting northern regions of the Atlantic and Pacific oceans. The most recent comprehensive classification of the family, based entirely on morphological synapomorphies, recognized five subfamilies, 23 genera, and 61 extant species. However, several subsequent molecular studies have shown that many synapomorphic characters discovered in the morphological study might represent homoplasies, thereby questioning the reliance on these characters with the warning that they may provide misleading information for testing other morphology-based evolutionary hypotheses. In the present study, we propose a comprehensive taxonomic reassessment of the family Pleuronectidae based on the molecular phylogeny reconstructed from four nuclear and three mitochondrial loci and represented by complete taxon sampling of all but one valid species currently assigned to this family. To check for robustness of the phylogenetic hypothesis, we analyzed the effect of base compositional heterogeneity on phylogenetic signal for each locus and compared six different gene partitioning schemes. The final dataset, comprising 14 partitions and 154 individuals, was used to reconstruct phylogenetic trees in RAxML, MrBayes and BEAST2. Alternative topologies for several questionable nodes were compared using Bayes factors. The topology with the highest marginal likelihood was selected as the final phylogenetic tree for inferring pleuronectid relationships and character evolution. Based on our results, we recognize the Pleuronectidae as comprising five subfamilies, 24 genera and 59 species. Our new phylogeny comprises five major monophyletic groups within the family, which we define as the subfamilies within the family: Atheresthinae, Pleuronichthyinae, Microstominae, Hippoglossinae and Pleuronectinae. Taxonomic composition of most of these subfamilies is different from that proposed in previous classifications. We also re-assess hypotheses proposed in earlier studies regarding intra-relationships of species of each lineage. Results of the current study contribute to better understanding of the evolutionary relationships of pleuronectid flatfishes based on molecular evidence, and they also provide the framework towards future comprehensive morphological revision of constituent lineages within the family Pleuronectidae.
Affiliation(s)
- Kirill A Vinnikov
  - Department of Biology, University of Hawai'i at Mānoa, Honolulu, HI 96822, USA
  - Department of Marine Biodiversity and Bioresources, Far Eastern Federal University, Vladivostok 690091, Russia
- Robert C Thomson
  - Department of Biology, University of Hawai'i at Mānoa, Honolulu, HI 96822, USA
- Thomas A Munroe
  - National Systematics Laboratory, NOAA's National Marine Fisheries Service, Office of Science and Technology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013, USA
|