1
|
Tay JH, Baele G, Duchene S. Detecting Episodic Evolution through Bayesian Inference of Molecular Clock Models. Mol Biol Evol 2023; 40:msad212. [PMID: 37738550 PMCID: PMC10560005 DOI: 10.1093/molbev/msad212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Revised: 09/13/2023] [Accepted: 09/20/2023] [Indexed: 09/24/2023]
Abstract
Molecular evolutionary rate variation is a key aspect of the evolution of many organisms that can be modeled using molecular clock models. For example, fixed local clocks revealed the role of episodic evolution in the emergence of SARS-CoV-2 variants of concern. As with all statistical models, however, the reliability of such inferences is contingent on an assessment of statistical evidence. We present a novel Bayesian phylogenetic approach for detecting episodic evolution. It consists of computing Bayes factors, as the ratio of posterior and prior odds of evolutionary rate increases, effectively quantifying support for the effect size. We conducted an extensive simulation study to illustrate the power of this method and benchmarked it against formal model comparison of a range of molecular clock models using (log) marginal likelihood estimation, and against inference under a random local clock model. Quantifying support for the effect size has higher sensitivity than formal model testing and is straightforward to compute, because it only needs samples from the posterior and prior distributions. However, formal model testing has the advantage of accommodating a wide range of molecular clock models. We also assessed the ability of an automated approach, known as the random local clock, where branches under episodic evolution may be detected without their a priori definition. In an empirical analysis of a data set of SARS-CoV-2 genomes, we find "very strong" evidence for episodic evolution. Our results provide guidelines and practical methods for Bayesian detection of episodic evolution, as well as avenues for further research into this phenomenon.
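The Bayes factor described in this abstract, the ratio of posterior to prior odds of a rate increase, can be sketched from two sets of samples. The following is a minimal illustration and not the authors' implementation: the function names, the paired foreground/background rate samples, and the +0.5 continuity correction (used so the odds stay finite when every sample agrees) are all assumptions made for this example.

```python
def odds_of_increase(foreground, background):
    """Odds that the foreground rate exceeds the background rate across paired samples."""
    if not foreground or len(foreground) != len(background):
        raise ValueError("need two non-empty sample lists of equal length")
    hits = sum(f > b for f, b in zip(foreground, background))
    n = len(foreground)
    # Assumption: a +0.5 continuity correction keeps the odds finite when
    # every (or no) sample shows an increase.
    p = (hits + 0.5) / (n + 1.0)
    return p / (1.0 - p)


def bayes_factor(post_fg, post_bg, prior_fg, prior_bg):
    """Bayes factor = posterior odds / prior odds of a rate increase."""
    return odds_of_increase(post_fg, post_bg) / odds_of_increase(prior_fg, prior_bg)


# Toy samples, invented for illustration only (not taken from any analysis).
post_fg = [2.0] * 90 + [0.5] * 10   # 90% of posterior samples show an increase
post_bg = [1.0] * 100
prior_fg = [2.0] * 50 + [0.5] * 50  # 50% of prior samples show an increase
prior_bg = [1.0] * 100

bf = bayes_factor(post_fg, post_bg, prior_fg, prior_bg)
print(f"Bayes factor for a rate increase: {bf:.2f}")
```

In practice the samples would come from a prior-only and a posterior run of the same clock model; only the proportion of samples showing an increase enters the calculation, which is why the approach needs no marginal likelihood estimate.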
Affiliation(s)
- John H Tay
  - Peter Doherty Institute for Infection and Immunity, Department of Microbiology and Immunology, University of Melbourne, Melbourne, Victoria, Australia
- Guy Baele
  - Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
- Sebastian Duchene
  - Peter Doherty Institute for Infection and Immunity, Department of Microbiology and Immunology, University of Melbourne, Melbourne, Victoria, Australia
|
2
|
Wu C, Paradis NJ, Lakernick PM, Hryb M. L-shaped distribution of the relative substitution rate (c/μ) observed for SARS-COV-2's genome, inconsistent with the selectionist theory, the neutral theory and the nearly neutral theory but a near-neutral balanced selection theory: Implication on "neutralist-selectionist" debate. Comput Biol Med 2023; 153:106522. [PMID: 36638615 PMCID: PMC9814386 DOI: 10.1016/j.compbiomed.2022.106522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2022] [Revised: 12/17/2022] [Accepted: 12/31/2022] [Indexed: 01/07/2023]
Abstract
The genomic substitution rate (GSR) of SARS-CoV-2 exhibits a molecular clock feature and does not change under fluctuating environmental factors such as the infected human population (10⁰-10⁷), vaccination etc. The molecular clock feature is believed to be inconsistent with the selectionist theory (ST). The GSR shows a lack of dependence on the effective population size, suggesting Ohta's nearly neutral theory (ONNT) is not applicable to this virus. Large variation of the substitution rate within its genome is also inconsistent with Kimura's neutral theory (KNT). Thus, all three existing evolution theories fail to explain the evolutionary nature of this virus. In this paper, we propose a Segment Substitution Rate Model (SSRM) under non-neutral selections and point out that a balanced mechanism between negative and positive selection of some segments could also lead to the molecular clock feature. We name this hybrid mechanism the near-neutral balanced selection theory (NNBST) and examine whether SARS-CoV-2 follows it, using the three independent sets of SARS-CoV-2 genomes selected by the Nextstrain team. Intriguingly, the relative substitution rate of this virus exhibited an L-shaped probability distribution consistent with NNBST rather than the Poisson distribution predicted by KNT, the asymmetric distribution predicted by ONNT (in which nearly neutral sites are believed to be slightly deleterious only), or the distribution lacking nearly neutral sites predicted by ST. The time-dependence of the substitution rates for some segments and their correlation with vaccination were observed, supporting NNBST. Our relative substitution rate method provides a tool to resolve the long-standing "neutralist-selectionist" controversy. Implications of NNBST for resolving Lewontin's Paradox are also discussed.
Affiliation(s)
- Chun Wu
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA; Department of Biological & Biomedical Sciences, Rowan University, Glassboro, NJ, 08028, USA
- Nicholas J Paradis
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA
- Phillip M Lakernick
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA
- Mariya Hryb
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA
|
3
|
Cornuault J, Sanmartín I. A road map for phylogenetic models of species trees. Mol Phylogenet Evol 2022; 173:107483. [DOI: 10.1016/j.ympev.2022.107483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2021] [Revised: 03/09/2022] [Accepted: 04/05/2022] [Indexed: 10/18/2022]
|
4
|
Evolutionary Shift from Purifying Selection towards Divergent Selection of SARS-CoV2 Favors its Invasion into Multiple Human Organs. Virus Res 2022; 313:198712. [PMID: 35176330 PMCID: PMC8843322 DOI: 10.1016/j.virusres.2022.198712] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/15/2021] [Revised: 02/10/2022] [Accepted: 02/13/2022] [Indexed: 01/07/2023]
Abstract
SARS-CoV2 is believed to have originated from a closely related bat coronavirus RaTG13 lineage and uses key entry-point residues in its S1 protein to attach to the human ACE2 receptor. SARS-CoV2 could have entered humans from bats either long before its known appearance, with poorly developed entry-point residues and a slower mutation rate, or recently, with efficiently developed entry-point residues and a higher mutation rate, or through an intermediate host. Temporal analysis of the SARS-CoV2 genome shows that its nucleotide substitution rate is as low as 27 nt/year, with an evolutionary rate of 9×10⁻⁴/site/year, which is well within the range of other RNA viruses (10⁻⁴ to 10⁻⁶/site/year). The TMRCA of SARS-CoV2 from the bat RaTG13 lineage appears to be between 9 and 14 years. Evolution of the critical entry-point residue Y493Q requires two substitutions via an intermediate virus carrying Y493H (Y>H>Q), but such an intermediate has not been identified in the twenty-nine known bat CoVs. Genetic codon analysis indicates that SARS-CoV2 evolution during propagation in humans departs from neutral evolution, as nonsynonymous mutations surpass synonymous mutations with increasing ω (dn/ds). Taken together, the genetic data suggest that SARS-CoV2 originated long before its appearance in humans in 2019. The increase of ω signifies that SARS-CoV2 evolution is shifting from purifying selection towards diversifying selection, predictably favoring its ability to invade multiple human organs.
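The two rates quoted in this abstract are linked by the genome length, and ω is a simple ratio, so both can be checked with a few lines of arithmetic. This is an illustrative sketch only: the genome length of 29,903 nt (the Wuhan-Hu-1 reference, NC_045512.2) is an assumption, since the abstract does not state which length was used, and the function names and the tolerance around ω = 1 are hypothetical.

```python
# Assumption: length of the SARS-CoV-2 reference genome (Wuhan-Hu-1, NC_045512.2).
GENOME_LENGTH_NT = 29_903


def substitutions_per_year(rate_per_site_per_year, genome_length=GENOME_LENGTH_NT):
    """Convert a per-site rate into expected substitutions per genome per year."""
    return rate_per_site_per_year * genome_length


def omega(dn, ds):
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive")
    return dn / ds


def selection_regime(w, tolerance=0.05):
    """Conventional reading of omega; the tolerance band is a hypothetical choice."""
    if w < 1.0 - tolerance:
        return "purifying"
    if w > 1.0 + tolerance:
        return "diversifying"
    return "approximately neutral"


# 9e-4 substitutions/site/year over ~29.9 kb is about 26.9 nt/year,
# consistent with the ~27 nt/year quoted in the abstract.
print(round(substitutions_per_year(9e-4), 1))
```

An ω that rises over time, as the abstract reports, would move a lineage from the "purifying" label towards "diversifying" under this reading.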
|
5
|
May MR, Contreras DL, Sundue MA, Nagalingum NS, Looy CV, Rothfels CJ. Inferring the Total-Evidence Timescale of Marattialean Fern Evolution in the Face of Model Sensitivity. Syst Biol 2021; 70:1232-1255. [PMID: 33760075 PMCID: PMC8513765 DOI: 10.1093/sysbio/syab020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/29/2020] [Revised: 03/09/2021] [Accepted: 03/22/2021] [Indexed: 11/24/2022]
Abstract
Phylogenetic divergence-time estimation has been revolutionized by two recent developments: 1) total-evidence dating (or "tip-dating") approaches that allow for the incorporation of fossils as tips in the analysis, with their phylogenetic and temporal relationships to the extant taxa inferred from the data and 2) the fossilized birth-death (FBD) class of tree models that capture the processes that produce the tree (speciation, extinction, and fossilization) and thus provide a coherent and biologically interpretable tree prior. To explore the behavior of these methods, we apply them to marattialean ferns, a group that was dominant in Carboniferous landscapes prior to declining to its modest extant diversity of slightly over 100 species. We show that tree models have a dramatic influence on estimates of both divergence times and topological relationships. This influence is driven by the strong, counter-intuitive informativeness of the uniform tree prior, and the inherent nonidentifiability of divergence-time models. In contrast to the strong influence of the tree models, we find minor effects of differing the morphological transition model or the morphological clock model. We compare the performance of a large pool of candidate models using a combination of posterior-predictive simulation and Bayes factors. Notably, an FBD model with epoch-specific speciation and extinction rates was strongly favored by Bayes factors. Our best-fitting model infers stem and crown divergences for the Marattiales in the mid-Devonian and Late Cretaceous, respectively, with elevated speciation rates in the Mississippian and elevated extinction rates in the Cisuralian leading to a peak diversity of ~2800 species at the end of the Carboniferous, representing the heyday of the Psaroniaceae.
This peak is followed by the rapid decline and ultimate extinction of the Psaroniaceae, with their descendants, the Marattiaceae, persisting at approximately stable levels of diversity until the present. This general diversification pattern appears to be insensitive to potential biases in the fossil record; despite the preponderance of available fossils being from Pennsylvanian coal balls, incorporating fossilization-rate variation does not improve model fit. In addition, by incorporating temporal data directly within the model and allowing for the inference of the phylogenetic position of the fossils, our study makes the surprising inference that the clade of extant Marattiales is relatively young, younger than any of the fossils historically thought to be congeneric with extant species. This result is a dramatic demonstration of the dangers of node-based approaches to divergence-time estimation, where the assignment of fossils to particular clades is made a priori (earlier node-based studies that constrained the minimum ages of extant genera based on these fossils resulted in much older age estimates than in our study) and of the utility of explicit models of morphological evolution and lineage diversification. [Bayesian model comparison; Carboniferous; divergence-time estimation; fossil record; fossilized birth-death; lineage diversification; Marattiales; models of morphological evolution; Psaronius; RevBayes.].
Affiliation(s)
- Michael R May
  - Department of Integrative Biology, University of California, Berkeley, 3040 Valley Life Sciences Building #3140, Berkeley, CA 94720, USA
  - University Herbarium, University of California, Berkeley, 1001 Valley Life Sciences Building #2465, Berkeley, CA 94720, USA
- Dori L Contreras
  - Department of Paleontology, Perot Museum of Nature and Science, 2201 N. Field Street, Dallas, TX 75201, USA
- Michael A Sundue
  - Department of Plant Biology, University of Vermont, 111 Jeffords Hall, 63 Carrigan Drive, Burlington, VT 05405, USA
  - The Pringle Herbarium, University of Vermont, 305 Jeffords Hall, 63 Carrigan Drive, Burlington, VT 05405, USA
- Nathalie S Nagalingum
  - Department of Botany, California Academy of Sciences, Golden Gate Park, 55 Music Concourse Drive, San Francisco, CA 94118, USA
- Cindy V Looy
  - Department of Integrative Biology, University of California, Berkeley, 3040 Valley Life Sciences Building #3140, Berkeley, CA 94720, USA
  - University Herbarium, University of California, Berkeley, 1001 Valley Life Sciences Building #2465, Berkeley, CA 94720, USA
  - Museum of Paleontology, University of California, 1101 Valley Life Sciences Building, Berkeley, CA 94720, USA
- Carl J Rothfels
  - Department of Integrative Biology, University of California, Berkeley, 3040 Valley Life Sciences Building #3140, Berkeley, CA 94720, USA
  - University Herbarium, University of California, Berkeley, 1001 Valley Life Sciences Building #2465, Berkeley, CA 94720, USA
|
6
|
Accounting for the Biological Complexity of Pathogenic Fungi in Phylogenetic Dating. J Fungi (Basel) 2021; 7:jof7080661. [PMID: 34436200 PMCID: PMC8400180 DOI: 10.3390/jof7080661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2021] [Revised: 08/11/2021] [Accepted: 08/11/2021] [Indexed: 11/17/2022]
Abstract
In the study of pathogen evolution, temporal dating of phylogenies provides information on when species and lineages may have diverged in the past. When combined with spatial and epidemiological data in phylodynamic models, these dated phylogenies can also help infer where and when outbreaks occurred, how pathogens may have spread to new geographic locations and/or niches, and how virulence or drug resistance has developed over time. Although widely applied to viruses and, increasingly, to bacterial pathogen outbreaks, phylogenetic dating is yet to be widely used in the study of pathogenic fungi. Fungi are complex organisms with several biological processes that could present issues with appropriate inference of phylogenies, clock rates, and divergence times, including high levels of recombination and slower mutation rates although with potentially high levels of mutation rate variation. Here, we discuss some of the key methodological challenges in accurate phylogeny reconstruction for fungi in the context of the temporal analyses conducted to date and make recommendations for future dating studies to aid development of a best practices roadmap in light of the increasing threat of fungal outbreaks and antifungal drug resistance worldwide.
|
7
|
Partial Diffusion Markov Model of Heterogeneous TCP Link: Optimization with Incomplete Information. Mathematics 2021. [DOI: 10.3390/math9141632] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
The paper presents a new mathematical model of TCP (Transmission Control Protocol) link functioning in a heterogeneous (wired/wireless) channel. It represents a controllable, partially observable stochastic dynamic system. The system state describes the status of the modeled TCP link and expresses it via an unobservable controllable MJP (Markov jump process) with finite-state space. Observations are formed by low-frequency counting processes of packet losses and timeouts and a high-frequency compound Poisson process of packet acknowledgments. The information transmission through the TCP-equipped channel is considered a stochastic control problem with incomplete information. The main idea to solve it is to impose the separation principle on the problem. The paper proposes a mathematical framework and algorithmic support to implement the solution. It includes a solution to the stochastic control problem with complete information, a diffusion approximation of the high-frequency observations, a solution to the MJP state filtering problem given the observations with multiplicative noises, and a numerical scheme of the filtering algorithm. The paper also contains the results of a comparative study of the proposed state-based congestion control algorithm with the contemporary TCP versions: Illinois, CUBIC, Compound, and BBR (Bottleneck Bandwidth and RTT).
|
8
|
Lucena DAA, Almeida EAB. Morphology and Bayesian tip-dating recover deep Cretaceous-age divergences among major chrysidid lineages (Hymenoptera: Chrysididae). Zool J Linn Soc 2021. [DOI: 10.1093/zoolinnean/zlab010] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
Abstract
We integrated phylogenetic, biogeographic and palaeontological data to reconstruct the evolutionary history of the cuckoo wasps. We propose a phylogenetic hypothesis based on a comprehensive morphological study resulting in 300 characters coded for both living and extinct species. Phylogenetic relationships and divergence time estimation were simultaneously inferred in a Bayesian tip-dating framework, applying a relaxed morphological clock. Results unequivocally indicate Chrysididae to be monophyletic, as well as all traditionally recognized subfamilies and tribes. Within the Chrysidinae, Elampini was placed as the sister-group of the other three chrysidine tribes, with Parnopini as sister to the clade including Allocoeliini and Chrysidini. Dating analysis indicates that the major lineages started to differentiate around 130 Mya during the Early Cretaceous. The clades recognized as subfamilies started differentiating during the Palaeogene and the Neogene. Our results reveal an intricate process on the geographic evolution of chrysidid wasps and dispute previous ideas that Cretaceous-old splits in their early history could be associated with vicariant events related to the breakup between Africa and South America. The present-day southern disjunctions of some groups are interpreted as the outcome of more recent dispersals and extinctions of representatives from Nearctic and Palaearctic faunas during the Neogene, when northern continents became significantly colder.
Affiliation(s)
- Daercio A A Lucena
  - Laboratório de Biologia Comparada e Abelhas, Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Avenida Bandeirantes, 3900, Ribeirão Preto, SP, Brazil
- Eduardo A B Almeida
  - Laboratório de Biologia Comparada e Abelhas, Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Avenida Bandeirantes, 3900, Ribeirão Preto, SP, Brazil
|
9
|
May MR, Moore BR. A Bayesian Approach for Inferring the Impact of a Discrete Character on Rates of Continuous-Character Evolution in the Presence of Background-Rate Variation. Syst Biol 2020; 69:530-544. [PMID: 31665487 PMCID: PMC7608729 DOI: 10.1093/sysbio/syz069] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/13/2019] [Accepted: 10/21/2019] [Indexed: 11/14/2022]
Abstract
Understanding how and why rates of character evolution vary across the Tree of Life is central to many evolutionary questions; for example, does the trophic apparatus (a set of continuous characters) evolve at a higher rate in fish lineages that dwell in reef versus nonreef habitats (a discrete character)? Existing approaches for inferring the relationship between a discrete character and rates of continuous-character evolution rely on comparing a null model (in which rates of continuous-character evolution are constant across lineages) to an alternative model (in which rates of continuous-character evolution depend on the state of the discrete character under consideration). However, these approaches are susceptible to a "straw-man" effect: the influence of the discrete character is inflated because the null model is extremely unrealistic. Here, we describe MuSSCRat, a Bayesian approach for inferring the impact of a discrete trait on rates of continuous-character evolution in the presence of alternative sources of rate variation ("background-rate variation"). We demonstrate by simulation that our method is able to reliably infer the degree of state-dependent rate variation, and show that ignoring background-rate variation leads to biased inferences regarding the degree of state-dependent rate variation in grunts (the fish group Haemulidae). [Bayesian phylogenetic comparative methods; continuous-character evolution; data augmentation; discrete-character evolution.].
Affiliation(s)
- Michael R May
  - Department of Evolution and Ecology, University of California, Davis, Storer Hall, One Shields Avenue, Davis, CA 95616, USA
- Brian R Moore
  - Department of Evolution and Ecology, University of California, Davis, Storer Hall, One Shields Avenue, Davis, CA 95616, USA
|
10
|
Ji X, Zhang Z, Holbrook A, Nishimura A, Baele G, Rambaut A, Lemey P, Suchard MA. Gradients Do Grow on Trees: A Linear-Time O(N)-Dimensional Gradient for Statistical Phylogenetics. Mol Biol Evol 2020; 37:3047-3060. [PMID: 32458974 PMCID: PMC7530611 DOI: 10.1093/molbev/msaa130] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 01/21/2023]
Abstract
Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. Order O(N)-dimensional gradient calculations based on the standard pruning algorithm require O(N²) operations, where N is the number of sampled molecular sequences. With the advent of high-throughput sequencing, recent phylogenetic studies have analyzed hundreds to thousands of sequences, with an apparent trend toward even larger data sets as a result of advancing technology. Such large-scale analyses challenge phylogenetic reconstruction by requiring inference on larger sets of process parameters to model the increasing data heterogeneity. To make these analyses tractable, we present a linear-time algorithm for O(N)-dimensional gradient evaluation and apply it to general continuous-time Markov processes of sequence substitution on a phylogenetic tree without a need to assume either stationarity or reversibility. We apply this approach to learn the branch-specific evolutionary rates of three pathogenic viruses: West Nile virus, Dengue virus, and Lassa virus. Our proposed algorithm significantly improves inference efficiency with a 126- to 234-fold increase in maximum-likelihood optimization and a 16- to 33-fold computational performance increase in a Bayesian framework.
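For readers unfamiliar with the "standard pruning algorithm" this abstract refers to, the sketch below shows the post-order likelihood recursion for a single alignment site. It is not the linear-time gradient algorithm of the paper, only the likelihood calculation that such gradients build on. The Jukes-Cantor (JC69) substitution model, the nested-tuple tree encoding, and all function names are assumptions chosen to keep the example short.

```python
import math

BASES = "ACGT"


def jc69_matrix(t):
    """JC69 transition probabilities for a branch of length t (substitutions/site)."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partials(node):
    """Post-order pruning recursion.

    A node is either a tip (one of 'A', 'C', 'G', 'T') or a tuple
    (left_child, left_branch_length, right_child, right_branch_length).
    Returns, for each state at this node, the likelihood of the tips below it.
    """
    if isinstance(node, str):
        return [1.0 if base == node else 0.0 for base in BASES]
    left, t_left, right, t_right = node
    result = [1.0] * 4
    for child, t in ((left, t_left), (right, t_right)):
        p = jc69_matrix(t)
        child_partials = partials(child)
        for i in range(4):
            result[i] *= sum(p[i][j] * child_partials[j] for j in range(4))
    return result


def site_log_likelihood(tree):
    """Log-likelihood of one site, with uniform (0.25) root state frequencies."""
    return math.log(sum(0.25 * x for x in partials(tree)))


# Hypothetical three-taxon tree: ((A:0.1, A:0.1):0.05, C:0.2)
tree = (("A", 0.1, "A", 0.1), 0.05, "C", 0.2)
print(site_log_likelihood(tree))
```

Each internal node is visited once per evaluation, so one likelihood costs O(N) node updates; repeating that work for each of O(N) branch-specific parameters is what produces the O(N²) cost the abstract mentions.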
Affiliation(s)
- Xiang Ji
  - Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA
  - Department of Mathematics, School of Science & Engineering, Tulane University, New Orleans, LA
- Zhenyu Zhang
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA
- Andrew Holbrook
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA
- Akihiko Nishimura
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Guy Baele
  - Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
- Andrew Rambaut
  - Institute of Evolutionary Biology, Centre for Immunology, Infection and Evolution, University of Edinburgh, Edinburgh, United Kingdom
- Philippe Lemey
  - Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
- Marc A Suchard
  - Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA
|
11
|
Zwaenepoel A, Van de Peer Y. Model-Based Detection of Whole-Genome Duplications in a Phylogeny. Mol Biol Evol 2020; 37:2734-2746. [PMID: 32359154 DOI: 10.1093/molbev/msaa111] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 02/06/2023]
Abstract
Ancient whole-genome duplications (WGDs) leave signatures in comparative genomic data sets that can be harnessed to detect these events of presumed evolutionary importance. Current statistical approaches for the detection of ancient WGDs in a phylogenetic context have two main drawbacks. The first is that unwarranted restrictive assumptions on the "background" gene duplication and loss rates make inferences unreliable in the face of model violations. The second is that most methods can only be used to examine a limited set of a priori selected WGD hypotheses and cannot be used to discover WGDs in a phylogeny. In this study, we develop an approach for WGD inference using gene count data that seeks to overcome both issues. We employ a phylogenetic birth-death model that includes WGD in a flexible hierarchical Bayesian approach and use reversible-jump Markov chain Monte Carlo to perform Bayesian inference of branch-specific duplication, loss, and WGD retention rates across the space of WGD configurations. We evaluate the proposed method using simulations, apply it to data sets from flowering plants, and discuss the statistical intricacies of model-based WGD inference.
Affiliation(s)
- Arthur Zwaenepoel
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent, Belgium
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent, Belgium
  - Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
|
12
|
Guindon S. Rates and Rocks: Strengths and Weaknesses of Molecular Dating Methods. Front Genet 2020; 11:526. [PMID: 32536940 PMCID: PMC7267027 DOI: 10.3389/fgene.2020.00526] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/18/2019] [Accepted: 04/30/2020] [Indexed: 12/19/2022]
Abstract
I present here an in-depth, although non-exhaustive, review of two topics in molecular dating. Clock models, which describe the evolution of the rate of evolution, are considered first. Some of the shortcomings of popular approaches, uncorrelated clock models in particular, are presented and discussed. Autocorrelated models are shown to be more reasonable from a biological perspective. Some of the most recent autocorrelated models also rely on a coherent treatment of instantaneous and average substitution rates, whereas previous models are based on implicit approximations. Second, I provide a brief overview of the processes involved in collecting and preparing fossil data. I then review the main techniques that use these data for calibrating the molecular clock. I argue that, in its current form, the fossilized birth-death process relies on assumptions about the mechanisms underlying fossilization and the data collection process that may negatively impact the date estimates. Node-dating approaches make better use of the data available, even though they rest on paleontologists' intervention to prepare raw fossil data. Altogether, this study provides indications that may help practitioners in selecting appropriate methods for molecular dating. It will also hopefully help define the contours of future methodological developments in the field.
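The contrast between uncorrelated and autocorrelated clock models drawn in this abstract can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not any model from the review: the uncorrelated rates are independent lognormal draws, the autocorrelated rates follow a Brownian motion on the log scale along a single lineage, and the rate at the end of each branch is used as the branch rate, which is exactly the kind of implicit approximation the abstract says newer models avoid. Function names and parameter values are hypothetical.

```python
import math
import random


def uncorrelated_rates(n_branches, mean_rate, sigma, rng):
    """Independent lognormal branch rates with the requested mean."""
    # Shift the log-scale location so that the lognormal mean equals mean_rate.
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def autocorrelated_rates(branch_lengths, root_rate, nu, rng):
    """Rates along one lineage; the log rate drifts as Brownian motion.

    The variance of the change in log rate over a branch of length t is nu * t,
    so short branches inherit rates close to those of their parent.
    """
    rates = []
    log_rate = math.log(root_rate)
    for t in branch_lengths:
        log_rate = rng.gauss(log_rate, math.sqrt(nu * t))
        # Simplification: the end-of-branch rate stands in for the branch average.
        rates.append(math.exp(log_rate))
    return rates


rng = random.Random(42)  # fixed seed so the sketch is reproducible
print(uncorrelated_rates(5, mean_rate=1e-3, sigma=0.5, rng=rng))
print(autocorrelated_rates([0.1] * 5, root_rate=1e-3, nu=1.0, rng=rng))
```

Under the uncorrelated model a fast branch says nothing about its neighbours, whereas under the autocorrelated model consecutive branches tend to have similar rates, which is the biological argument the abstract makes in favour of autocorrelation.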
Affiliation(s)
- Stéphane Guindon
  - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CNRS and Université Montpellier (UMR 5506), Montpellier, France
|
13
|
Grosser K, Metzler D. Modeling methylation dynamics with simultaneous changes in CpG islands. BMC Bioinformatics 2020; 21:115. [PMID: 32183713 PMCID: PMC7079395 DOI: 10.1186/s12859-020-3438-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/14/2019] [Accepted: 03/02/2020] [Indexed: 11/24/2022]
Abstract
Background: In vertebrate genomes, CpG sites can be clustered into CpG islands, and the amount of methylation in a CpG island can change due to gene regulation processes. Thus, single regulatory events can simultaneously change the methylation states of many CpG sites within a CpG island. This should be taken into account when quantifying the amount of change in methylation, for example in the form of a branch length in a phylogeny of cell types. Results: We propose a probabilistic model (the IWE-SSE model) of methylation dynamics that accounts for simultaneous methylation changes in multiple CpG sites belonging to the same CpG island. We further propose a Markov-chain Monte-Carlo (MCMC) method to fit this model to methylation data from cell type phylogenies and apply this method to available data from murine haematopoietic cells and from human cell lines. Combined with simulation studies, these analyses show that accounting for CpG island wide methylation changes has a strong effect on the inferred branch lengths and leads to a significantly better model fit for the methylation data from murine haematopoietic cells and human cell lines. Conclusion: The MCMC-based parameter estimation method for the IWE-SSE model, in combination with our MCMC-based inference method, allows quantification of the amount of methylation change at single CpG sites as well as across entire CpG islands. Accounting for changes affecting entire islands can lead to more accurate branch length estimation in the presence of simultaneous methylation change.
Affiliation(s)
- Konrad Grosser
  - Department of Biology, Ludwig-Maximilians-Universität München, Großhaderner Straße 2, Planegg, 82152, Germany
- Dirk Metzler
  - Department of Biology, Ludwig-Maximilians-Universität München, Großhaderner Straße 2, Planegg, 82152, Germany
|
14
|
A model with many small shifts for estimating species-specific diversification rates. Nat Ecol Evol 2019; 3:1086-1092. [DOI: 10.1038/s41559-019-0908-0] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 09/04/2018] [Accepted: 04/26/2019] [Indexed: 01/23/2023]
|
15
|
Freyman WA, Höhna S. Cladogenetic and Anagenetic Models of Chromosome Number Evolution: A Bayesian Model Averaging Approach. Syst Biol 2018; 67:195-215. [PMID: 28945917 DOI: 10.1093/sysbio/syx065] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 11/08/2016] [Accepted: 07/01/2017] [Indexed: 11/14/2022] Open
Abstract
Chromosome number is a key feature of the higher-order organization of the genome, and changes in chromosome number play a fundamental role in evolution. Dysploid gains and losses in chromosome number, as well as polyploidization events, may drive reproductive isolation and lineage diversification. The recent development of probabilistic models of chromosome number evolution in the groundbreaking work by Mayrose et al. (2010, ChromEvol) has enabled the inference of ancestral chromosome numbers over molecular phylogenies and generated new interest in studying the role of chromosome changes in evolution. However, the ChromEvol approach assumes all changes occur anagenetically (along branches), and does not model events that are specifically cladogenetic. Cladogenetic changes may be expected if chromosome changes result in reproductive isolation. Here we present a new class of models of chromosome number evolution (called ChromoSSE) that incorporate both anagenetic and cladogenetic change. The ChromoSSE models allow us to determine the mode of chromosome number evolution: is chromosome evolution occurring primarily within lineages, primarily at lineage splitting, or in clade-specific combinations of both? Furthermore, we can estimate the location and timing of possible chromosome speciation events over the phylogeny. We implemented ChromoSSE in a Bayesian statistical framework, specifically in the software RevBayes, to accommodate uncertainty in parameter estimates while leveraging the full power of likelihood-based methods. We tested ChromoSSE's accuracy with simulations and re-examined chromosomal evolution in Aristolochia, Carex section Spirostachyae, Helianthus, Mimulus sensu lato (s.l.), and Primula section Aleuritia, finding evidence for clade-specific combinations of anagenetic and cladogenetic dysploid and polyploid modes of chromosome evolution.
[Anagenetic; Bayes factors; chromosome evolution; chromosome speciation; ChromoSSE; cladogenetic; dysploidy; phylogenetic models; polyploidy; reversible-jump Markov chain Monte Carlo; whole genome duplication.]
Affiliation(s)
- William A Freyman
  - Department of Integrative Biology, University of California, 3040 Valley Life Sciences Building #3140, Berkeley, CA 94720, USA
- Sebastian Höhna
  - Department of Integrative Biology, University of California, 3040 Valley Life Sciences Building #3140, Berkeley, CA 94720, USA
  - Department of Statistics, University of California, 367 Evans Hall, Berkeley, CA 94720, USA
|
16
|
Bromham L, Duchêne S, Hua X, Ritchie AM, Duchêne DA, Ho SYW. Bayesian molecular dating: opening up the black box. Biol Rev Camb Philos Soc 2017; 93:1165-1191. [DOI: 10.1111/brv.12390] [Citation(s) in RCA: 104] [Impact Index Per Article: 14.9] [Received: 01/09/2017] [Revised: 11/13/2017] [Accepted: 11/17/2017] [Indexed: 12/27/2022]
Affiliation(s)
- Lindell Bromham
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
- Sebastián Duchêne
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute; The University of Melbourne; Melbourne VIC 3010 Australia
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- Xia Hua
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
- Andrew M. Ritchie
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- David A. Duchêne
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- Simon Y. W. Ho
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
|
17
|
Lartillot N, Phillips MJ, Ronquist F. A mixed relaxed clock model. Philos Trans R Soc Lond B Biol Sci 2017; 371:rstb.2015.0132. [PMID: 27325829 PMCID: PMC4920333 DOI: 10.1098/rstb.2015.0132] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Accepted: 02/29/2016] [Indexed: 12/13/2022] Open
Abstract
Over recent years, several alternative relaxed clock models have been proposed in the context of Bayesian dating. These models fall in two distinct categories: uncorrelated and autocorrelated across branches. The choice between these two classes of relaxed clocks is still an open question. More fundamentally, the true process of rate variation may have both long-term trends and short-term fluctuations, suggesting that more sophisticated clock models unfolding over multiple time scales should ultimately be developed. Here, a mixed relaxed clock model is introduced, which can be mechanistically interpreted as a rate variation process undergoing short-term fluctuations on top of Brownian long-term trends. Statistically, this mixed clock represents an alternative solution to the problem of choosing between autocorrelated and uncorrelated relaxed clocks, by proposing instead to combine their respective merits. Fitting this model on a dataset of 105 placental mammals, using both node-dating and tip-dating approaches, suggests that the two pure clocks, Brownian and white noise, are rejected in favour of a mixed model with approximately equal contributions for its uncorrelated and autocorrelated components. The tip-dating analysis is particularly sensitive to the choice of the relaxed clock model. In this context, the classical pure Brownian relaxed clock appears to be overly rigid, leading to biases in divergence time estimation. By contrast, the use of a mixed clock leads to more recent and more reasonable estimates for the crown ages of placental orders and superorders. Altogether, the mixed clock introduced here represents a first step towards empirically more adequate models of the patterns of rate variation across phylogenetic trees. This article is part of the themed issue 'Dating species divergences using rocks and clocks'.
Affiliation(s)
- Nicolas Lartillot
  - Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, Université Claude Bernard Lyon 1, F-69622 Villeurbanne Cedex, France
- Matthew J Phillips
  - School of Earth, Environmental and Biological Sciences, Queensland University of Technology, Brisbane, Australia
- Fredrik Ronquist
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, 104 05 Stockholm, Sweden
|
18
|
Lee HJ, Kishino H, Rodrigue N, Thorne JL. Grouping substitution types into different relaxed molecular clocks. Philos Trans R Soc Lond B Biol Sci 2017; 371:rstb.2015.0141. [PMID: 27325837 DOI: 10.1098/rstb.2015.0141] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 01/07/2016] [Indexed: 11/12/2022] Open
Abstract
Different types of nucleotide substitutions experience different patterns of rate change over time. We propose clustering context-dependent (or context-independent) nucleotide substitution types according to how their rates change and then using the grouping for divergence time estimation. With our models, relative rates among types that are in the same group are fixed, whereas absolute rates of the types within a group change over time according to a shared relaxed molecular clock. We illustrate our procedure by analysing a 0.15 Mb intergenic region to infer divergence times relating eight primates. The different groupings of substitution types that we explore have little effect on the posterior means of divergence times, but the widths of the credibility intervals decrease as the number of groups increases. This article is part of the themed issue 'Dating species divergences using rocks and clocks'.
Affiliation(s)
- Hui-Jie Lee
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Hirohisa Kishino
  - Laboratory of Biometrics and Bioinformatics, University of Tokyo, Tokyo, Japan
- Nicolas Rodrigue
  - Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, Ottawa, Ontario, Canada
- Jeffrey L Thorne
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
|
19
|
Discovery or Extinction of New Scleroderma Species in Amazonia? PLoS One 2016; 11:e0167879. [PMID: 28002414 PMCID: PMC5176273 DOI: 10.1371/journal.pone.0167879] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/27/2016] [Accepted: 11/15/2016] [Indexed: 11/19/2022] Open
Abstract
The Amazon Forest is a hotspot of biodiversity harboring an unknown number of undescribed taxa. Inventory studies are urgent, mainly in the areas most endangered by human activities such as extensive dam construction, where species could be at risk of extinction before being described and named. In 2015, intensive studies performed in a few locations in the Brazilian Amazon rainforest revealed three new species of the genus Scleroderma: S. anomalosporum, S. camassuense and S. duckei. The first two species were located in one of the many areas flooded by construction of hydroelectric dams throughout the Amazon, and the third in the Reserva Florestal Adolpho Ducke, a reserve protected by the INPA. The species were identified through morphology and molecular analyses of barcoding sequences (Internal Transcribed Spacer nrDNA). Scleroderma anomalosporum is characterized mainly by the smooth spores under LM in mature basidiomata (under SEM with small, unevenly distributed granules, a characteristic not observed in other species of the genus), the large size of the basidiomata, up to 120 mm diameter, and the stelliform dehiscence; S. camassuense mainly by the irregular to stellate dehiscence, the subreticulated spores and the bright sulfur-yellow colour; and S. duckei mainly by the verrucose exoperidium, stelliform dehiscence, and verrucose spores. Description, illustration and affinities with other species of the genus are provided.
|
20
|
Critically evaluating the theory and performance of Bayesian analysis of macroevolutionary mixtures. Proc Natl Acad Sci U S A 2016; 113:9569-74. [PMID: 27512038 DOI: 10.1073/pnas.1518659113] [Citation(s) in RCA: 173] [Impact Index Per Article: 21.6] [Indexed: 11/18/2022] Open
Abstract
Bayesian analysis of macroevolutionary mixtures (BAMM) has recently taken the study of lineage diversification by storm. BAMM estimates the diversification-rate parameters (speciation and extinction) for every branch of a study phylogeny and infers the number and location of diversification-rate shifts across branches of a tree. Our evaluation of BAMM reveals two major theoretical errors: (i) the likelihood function (which estimates the model parameters from the data) is incorrect, and (ii) the compound Poisson process prior model (which describes the prior distribution of diversification-rate shifts across branches) is incoherent. Using simulation, we demonstrate that these theoretical issues cause statistical pathologies; posterior estimates of the number of diversification-rate shifts are strongly influenced by the assumed prior, and estimates of diversification-rate parameters are unreliable. Moreover, the inability to correctly compute the likelihood or to correctly specify the prior for rate-variable trees precludes the use of Bayesian approaches for testing hypotheses regarding the number and location of diversification-rate shifts using BAMM.
|
21
|
Höhna S, Landis MJ, Heath TA, Boussau B, Lartillot N, Moore BR, Huelsenbeck JP, Ronquist F. RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language. Syst Biol 2016; 65:726-36. [PMID: 27235697 PMCID: PMC4911942 DOI: 10.1093/sysbio/syw021] [Citation(s) in RCA: 306] [Impact Index Per Article: 38.3] [Received: 04/01/2015] [Revised: 03/02/2015] [Accepted: 03/01/2015] [Indexed: 01/12/2023] Open
Abstract
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; graphical models; MCMC; statistical phylogenetics.]
Affiliation(s)
- Sebastian Höhna
  - Department of Integrative Biology; Department of Statistics, University of California, Berkeley, CA 94720, USA; Department of Evolution and Ecology, University of California, Davis, CA 95616, USA; Department of Mathematics, Stockholm University, Stockholm, SE-106 91 Stockholm, Sweden
- Tracy A Heath
  - Department of Integrative Biology; Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA; Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- Bastien Boussau
  - Department of Integrative Biology; Laboratoire de Biométrie et Biologie Evolutive, Centre National de la Recherche Scientifique, Unité Mixte de Recherche 5558, Université Lyon 1, F-69622 Villeurbanne, France
- Nicolas Lartillot
  - Laboratoire de Biométrie et Biologie Evolutive, Centre National de la Recherche Scientifique, Unité Mixte de Recherche 5558, Université Lyon 1, F-69622 Villeurbanne, France
- Brian R Moore
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Fredrik Ronquist
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-10405 Stockholm, Sweden
|
22
|
May MR, Höhna S, Moore BR. A Bayesian approach for detecting the impact of mass-extinction events on molecular phylogenies when rates of lineage diversification may vary. Methods Ecol Evol 2016. [DOI: 10.1111/2041-210x.12563] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Indexed: 02/02/2023]
Affiliation(s)
- Michael R. May
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Sebastian Höhna
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
  - Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
  - Department of Statistics, University of California, Berkeley, CA 94720, USA
  - Department of Mathematics, Stockholm University, Stockholm SE-106 91, Sweden
- Brian R. Moore
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
|
23
|
Rambaut A, Lam TT, Max Carvalho L, Pybus OG. Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen). Virus Evol 2016; 2:vew007. [PMID: 27774300 PMCID: PMC4989882 DOI: 10.1093/ve/vew007] [Citation(s) in RCA: 1229] [Impact Index Per Article: 153.6] [Indexed: 12/23/2022] Open
Abstract
Gene sequences sampled at different points in time can be used to infer molecular phylogenies on a natural timescale of months or years, provided that the sequences in question undergo measurable amounts of evolutionary change between sampling times. Data sets with this property are termed heterochronous and have become increasingly common in several fields of biology, most notably the molecular epidemiology of rapidly evolving viruses. Here we introduce the cross-platform software tool, TempEst (formerly known as Path-O-Gen), for the visualization and analysis of temporally sampled sequence data. Given a molecular phylogeny and the dates of sampling for each sequence, TempEst uses an interactive regression approach to explore the association between genetic divergence through time and sampling dates. TempEst can be used to (1) assess whether there is sufficient temporal signal in the data to proceed with phylogenetic molecular clock analysis, and (2) identify sequences whose genetic divergence and sampling date are incongruent. Examination of the latter can help identify data quality problems, including errors in data annotation, sample contamination, sequence recombination, or alignment error. We recommend that all users of the molecular clock models implemented in BEAST first check their data using TempEst prior to analysis.
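The regression of genetic divergence on sampling date described in this abstract can be illustrated with a minimal sketch. This is not TempEst's implementation; it is an ordinary least-squares fit in plain Python, and the function name `root_to_tip_regression` and the example inputs are hypothetical. It assumes that `dates` holds sampling times in decimal years and `divergences` holds root-to-tip distances in substitutions per site, which in practice would be read from a rooted phylogeny. Under those assumptions the slope approximates the evolutionary rate, the x-intercept approximates the date of the root, and R² gives an informal indication of temporal signal.

```python
from statistics import mean


def root_to_tip_regression(dates, divergences):
    """Least-squares fit of root-to-tip divergence on sampling date.

    Hypothetical helper for illustration only (not part of TempEst).
    Returns (slope, intercept, r_squared).
    """
    if len(dates) != len(divergences) or len(dates) < 3:
        raise ValueError("need at least three paired observations")
    x_bar, y_bar = mean(dates), mean(divergences)
    # Sums of squares and cross-products around the means.
    sxx = sum((x - x_bar) ** 2 for x in dates)
    syy = sum((y - y_bar) ** 2 for y in divergences)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dates, divergences))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and divergences must each vary")
    slope = sxy / sxx                    # rate estimate (subs/site/year)
    intercept = y_bar - slope * x_bar    # divergence at date 0
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up example values, chosen only to show the call pattern.
example_dates = [2000.0, 2002.0, 2004.0, 2006.0]
example_divergences = [0.010, 0.012, 0.014, 0.016]
rate, icpt, r2 = root_to_tip_regression(example_dates, example_divergences)
root_date = -icpt / rate  # x-intercept: date at which divergence is zero
print(f"rate={rate:.4g} root_date={root_date:.1f} R2={r2:.3f}")
```

Sequences whose residuals from the fitted line are unusually large would be the candidates for the data quality checks the abstract mentions; how large counts as unusual is a judgment this sketch does not address.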
Affiliation(s)
- Andrew Rambaut
  - Institute of Evolutionary Biology; Centre for Immunity, Infection and Evolution, University of Edinburgh, Ashworth Laboratories, King's Buildings, Edinburgh EH9 3JT, UK
- Tommy T Lam
  - School of Public Health, University of Hong Kong, Hong Kong SAR, China
- Oliver G Pybus
  - Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK
|
24
|
Zhang C, Stadler T, Klopfstein S, Heath TA, Ronquist F. Total-Evidence Dating under the Fossilized Birth-Death Process. Syst Biol 2016; 65:228-49. [PMID: 26493827 PMCID: PMC4748749 DOI: 10.1093/sysbio/syv080] [Citation(s) in RCA: 197] [Impact Index Per Article: 24.6] [Received: 02/06/2015] [Accepted: 10/12/2015] [Indexed: 11/16/2022] Open
Abstract
Bayesian total-evidence dating involves the simultaneous analysis of morphological data from the fossil record and morphological and sequence data from recent organisms, and it accommodates the uncertainty in the placement of fossils while dating the phylogenetic tree. Due to the flexibility of the Bayesian approach, total-evidence dating can also incorporate additional sources of information. Here, we take advantage of this and expand the analysis to include information about fossilization and sampling processes. Our work is based on the recently described fossilized birth-death (FBD) process, which has been used to model speciation, extinction, and fossilization rates that can vary over time in a piecewise manner. So far, sampling of extant and fossil taxa has been assumed to be either complete or uniformly at random, an assumption which is only valid for a minority of data sets. We therefore extend the FBD process to accommodate diversified sampling of extant taxa, which is standard practice in studies of higher-level taxa. We verify the implementation using simulations and apply it to the early radiation of Hymenoptera (wasps, ants, and bees). Previous total-evidence dating analyses of this data set were based on a simple uniform tree prior and dated the initial radiation of extant Hymenoptera to the late Carboniferous (309 Ma). The analyses using the FBD prior under diversified sampling, however, date the radiation to the Triassic and Permian (252 Ma), slightly older than the age of the oldest hymenopteran fossils. By exploring a variety of FBD model assumptions, we show that it is mainly the accommodation of diversified sampling that causes the push toward more recent divergence times. Accounting for diversified sampling thus has the potential to close the long-discussed gap between rocks and clocks. We conclude that the explicit modeling of fossilization and sampling processes can improve divergence time estimates, but only if all important model aspects, including sampling biases, are adequately addressed.
Affiliation(s)
- Chi Zhang
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05 Stockholm, Sweden
- Tanja Stadler
  - Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich, 4053 Basel, Switzerland; Swiss Institute of Bioinformatics (SIB), Switzerland
- Seraina Klopfstein
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05 Stockholm, Sweden; Department of Invertebrates, Natural History Museum Bern, CH-3005 Bern, Switzerland
- Tracy A Heath
  - Department of Integrative Biology, University of California, Berkeley, CA 94720 USA; Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA; Department of Ecology, Evolution & Organismal Biology, Iowa State University, Ames, IA 50011, USA
- Fredrik Ronquist
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, SE-104 05 Stockholm, Sweden
|
25
|
Ismail SI, Batzer JC, Harrington TC, Crous PW, Lavrov DV, Li H, Gleason ML. Ancestral state reconstruction infers phytopathogenic origins of sooty blotch and flyspeck fungi on apple. Mycologia 2016; 108:292-302. [PMID: 26740537 DOI: 10.3852/15-036] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 02/28/2015] [Accepted: 11/07/2015] [Indexed: 01/07/2023]
Abstract
Members of the sooty blotch and flyspeck (SBFS) complex are epiphytic fungi in the Ascomycota that cause economically damaging blemishes of apples worldwide. SBFS fungi are polyphyletic, but approx. 96% of SBFS species are in the Capnodiales. Evolutionary origins of SBFS fungi remain unclear, so we attempted to infer their origins by means of ancestral state reconstruction on a phylogenetic tree built utilizing genes for the nuc 28S rDNA (approx. 830 bp from near the 5′ end) and the second largest subunit of RNA polymerase II (RPB2). The analyzed taxa included the well-known genera of SBFS as well as non-SBFS fungi from seven families within the Capnodiales. The non-SBFS taxa were selected based on their distinct ecological niches, including plant-parasitic and saprophytic species. The phylogenetic analyses revealed that most SBFS species in the Capnodiales are closely related to plant-parasitic fungi. Ancestral state reconstruction provided strong evidence that plant-parasitic fungi were the ancestors of the major SBFS lineages. Knowledge gained from this study may help to better understand the ecology and evolution of epiphytic fungi.
Affiliation(s)
- Siti Izera Ismail
  - Department of Plant Protection, Faculty of Agriculture, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia
- Jean Carlson Batzer
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011
- Thomas C Harrington
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011
- Pedro W Crous
  - CBS-KNAW Fungal Biodiversity Centre, Uppsalalaan 8, 3584 CT Utrecht, the Netherlands
  - Microbiology, Department of Biology, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
  - Wageningen University and Research Center (WUR), Laboratory of Phytopathology, Droevendaalsesteeg 1, 6708 PB Wageningen, the Netherlands
- Dennis V Lavrov
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa 50011
- Huanyu Li
  - Department of Plant Pathology, Gansu Agricultural University, Lanzhou City, Gansu, China
- Mark L Gleason
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa 50011
|
26
|
dos Reis M, Donoghue PCJ, Yang Z. Bayesian molecular clock dating of species divergences in the genomics era. Nat Rev Genet 2015; 17:71-80. [PMID: 26688196 DOI: 10.1038/nrg.2015.8] [Citation(s) in RCA: 151] [Impact Index Per Article: 16.8] [Indexed: 01/14/2023]
Abstract
Five decades have passed since the proposal of the molecular clock hypothesis, which states that the rate of evolution at the molecular level is constant through time and among species. This hypothesis has become a powerful tool in evolutionary biology, making it possible to use molecular sequences to estimate the geological ages of species divergence events. With recent advances in Bayesian clock dating methodology and the explosive accumulation of genetic sequence data, molecular clock dating has found widespread applications, from tracking virus pandemics and studying the macroevolutionary process of speciation and extinction to estimating a timescale for life on Earth.
Affiliation(s)
- Mario dos Reis
  - Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
  - School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
- Philip C J Donoghue
  - School of Earth Sciences, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, UK
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
|
27
|
Lartillot N. Probabilistic models of eukaryotic evolution: time for integration. Philos Trans R Soc Lond B Biol Sci 2015; 370:20140338. [PMID: 26323768 PMCID: PMC4571576 DOI: 10.1098/rstb.2014.0338] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Accepted: 06/03/2015] [Indexed: 11/12/2022] Open
Abstract
In spite of substantial work and recent progress, a global and fully resolved picture of the macroevolutionary history of eukaryotes is still under construction. This concerns not only the phylogenetic relations among major groups, but also the general characteristics of the underlying macroevolutionary processes, including the patterns of gene family evolution associated with endosymbioses, as well as their impact on the sequence evolutionary process. All these questions raise formidable methodological challenges, calling for a more powerful statistical paradigm. In this direction, model-based probabilistic approaches have played an increasingly important role. In particular, improved models of sequence evolution accounting for heterogeneities across sites and across lineages have led to significant, although insufficient, improvement in phylogenetic accuracy. More recently, one main trend has been to move away from simple parametric models and stepwise approaches, towards integrative models explicitly considering the intricate interplay between multiple levels of macroevolutionary processes. Such integrative models are in their infancy, and their application to the phylogeny of eukaryotes still requires substantial improvement of the underlying models, as well as additional computational developments.
Affiliation(s)
- Nicolas Lartillot
  - Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, Université Claude Bernard Lyon 1, F-69622 Villeurbanne Cedex, France
|
28
|
Huang J, Zhao Y, Bai D, Shiraigol W, Li B, Yang L, Wu J, Bao W, Ren X, Jin B, Zhao Q, Li A, Bao S, Bao W, Xing Z, An A, Gao Y, Wei R, Bao Y, Bao T, Han H, Bai H, Bao Y, Zhang Y, Daidiikhuu D, Zhao W, Liu S, Ding J, Ye W, Ding F, Sun Z, Shi Y, Zhang Y, Meng H, Dugarjaviin M. Donkey genome and insight into the imprinting of fast karyotype evolution. Sci Rep 2015; 5:14106. [PMID: 26373886 PMCID: PMC4571621 DOI: 10.1038/srep14106] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 03/02/2015] [Accepted: 08/17/2015] [Indexed: 11/20/2022] Open
Abstract
The donkey, like the horse, is a promising model for exploring karyotypic instability. We report the de novo whole-genome assemblies of the donkey and the Asiatic wild ass. Our results reflect the distinct characteristics of donkeys, including more effective energy metabolism and better immunity than horses. The donkey shows a steady demographic trajectory. We detected abundant satellite sequences in some inactive centromere regions but not in neocentromere regions, while ribosomal RNAs frequently emerged in neocentromere regions but not in the obsolete centromere regions. Expanded miRNA families and five newly discovered miRNA target genes involved in meiosis may be associated with fast karyotype evolution. APC/C, which controls sister chromatid segregation, cytokinesis, and the establishment of the G1 cell cycle phase, was identified by analysis of miRNA targets and rapidly evolving genes.
Affiliation(s)
- Jinlong Huang
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Yiping Zhao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Dongyi Bai
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Wunierfu Shiraigol
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Bei Li
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Lihua Yang
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Jing Wu
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Wuyundalai Bao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Xiujuan Ren
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Burenqiqige Jin
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Qinan Zhao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Anaer Li
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Sarula Bao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Wuyingga Bao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Zhencun Xing
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Aoruga An
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Yahan Gao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Ruiyuan Wei
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Yirugeletu Bao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Taoketao Bao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Haige Han
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Haitang Bai
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Yanqing Bao
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Yuhong Zhang
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Dorjsuren Daidiikhuu
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
- Wenjing Zhao
- School of Agriculture and Biology, Shanghai Jiaotong University; Shanghai Key Laboratory of Veterinary Biotechnology, 800 Dongchuan Road, Shanghai 200240, P. R. China
- Shuyun Liu
- School of Agriculture and Biology, Shanghai Jiaotong University; Shanghai Key Laboratory of Veterinary Biotechnology, 800 Dongchuan Road, Shanghai 200240, P. R. China
- Jinmei Ding
- School of Agriculture and Biology, Shanghai Jiaotong University; Shanghai Key Laboratory of Veterinary Biotechnology, 800 Dongchuan Road, Shanghai 200240, P. R. China
- Weixing Ye
- Shanghai Personal Biotechnology Limited Company, 218 Yindu Road, Shanghai 200231, P. R. China
- Fangmei Ding
- Shanghai Personal Biotechnology Limited Company, 218 Yindu Road, Shanghai 200231, P. R. China
- Zikui Sun
- Shanghai Personal Biotechnology Limited Company, 218 Yindu Road, Shanghai 200231, P. R. China
- Yixiang Shi
- Shanghai Personal Biotechnology Limited Company, 218 Yindu Road, Shanghai 200231, P. R. China
- Yan Zhang
- SRA Inc., 6003 Executive Blvd., Suite 400, Rockville, MD 20852, USA
- He Meng
- School of Agriculture and Biology, Shanghai Jiaotong University; Shanghai Key Laboratory of Veterinary Biotechnology, 800 Dongchuan Road, Shanghai 200240, P. R. China
- Manglai Dugarjaviin
- College of Animal Science, Inner Mongolia Agricultural University, 306 Zhaowuda Road, Hohhot 010018, P. R. China
29
|
Lee HJ, Rodrigue N, Thorne JL. Relaxing the Molecular Clock to Different Degrees for Different Substitution Types. Mol Biol Evol 2015; 32:1948-61. [PMID: 25931515 PMCID: PMC4833082 DOI: 10.1093/molbev/msv099] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/06/2023] Open
Abstract
Rates of molecular evolution can vary over time. Diverse statistical techniques for divergence time estimation have been developed to accommodate this variation. These typically require that all sequence (or codon) positions at a locus change independently of one another. They also generally assume that the rates of different types of nucleotide substitutions vary across a phylogeny in the same way. This permits divergence time estimation procedures to employ an instantaneous rate matrix with relative rates that do not differ among branches. However, previous studies have suggested that some substitution types (e.g., CpG to TpG changes in mammals) are more clock-like than others. As has been previously noted, this is biologically plausible given the mutational mechanism of CpG to TpG changes. Through stochastic mapping of sequence histories from context-independent substitution models, our approach allows for context-dependent nucleotide substitutions to change their relative rates over time. We apply our approach to the analysis of a 0.15 Mb intergenic region from eight primates. In accord with previous findings, we find comparatively little rate variation over time for CpG to TpG substitutions but we find more for other substitution types. We conclude by discussing the limitations and prospects of our approach.
Affiliation(s)
- Hui-Jie Lee
- Department of Statistics, North Carolina State University
- Jeffrey L Thorne
- Department of Statistics, North Carolina State University; Department of Biological Sciences, North Carolina State University
30
|
Wikström N, Kainulainen K, Razafimandimbison SG, Smedmark JEE, Bremer B. A revised time tree of the asterids: establishing a temporal framework for evolutionary studies of the coffee family (Rubiaceae). PLoS One 2015; 10:e0126690. [PMID: 25996595 PMCID: PMC4462594 DOI: 10.1371/journal.pone.0126690] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 12/09/2014] [Accepted: 04/07/2015] [Indexed: 11/19/2022] Open
Abstract
Divergence time analyses in the coffee family (Rubiaceae) have all relied on the same Gentianales crown group age estimate, reported by an earlier analysis of the asterids, for defining the upper age bound of the root node in their analyses. However, not only did the asterid analysis suffer from several analytical shortcomings, but the estimate itself has been used in highly inconsistent ways in these Rubiaceae analyses. Based on the original data, we here reanalyze the divergence times of the asterids using relaxed-clock models and 14 fossil-based minimum age constraints. We also expand the data set to include an additional 67 taxa from Rubiaceae sampled across all three subfamilies recognized in the family. Three analyses are conducted: a separate analysis of the asterids, which completely mirrors the original asterid analysis in terms of taxon sample and data; a separate analysis of the Gentianales, where the result from the first analysis is used for defining a secondary root calibration point; and a combined analysis where all taxa are analyzed simultaneously. Results are presented in the form of a time-calibrated phylogeny, and age estimates for asterid groups, Gentianales, and major groups of Rubiaceae are compared and discussed in relation to previously published estimates. Our updated age estimates for major groups of Rubiaceae provide a significant step forward towards the long term goal of establishing a robust temporal framework for the divergence of this biologically diverse and fascinating group of plants.
Affiliation(s)
- Niklas Wikström
- Bergius Foundation, The Royal Swedish Academy of Sciences and Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-10691, Stockholm, Sweden
- Kent Kainulainen
- Bergius Foundation, The Royal Swedish Academy of Sciences and Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-10691, Stockholm, Sweden
- Sylvain G. Razafimandimbison
- Bergius Foundation, The Royal Swedish Academy of Sciences and Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-10691, Stockholm, Sweden
- Jenny E. E. Smedmark
- University of Bergen, University Museum of Bergen, The Natural History Collections, Post Box 7800, NO-5020 Bergen, Norway
- Birgitta Bremer
- Bergius Foundation, The Royal Swedish Academy of Sciences and Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-10691, Stockholm, Sweden
31
|
Ho SYW, Duchêne S. Molecular-clock methods for estimating evolutionary rates and timescales. Mol Ecol 2014; 23:5947-65. [DOI: 10.1111/mec.12953] [Citation(s) in RCA: 225] [Impact Index Per Article: 22.5] [Received: 06/23/2014] [Revised: 09/29/2014] [Accepted: 09/30/2014] [Indexed: 11/29/2022]
Affiliation(s)
- Simon Y. W. Ho
- School of Biological Sciences, University of Sydney, Sydney, NSW 2006, Australia
- Sebastián Duchêne
- School of Biological Sciences, University of Sydney, Sydney, NSW 2006, Australia
32
|
Uyeda JC, Harmon LJ. A novel Bayesian method for inferring and interpreting the dynamics of adaptive landscapes from phylogenetic comparative data. Syst Biol 2014; 63:902-18. [PMID: 25077513 DOI: 10.1093/sysbio/syu057] [Citation(s) in RCA: 177] [Impact Index Per Article: 17.7] [Indexed: 11/13/2022] Open
Abstract
Our understanding of macroevolutionary patterns of adaptive evolution has greatly increased with the advent of large-scale phylogenetic comparative methods. Widely used Ornstein-Uhlenbeck (OU) models can describe an adaptive process of divergence and selection. However, inference of the dynamics of adaptive landscapes from comparative data is complicated by interpretational difficulties, lack of identifiability among parameter values and the common requirement that adaptive hypotheses must be assigned a priori. Here, we develop a reversible-jump Bayesian method of fitting multi-optima OU models to phylogenetic comparative data that estimates the placement and magnitude of adaptive shifts directly from the data. We show how biologically informed hypotheses can be tested against this inferred posterior of shift locations using Bayes Factors to establish whether our a priori models adequately describe the dynamics of adaptive peak shifts. Furthermore, we show how the inclusion of informative priors can be used to restrict models to biologically realistic parameter space and test particular biological interpretations of evolutionary models. We argue that Bayesian model fitting of OU models to comparative data provides a framework for integrating multiple sources of biological data, such as microevolutionary estimates of selection parameters and paleontological time series, allowing inference of adaptive landscape dynamics with explicit, process-based biological interpretations.
Affiliation(s)
- Josef C Uyeda
- Department of Biological Sciences, University of Idaho, Life Sciences South 252, 875 Perimeter Dr MS 3051, Moscow, ID
- Luke J Harmon
- Department of Biological Sciences, University of Idaho, Life Sciences South 252, 875 Perimeter Dr MS 3051, Moscow, ID
33
|
Bellot S, Renner SS. Exploring new dating approaches for parasites: the worldwide Apodanthaceae (Cucurbitales) as an example. Mol Phylogenet Evol 2014; 80:1-10. [PMID: 25057774 DOI: 10.1016/j.ympev.2014.07.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 01/07/2014] [Revised: 07/04/2014] [Accepted: 07/07/2014] [Indexed: 11/27/2022]
Abstract
Gene trees of holoparasitic plants usually show distinctly longer branch lengths than seen in photosynthetic closest relatives. Such substitution rate jumps have made it difficult to infer the absolute divergence times of parasites. An additional problem is that parasite clades often lack a fossil record. Using nuclear and mitochondrial DNA sequences of Apodanthaceae, a worldwide family of endoparasites living inside Fabaceae and Salicaceae, we compared several dating approaches: (i) an uncorrelated lognormal (UCLN) model calibrated with outgroup fossils, (ii) ages of host lineages as a maximal age in an UCLN model, (iii) user-assigned local clocks, and (iv) outgroup-fossil-calibrated random local clocks (RLC) with varying prior probabilities on the number of permitted rate changes (RLCu and RLCp models), a variable that has never been explored. The resulting dated phylogenies include all 10 species of the family, three in Australia, one in Iran, one in Africa, and the remainder in the Americas. All clock models infer a drastic rate jump between nonparasitic outgroups and Apodanthaceae, but since they distribute the rate heterogeneity differently, they result in much-different age estimates. Bayes factors using path and stepping-stone sampling indicated that the RLCp model fit poorly, while for matR, topologically unconstrained RLCu and UCLN models did not differ significantly and for 18S, the UCLN model was preferred. Under the equally well fitting models, the Apodanthaceae appear to be a relatively old clade, with a stem age falling between 65 and 81 my, the divergence of Apodanthes from Pilostyles between 36 and 57 my ago, and the crown age of the Australian clade 8-18 my ago. In our study system, host-age calibrations did not yield well-constrained results, but they may work better in other parasite clades. For small data sets where statistical convergence can be reached even with complex models, random local clocks should be explored as an alternative to the exclusive reliance on UCLN clocks.
Affiliation(s)
- Sidonie Bellot
- Systematic Botany and Mycology, University of Munich (LMU), Menzinger Str. 67, 80638 Munich, Germany
- Susanne S Renner
- Systematic Botany and Mycology, University of Munich (LMU), Menzinger Str. 67, 80638 Munich, Germany
34
|
Abstract
Motivation: Brownian models have been introduced in phylogenetics for describing variation in substitution rates through time, with applications to molecular dating or to the comparative analysis of variation in substitution patterns among lineages. Thus far, however, the Monte Carlo implementations of these models have relied on crude approximations, in which the Brownian process is sampled only at the internal nodes of the phylogeny or at the midpoints along each branch, and the unknown trajectory between these sampled points is summarized by simple branchwise average substitution rates. Results: A more accurate Monte Carlo approach is introduced, explicitly sampling a fine-grained discretization of the trajectory of the (potentially multivariate) Brownian process along the phylogeny. Generic Monte Carlo resampling algorithms are proposed for updating the Brownian paths along and across branches. Specific computational strategies are developed for efficient integration of the finite-time substitution probabilities across branches induced by the Brownian trajectory. The mixing properties and the computational complexity of the resulting Markov chain Monte Carlo sampler scale reasonably with the discretization level, allowing practical applications with up to a few hundred discretization points along the entire depth of the tree. The method can be generalized to other Markovian stochastic processes, making it possible to implement a wide range of time-dependent substitution models with well-controlled computational precision. Availability: The program is freely available at www.phylobayes.org.
Affiliation(s)
- Benjamin Horvilleur
- Université de Lyon, Université Lyon 1, CNRS; UMR 5558, Laboratoire de Biométrie, Biologie Évolutive, F-69622 Villeurbanne, France
- Nicolas Lartillot
- Université de Lyon, Université Lyon 1, CNRS; UMR 5558, Laboratoire de Biométrie, Biologie Évolutive, F-69622 Villeurbanne, France
35
|
Heath TA, Huelsenbeck JP, Stadler T. The fossilized birth-death process for coherent calibration of divergence-time estimates. Proc Natl Acad Sci U S A 2014; 111:E2957-66. [PMID: 25009181 PMCID: PMC4115571 DOI: 10.1073/pnas.1319091111] [Citation(s) in RCA: 358] [Impact Index Per Article: 35.8] [Indexed: 12/25/2022] Open
Abstract
Time-calibrated species phylogenies are critical for addressing a wide range of questions in evolutionary biology, such as those that elucidate historical biogeography or uncover patterns of coevolution and diversification. Because molecular sequence data are not informative on absolute time, external data--most commonly, fossil age estimates--are required to calibrate estimates of species divergence dates. For Bayesian divergence time methods, the common practice for calibration using fossil information involves placing arbitrarily chosen parametric distributions on internal nodes, often disregarding most of the information in the fossil record. We introduce the "fossilized birth-death" (FBD) process--a model for calibrating divergence time estimates in a Bayesian framework, explicitly acknowledging that extant species and fossils are part of the same macroevolutionary process. Under this model, absolute node age estimates are calibrated by a single diversification model and arbitrary calibration densities are not necessary. Moreover, the FBD model allows for inclusion of all available fossils. We performed analyses of simulated data and show that node age estimation under the FBD model results in robust and accurate estimates of species divergence times with realistic measures of statistical uncertainty, overcoming major limitations of standard divergence time estimation methods. We used this model to estimate the speciation times for a dataset composed of all living bears, indicating that the genus Ursus diversified in the Late Miocene to Middle Pliocene.
Affiliation(s)
- Tracy A Heath
- Department of Integrative Biology, University of California, Berkeley, CA 94720; Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045
- John P Huelsenbeck
- Department of Integrative Biology, University of California, Berkeley, CA 94720; Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Tanja Stadler
- Department of Environmental Systems Science, Eidgenössische Technische Hochschule Zürich, 8092 Zurich, Switzerland; Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich, 4058 Basel, Switzerland
36
|
Schweizer M, Güntert M, Seehausen O, Leuenberger C, Hertwig ST. Parallel adaptations to nectarivory in parrots, key innovations and the diversification of the Loriinae. Ecol Evol 2014; 4:2867-83. [PMID: 25165525 PMCID: PMC4130445 DOI: 10.1002/ece3.1131] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 04/02/2014] [Revised: 05/05/2014] [Accepted: 05/06/2014] [Indexed: 11/12/2022] Open
Abstract
Specialization to nectarivory is associated with radiations within different bird groups, including parrots. One of them, the Australasian lories, were shown to be unexpectedly species rich. Their shift to nectarivory may have created an ecological opportunity promoting species proliferation. Several morphological specializations of the feeding tract to nectarivory have been described for parrots. However, they have never been assessed in a quantitative framework considering phylogenetic nonindependence. Using a phylogenetic comparative approach with broad taxon sampling and 15 continuous characters of the digestive tract, we demonstrate that nectarivorous parrots differ in several traits from the remaining parrots. These trait-changes indicate phenotype-environment correlations and parallel evolution, and may reflect adaptations to feed effectively on nectar. Moreover, the diet shift was associated with significant trait shifts at the base of the radiation of the lories, as shown by an alternative statistical approach. Their diet shift might be considered as an evolutionary key innovation which promoted significant non-adaptive lineage diversification through allopatric partitioning of the same new niche. The lack of increased rates of cladogenesis in other nectarivorous parrots indicates that evolutionary innovations need not be associated one-to-one with diversification events.
Affiliation(s)
- Manuel Schweizer
- Naturhistorisches Museum der Burgergemeinde Bern, Bernastrasse 15, CH 3005, Bern, Switzerland
- Marcel Güntert
- Naturhistorisches Museum der Burgergemeinde Bern, Bernastrasse 15, CH 3005, Bern, Switzerland
- Ole Seehausen
- Aquatic Ecology and Macroevolution, Institute of Ecology & Evolution, University of Bern, Baltzerstrasse 6, CH 3012, Bern, Switzerland
- Fish Ecology and Evolution, EAWAG, Seestrasse 79, CH 6047, Kastanienbaum, Switzerland
- Christoph Leuenberger
- Department of Quantitative Economics, University of Fribourg, Boulevard de Pérolles 90, CH 1700, Fribourg, Switzerland
- Stefan T Hertwig
- Naturhistorisches Museum der Burgergemeinde Bern, Bernastrasse 15, CH 3005, Bern, Switzerland
37
|
Rabosky DL, Donnellan SC, Grundler M, Lovette IJ. Analysis and Visualization of Complex Macroevolutionary Dynamics: An Example from Australian Scincid Lizards. Syst Biol 2014; 63:610-27. [DOI: 10.1093/sysbio/syu025] [Citation(s) in RCA: 191] [Impact Index Per Article: 19.1] [Indexed: 11/13/2022] Open
Affiliation(s)
- Daniel L. Rabosky
- 1 Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA; 2 Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; 3 South Australian Museum, North Terrace, Adelaide 5000, Australia; 4 Australian Centre for Evolutionary Biology and Biodiversity, University of Adelaide, Adelaide 5005, Australia; 5 Cornell Lab of Ornithology, Cornell University, Ithaca, New York 14850, USA
- Stephen C. Donnellan
- 1 Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA; 2 Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; 3 South Australian Museum, North Terrace, Adelaide 5000, Australia; 4 Australian Centre for Evolutionary Biology and Biodiversity, University of Adelaide, Adelaide 5005, Australia; 5 Cornell Lab of Ornithology, Cornell University, Ithaca, New York 14850, USA
- Michael Grundler
- 1 Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA; 2 Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; 3 South Australian Museum, North Terrace, Adelaide 5000, Australia; 4 Australian Centre for Evolutionary Biology and Biodiversity, University of Adelaide, Adelaide 5005, Australia; 5 Cornell Lab of Ornithology, Cornell University, Ithaca, New York 14850, USA
- Irby J. Lovette
- 1 Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA; 2 Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; 3 South Australian Museum, North Terrace, Adelaide 5000, Australia; 4 Australian Centre for Evolutionary Biology and Biodiversity, University of Adelaide, Adelaide 5005, Australia; 5 Cornell Lab of Ornithology, Cornell University, Ithaca, New York 14850, USA
38
|
Dos Reis M, Zhu T, Yang Z. The impact of the rate prior on Bayesian estimation of divergence times with multiple loci. Syst Biol 2014; 63:555-65. [PMID: 24658316 PMCID: PMC4055871 DOI: 10.1093/sysbio/syu020] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Indexed: 11/14/2022] Open
Abstract
Bayesian methods provide a powerful way to estimate species divergence times by combining information from molecular sequences with information from the fossil record. With the explosive increase of genomic data, divergence time estimation increasingly uses data of multiple loci (genes or site partitions). Widely used computer programs to estimate divergence times use independent and identically distributed (i.i.d.) priors on the substitution rates for different loci. The i.i.d. prior is problematic. As the number of loci (L) increases, the prior variance of the average rate across all loci goes to zero at the rate 1/L. As a consequence, the rate prior dominates posterior time estimates when many loci are analyzed, and if the rate prior is misspecified, the estimated divergence times will converge to wrong values with very narrow credibility intervals. Here we develop a new prior on the locus rates based on the Dirichlet distribution that corrects the problematic behavior of the i.i.d. prior. We use computer simulation and real data analysis to highlight the differences between the old and new priors. For a dataset for six primate species, we show that with the old i.i.d. prior, if the prior rate is too high (or too low), the estimated divergence times are too young (or too old), outside the bounds imposed by the fossil calibrations. In contrast, with the new Dirichlet prior, posterior time estimates are insensitive to the rate prior and are compatible with the fossil calibrations. We re-analyzed a phylogenomic data set of 36 mammal species and show that using many fossil calibrations can alleviate the adverse impact of a misspecified rate prior to some extent. We recommend the use of the new Dirichlet prior in Bayesian divergence time estimation. [Bayesian inference, divergence time, relaxed clock, rate prior, partition analysis.].
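The 1/L behavior described in this abstract follows from a one-line variance calculation. As a sketch only, assume the L locus rates r_1, ..., r_L are i.i.d. under the prior with common mean μ and finite variance σ² (these symbols are introduced here for illustration and are not taken from the paper):

```latex
\bar{r} = \frac{1}{L}\sum_{i=1}^{L} r_i ,
\qquad
\operatorname{E}(\bar{r}) = \mu ,
\qquad
\operatorname{Var}(\bar{r})
  = \frac{1}{L^{2}}\sum_{i=1}^{L}\operatorname{Var}(r_i)
  = \frac{\sigma^{2}}{L}
  \;\longrightarrow\; 0 \quad \text{as } L \to \infty .
```

Under these assumptions the prior on the average rate concentrates at μ as loci are added, whatever the data say, which is consistent with the abstract's point that a misspecified rate prior can dominate the posterior time estimates when many loci are analyzed.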
Affiliation(s)
- Mario Dos Reis
- Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
- Tianqi Zhu
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
39
|
Rabosky DL. Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PLoS One 2014; 9:e89543. [PMID: 24586858 PMCID: PMC3935878 DOI: 10.1371/journal.pone.0089543] [Citation(s) in RCA: 635] [Impact Index Per Article: 63.5] [Received: 09/08/2013] [Accepted: 01/22/2014] [Indexed: 11/18/2022] Open
Abstract
A number of methods have been developed to infer differential rates of species diversification through time and among clades using time-calibrated phylogenetic trees. However, we lack a general framework that can delineate and quantify heterogeneous mixtures of dynamic processes within single phylogenies. I developed a method that can identify arbitrary numbers of time-varying diversification processes on phylogenies without specifying their locations in advance. The method uses reversible-jump Markov Chain Monte Carlo to move between model subspaces that vary in the number of distinct diversification regimes. The model assumes that changes in evolutionary regimes occur across the branches of phylogenetic trees under a compound Poisson process and explicitly accounts for rate variation through time and among lineages. Using simulated datasets, I demonstrate that the method can be used to quantify complex mixtures of time-dependent, diversity-dependent, and constant-rate diversification processes. I compared the performance of the method to the MEDUSA model of rate variation among lineages. As an empirical example, I analyzed the history of speciation and extinction during the radiation of modern whales. The method described here will greatly facilitate the exploration of macroevolutionary dynamics across large phylogenetic trees, which may have been shaped by heterogeneous mixtures of distinct evolutionary processes.
Affiliation(s)
- Daniel L. Rabosky
- Department of Ecology and Evolutionary Biology and Museum of Zoology, University of Michigan, Ann Arbor, Michigan, United States of America
40
|
Molecular phylogenetics and temporal diversification in the genus Aeromonas based on the sequences of five housekeeping genes. PLoS One 2014; 9:e88805. [PMID: 24586399 PMCID: PMC3930666 DOI: 10.1371/journal.pone.0088805] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 11/15/2013] [Accepted: 01/16/2014] [Indexed: 12/05/2022] Open
Abstract
Several approaches have been developed to estimate both the relative and absolute rates of speciation and extinction within clades based on molecular phylogenetic reconstructions of evolutionary relationships, according to an underlying model of diversification. However, the macroevolutionary models established for eukaryotes have scarcely been used with prokaryotes. We have investigated the rate and pattern of cladogenesis in the genus Aeromonas (γ-Proteobacteria, Proteobacteria, Bacteria) using the sequences of five housekeeping genes and an uncorrelated relaxed-clock approach. To our knowledge, until now this analysis has never been applied to all the species described in a bacterial genus and thus opens up the possibility of establishing models of speciation from sequence data commonly used in phylogenetic studies of prokaryotes. Our results suggest that the genus Aeromonas began to diverge between 248 and 266 million years ago, exhibiting a constant divergence rate through the Phanerozoic, which could be described as a pure birth process.
41
|
Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nat Commun 2013; 4:1958. [PMID: 23739623 DOI: 10.1038/ncomms2958] [Citation(s) in RCA: 376] [Impact Index Per Article: 34.2] [Received: 09/13/2012] [Accepted: 04/29/2013] [Indexed: 11/08/2022] Open
Abstract
Several evolutionary theories predict that rates of morphological change should be positively associated with the rate at which new species arise. For example, the theory of punctuated equilibrium proposes that phenotypic change typically occurs in rapid bursts associated with speciation events. However, recent phylogenetic studies have found little evidence linking these processes in nature. Here we demonstrate that rates of species diversification are highly correlated with the rate of body size evolution across the 30,000+ living species of ray-finned fishes that comprise the majority of vertebrate biological diversity. This coupling is a general feature of fish evolution and transcends vast differences in ecology and body-plan organization. Our results may reflect a widespread speciational mode of character change in living fishes. Alternatively, these findings are consistent with the hypothesis that phenotypic 'evolvability' (the capacity of organisms to evolve) shapes the dynamics of speciation through time at the largest phylogenetic scales.
42
Undersampling taxa will underestimate molecular divergence dates: an example from the South American lizard clade Liolaemini. International Journal of Evolutionary Biology 2013; 2013:628467. [PMID: 24222886 PMCID: PMC3809987 DOI: 10.1155/2013/628467] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 03/11/2013] [Revised: 08/30/2013] [Accepted: 08/31/2013] [Indexed: 11/24/2022]
Abstract
Methods for estimating divergence times from molecular data have improved dramatically over the past decade, yet there are few studies examining alternative taxon sampling effects on node age estimates. Here, I investigate the effect of undersampling species diversity on node ages of the South American lizard clade Liolaemini using several alternative subsampling strategies for both time calibrations and taxa numbers. Penalized likelihood (PL) and Bayesian molecular dating analyses were conducted on a densely sampled (202 taxa) mtDNA-based phylogenetic hypothesis of Iguanidae, including 92 Liolaemini species. Using all calibrations and penalized likelihood, clades with very low taxon sampling had node age estimates younger than clades with more complete taxon sampling. The effect of Bayesian and PL methods differed when either one or two calibrations only were used with dense taxon sampling. Bayesian node ages were always older when fewer calibrations were used, whereas PL node ages were always younger. This work reinforces two important points: (1) whenever possible, authors should strongly consider adding as many taxa as possible, including numerous outgroups, prior to node age estimation to avoid considerable node age underestimation and (2) using more, critically assessed, and accurate fossil calibrations should yield improved divergence time estimates.
43
Wagner PJ, Marcot JD. Modelling distributions of fossil sampling rates over time, space and taxa: assessment and implications for macroevolutionary studies. Methods Ecol Evol 2013. [DOI: 10.1111/2041-210x.12088] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
Affiliation(s)
- Peter J. Wagner
- Department of Paleobiology, National Museum of Natural History, Smithsonian Institution, MRC 121, PO Box 37012, Washington, DC 20013-7012, USA
- Jonathan D. Marcot
- Department of Animal Biology, University of Illinois, 515 Morrill Hall, 505 S. Goodwin Ave., Urbana, IL 61801, USA
44
Sjöstrand J, Arvestad L, Lagergren J, Sennblad B. GenPhyloData: realistic simulation of gene family evolution. BMC Bioinformatics 2013; 14:209. [PMID: 23803001 PMCID: PMC3703295 DOI: 10.1186/1471-2105-14-209] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 04/04/2013] [Accepted: 06/23/2013] [Indexed: 11/10/2022] Open
Abstract
Background: PrIME-GenPhyloData is a suite of tools for creating realistic simulated phylogenetic trees, in particular for families of homologous genes. It supports generation of trees based on a birth-death process and, perhaps more interestingly, also supports generation of gene family trees guided by a known (synthetic or biological) species tree while accounting for events such as gene duplication, gene loss, and lateral gene transfer (LGT). The suite also supports a wide range of branch rate models enabling relaxation of the molecular clock. Results: Simulated data created with PrIME-GenPhyloData can be used for benchmarking phylogenetic approaches, or for characterizing models or model parameters with respect to biological data. Conclusion: The concept of tree-in-tree evolution can also be used to model, for instance, biogeography or host-parasite co-evolution.
Affiliation(s)
- Joel Sjöstrand
- Department of Numerical Analysis and Computer Science, Stockholm University, Stockholm, Sweden
45
Abstract
We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.
Affiliation(s)
- Alexandre Bouchard-Côté
- Department of Statistics, University of British Columbia, Vancouver, BC, Canada V6T 1Z4
- Michael I. Jordan
- Departments of Statistics and Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720
46
Abstract
Background: Reconciliation is the classical method for inferring a duplication and loss history from a set of extant genes. It is based upon the notion of embedding the gene tree into the species tree, the incongruence between the two indicating evidence for duplication and loss. However, results obtained by this method are highly dependent upon the considered species and gene trees. Thus, painstaking attention has been given to the development of methods for reconstructing accurate gene trees. Results: This paper highlights the fact that errors in gene trees are not the only reasons for the inference of an erroneous duplication-loss history. More precisely, we prove that, under certain reasonable hypotheses based on the widely accepted link between function and sequence constraints, even a well-supported gene tree can yield a reconciliation that does not correspond to the true history. We then provide the theoretical underpinnings for a conservative approach to infer histories given such gene trees. We apply our method to the mammalian interleukin-1 (IL) gene tree, which has been used as a model example to illustrate the role of reconciliation.
Affiliation(s)
- Krister M Swenson
- Département d'Informatique, DIRO, Université de Montréal, H3C 3J7 Canada.
47
Ronquist F, Klopfstein S, Vilhelmsen L, Schulmeister S, Murray DL, Rasnitsyn AP. A total-evidence approach to dating with fossils, applied to the early radiation of the Hymenoptera. Syst Biol 2012; 61:973-99. [PMID: 22723471 PMCID: PMC3478566 DOI: 10.1093/sysbio/sys058] [Citation(s) in RCA: 445] [Impact Index Per Article: 37.1] [Received: 09/01/2011] [Revised: 10/19/2011] [Accepted: 06/07/2012] [Indexed: 12/04/2022] Open
Abstract
Phylogenies are usually dated by calibrating interior nodes against the fossil record. This relies on indirect methods that, in the worst case, misrepresent the fossil information. Here, we contrast such node dating with an approach that includes fossils along with the extant taxa in a Bayesian total-evidence analysis. As a test case, we focus on the early radiation of the Hymenoptera, mostly documented by poorly preserved impression fossils that are difficult to place phylogenetically. Specifically, we compare node dating using nine calibration points derived from the fossil record with total-evidence dating based on 343 morphological characters scored for 45 fossil (4–20% complete) and 68 extant taxa. In both cases we use molecular data from seven markers (∼5 kb) for the extant taxa. Because it is difficult to model speciation, extinction, sampling, and fossil preservation realistically, we develop a simple uniform prior for clock trees with fossils, and we use relaxed clock models to accommodate rate variation across the tree. Despite considerable uncertainty in the placement of most fossils, we find that they contribute significantly to the estimation of divergence times in the total-evidence analysis. In particular, the posterior distributions on divergence times are less sensitive to prior assumptions and tend to be more precise than in node dating. The total-evidence analysis also shows that four of the seven Hymenoptera calibration points used in node dating are likely to be based on erroneous or doubtful assumptions about the fossil placement. With respect to the early radiation of Hymenoptera, our results suggest that the crown group dates back to the Carboniferous, ∼309 Ma (95% interval: 291–347 Ma), and diversified into major extant lineages much earlier than previously thought, well before the Triassic. [Bayesian inference; fossil dating; morphological evolution; relaxed clock; statistical phylogenetics.]
Affiliation(s)
- Fredrik Ronquist
- Department of Biodiversity Informatics, Swedish Museum of Natural History, Box 50007, SE-104 05 Stockholm, Sweden.
48
Ponciano JM, Burleigh JG, Braun EL, Taper ML. Assessing parameter identifiability in phylogenetic models using data cloning. Syst Biol 2012; 61:955-72. [PMID: 22649181 PMCID: PMC3478565 DOI: 10.1093/sysbio/sys055] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 11/17/2011] [Revised: 02/02/2012] [Accepted: 05/25/2012] [Indexed: 11/14/2022] Open
Abstract
The success of model-based methods in phylogenetics has motivated much research aimed at generating new, biologically informative models. This new computer-intensive approach to phylogenetics demands validation studies and sound measures of performance. To date there has been little practical guidance available as to when and why the parameters in a particular model can be identified reliably. Here, we illustrate how Data Cloning (DC), a recently developed methodology to compute the maximum likelihood estimates along with their asymptotic variance, can be used to diagnose structural parameter nonidentifiability (NI) and distinguish it from other parameter estimability problems, including when parameters are structurally identifiable, but are not estimable in a given data set (INE), and when parameters are identifiable, and estimable, but only weakly so (WE). The application of the DC theorem uses well-known and widely used Bayesian computational techniques. With the DC approach, practitioners can use Bayesian phylogenetics software to diagnose nonidentifiability. Theoreticians and practitioners alike now have a powerful, yet simple tool to detect nonidentifiability while investigating complex modeling scenarios, where getting closed-form expressions in a probabilistic study is complicated. Furthermore, here we also show how DC can be used as a tool to examine and eliminate the influence of the priors, in particular if the process of prior elicitation is not straightforward. Finally, when applied to phylogenetic inference, DC can be used to study at least two important statistical questions: assessing identifiability of discrete parameters, like the tree topology, and developing efficient sampling methods for computationally expensive posterior densities.
49
Landis MJ, Schraiber JG, Liang M. Phylogenetic analysis using Lévy processes: finding jumps in the evolution of continuous traits. Syst Biol 2012; 62:193-204. [PMID: 23034385 DOI: 10.1093/sysbio/sys086] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Indexed: 11/14/2022] Open
Abstract
Gaussian processes, a class of stochastic processes including Brownian motion and the Ornstein-Uhlenbeck process, are widely used to model continuous trait evolution in statistical phylogenetics. Under such processes, observations at the tips of a phylogenetic tree have a multivariate Gaussian distribution, which may lead to suboptimal model specification under certain evolutionary conditions, as supposed in models of punctuated equilibrium or adaptive radiation. To consider non-normally distributed continuous trait evolution, we introduce a method to compute posterior probabilities when modeling continuous trait evolution as a Lévy process. Through data simulation and model testing, we establish that single-rate Brownian motion (BM) and Lévy processes with jumps generate distinct patterns in comparative data. We then analyzed body mass and endocranial volume measurements for 126 primates. We rejected single-rate BM in favor of a Lévy process with jumps for each trait, with the lineage leading to the most recent common ancestor of great apes showing particularly strong evidence against single-rate BM.
Affiliation(s)
- Michael J Landis
- Department of Integrative Biology, University of California, Berkeley, CA 94720-3140, USA
50
Mulcahy DG, Noonan BP, Moss T, Townsend TM, Reeder TW, Sites JW, Wiens JJ. Estimating divergence dates and evaluating dating methods using phylogenomic and mitochondrial data in squamate reptiles. Mol Phylogenet Evol 2012; 65:974-91. [PMID: 22982760 DOI: 10.1016/j.ympev.2012.08.018] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.8] [Received: 12/26/2011] [Revised: 08/21/2012] [Accepted: 08/22/2012] [Indexed: 11/16/2022]
Abstract
Recently, phylogenetics has expanded to routinely include estimation of clade ages in addition to their relationships. Various dating methods have been used, but their relative performance remains understudied. Here, we generate and assemble an extensive phylogenomic data set for squamate reptiles (lizards and snakes) and evaluate two widely used dating methods, penalized likelihood in r8s (r8s-PL) and Bayesian estimation with uncorrelated relaxed rates among lineages (BEAST). We obtained sequence data from 25 nuclear loci (∼500-1,000 bp per gene; 19,020 bp total) for 64 squamate species and nine outgroup taxa, estimated the phylogeny, and estimated divergence dates using 14 fossil calibrations. We then evaluated how well each method approximated these dates using random subsets of the nuclear loci (2, 5, 10, 15, and 20; replicated 10 times each), and using ∼1 kb of the mitochondrial ND2 gene. We find that estimates from r8s-PL based on 2, 5, or 10 loci can differ considerably from those based on 25 loci (the mean absolute difference between 2-locus and 25-locus estimates was 9.0 Myr). Estimates from BEAST are somewhat more consistent given limited sampling of loci (the mean absolute difference between 2-locus and 25-locus estimates was 5.0 Myr). Most strikingly, age estimates using r8s-PL for ND2 were ∼68-82 Myr older (mean = 73.1) than those using 25 nuclear loci with r8s-PL. These results show that dates from r8s-PL with a limited number of loci (and especially mitochondrial data) can differ considerably from estimates derived from a large number of nuclear loci, whereas estimates from BEAST derived from fewer nuclear loci or mitochondrial data alone can be surprisingly similar to those from many nuclear loci. However, estimates from BEAST using relatively few loci and mitochondrial data could still show substantial deviations from the full data set (>50 Myr), suggesting the benefits of sampling many nuclear loci. Finally, we found that confidence intervals on ages from BEAST were not significantly different when sampling 2 vs. 25 loci, suggesting that adding loci decreased errors but did not increase confidence in those estimates.
Affiliation(s)
- Daniel G Mulcahy
- Department of Biology, Brigham Young University, Provo, UT 84602, USA.