1. Sianga-Mete R, Hartnady P, Mandikumba WC, Rutherford K, Currin CB, Phelanyane F, Stefan S, Kosakovsky Pond SL, Martin DP. Viral genome sequence datasets display pervasive evidence of strand-specific substitution biases that are best described using non-reversible nucleotide substitution models. Research Square 2022:rs.3.rs-2407778. [PMID: 36597548 PMCID: PMC9810213 DOI: 10.21203/rs.3.rs-2407778/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/31/2022]
Abstract
Background: The vast majority of phylogenetic trees are inferred from molecular sequence data (nucleotides or amino acids) using time-reversible evolutionary models which assume that, for any pair of nucleotide or amino acid characters, the relative rate of X to Y substitution is the same as the relative rate of Y to X substitution. However, this reversibility assumption is unlikely to accurately reflect the actual underlying biochemical and/or evolutionary processes that lead to the fixation of substitutions. Here, we use empirical viral genome sequence data to reveal that evolutionary non-reversibility is pervasive among most groups of viruses. Specifically, we consider two non-reversible nucleotide substitution models: (1) a 6-rate non-reversible model (NREV6) in which Watson-Crick complementary substitutions occur at identical relative rates and which might therefore be most applicable to analyzing the evolution of genomes where both complementary strands are subject to the same mutational processes (such as might be expected for double-stranded (ds) RNA or dsDNA genomes); and (2) a 12-rate non-reversible model (NREV12) in which all relative substitution types are free to occur at different rates and which might therefore be applicable to analyzing the evolution of genomes where the complementary genome strands are subject to different mutational processes (such as might be expected for viruses with single-stranded (ss) RNA or ssDNA genomes).
Results: Using likelihood ratio and Akaike Information Criterion-based model tests, we show that, surprisingly, NREV12 provided a significantly better fit to 21/31 dsRNA and 20/30 dsDNA datasets than did the general time reversible (GTR) and NREV6 models, with NREV6 providing a better fit than NREV12 and GTR in only 5/30 dsDNA and 2/31 dsRNA datasets. As expected, NREV12 provided a significantly better fit to 24/33 ssDNA and 40/47 ssRNA datasets. Next, we used simulations to show that increasing degrees of strand-specific substitution bias decrease the accuracy of phylogenetic inference irrespective of whether GTR or NREV12 is used to describe mutational processes. However, in cases where strand-specific substitution biases are extreme (such as in SARS-CoV-2 and Torque teno sus virus datasets), NREV12 tends to yield more accurate phylogenetic trees than those obtained using GTR.
Conclusion: We show that NREV12 should be seriously considered during the model selection phase of phylogenetic analyses involving viral genomic sequences.
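As a rough illustration of the distinction the abstract above draws, the following Python sketch builds two 4x4 nucleotide rate matrices and tests them for time-reversibility. It assumes the standard detailed-balance definition of reversibility (pi_x * q_xy = pi_y * q_yx at stationarity), which is not spelled out in the abstract. All rate values, base frequencies, and function names are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the authors' implementation.

```python
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def rate_matrix(rates):
    """Build a 4x4 rate matrix from {(from_base, to_base): rate}; rows sum to zero."""
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                q[i][j] = rates[(x, y)]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 here, so this is minus the exit rate
    return q


def stationary(q, iters=20000):
    """Approximate the stationary distribution by power iteration on P = I + Q/lam."""
    lam = 2.0 * max(-q[i][i] for i in range(4))  # keeps every diagonal of P >= 0.5
    p = [[(1.0 if i == j else 0.0) + q[i][j] / lam for j in range(4)]
         for i in range(4)]
    pi = [0.25] * 4
    for _ in range(iters):
        pi = [sum(pi[i] * p[i][j] for i in range(4)) for j in range(4)]
    return pi


def is_reversible(q, tol=1e-9):
    """Detailed balance: pi_i * q_ij == pi_j * q_ji for every pair of bases."""
    pi = stationary(q)
    return all(abs(pi[i] * q[i][j] - pi[j] * q[j][i]) < tol
               for i in range(4) for j in range(i + 1, 4))


def is_strand_symmetric(rates, tol=1e-12):
    """True if each substitution has the same rate as its Watson-Crick complement."""
    return all(abs(r - rates[(COMPLEMENT[x], COMPLEMENT[y])]) < tol
               for (x, y), r in rates.items())


# Hypothetical GTR-style model: symmetric exchangeabilities times target frequencies.
freqs = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}
exch = {"AC": 1.0, "AG": 2.0, "AT": 0.5, "CG": 0.7, "CT": 4.0, "GT": 1.5}
gtr_rates = {(x, y): exch["".join(sorted(x + y))] * freqs[y]
             for x in BASES for y in BASES if x != y}

# Hypothetical NREV6-style model: six free rates, each shared by a complementary pair.
nrev6_rates = {
    ("A", "C"): 1.0, ("T", "G"): 1.0,
    ("A", "G"): 2.0, ("T", "C"): 2.0,
    ("A", "T"): 0.5, ("T", "A"): 0.5,
    ("C", "A"): 3.0, ("G", "T"): 3.0,
    ("C", "G"): 0.7, ("G", "C"): 0.7,
    ("C", "T"): 4.0, ("G", "A"): 4.0,
}

print(is_reversible(rate_matrix(gtr_rates)))    # True
print(is_strand_symmetric(nrev6_rates))         # True
print(is_reversible(rate_matrix(nrev6_rates)))  # False
```

In this toy parameterization the strand-symmetric matrix fails detailed balance, which is the sense in which a 6-rate model can be non-reversible while still treating both strands identically. An NREV12-style model would simply drop the complementary-pair constraint and allow twelve independent off-diagonal rates.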
2. Dang CC, Minh BQ, McShea H, Masel J, James JE, Vinh LS, Lanfear R. OUP accepted manuscript. Syst Biol 2022; 71:1110-1123. [PMID: 35139203 PMCID: PMC9366462 DOI: 10.1093/sysbio/syac007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/14/2021] [Accepted: 01/30/2022] [Indexed: 11/12/2022]
Affiliation(s)
- Cuong Cao Dang
  - Faculty of Information Technology, University of Engineering and Technology, Vietnam National University, 144 Xuan Thuy, Cau Giay, Hanoi 10000, Vietnam
- Bui Quang Minh
  - Computational Phylogenomics Lab, School of Computing, Australian National University, Canberra, Australian Capital Territory 2601, Australia
- Hanon McShea
  - Department of Earth System Science, School of Earth, Energy, and Environmental Sciences, Stanford University, Palo Alto, CA 94305, USA
- Joanna Masel
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
- Jennifer Eleanor James
  - Department of Ecology and Genetics, Plant Ecology and Evolution, Evolutionary Biology Center, Uppsala University, Uppsala, SE-752 36, Sweden
- Le Sy Vinh
  - Correspondence to be sent to: Faculty of Information Technology, University of Engineering and Technology, Vietnam National University, Hanoi, 144 Xuan Thuy, Cau Giay, Hanoi 10000, Vietnam. Cuong Cao Dang and Bui Quang Minh contributed equally to the work.
- Robert Lanfear
  - Division of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
3. Naser-Khdour S, Minh BQ, Lanfear R. Assessing Confidence in Root Placement on Phylogenies: An Empirical Study Using Non-Reversible Models for Mammals. Syst Biol 2021; 71:959-972. [PMID: 34387349 PMCID: PMC9260635 DOI: 10.1093/sysbio/syab067] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/30/2020] [Revised: 08/03/2021] [Accepted: 08/11/2021] [Indexed: 11/14/2022]
Abstract
Using time-reversible Markov models is a very common practice in phylogenetic analysis, because although we expect many of their assumptions to be violated by empirical data, they provide high computational efficiency. However, these models lack the ability to infer the root placement of the estimated phylogeny. In order to compensate for the inability of these models to root the tree, many researchers use external information such as using outgroup taxa or additional assumptions such as molecular clocks. In this study, we investigate the utility of nonreversible models to root empirical phylogenies and introduce a new bootstrap measure, the rootstrap, which provides information on the statistical support for any given root position. [Bootstrap; nonreversible models; phylogenetic inference; root estimation.]
Affiliation(s)
- Suha Naser-Khdour
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Bui Quang Minh
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
  - Research School of Computer Science, Australian National University, Canberra, Australian Capital Territory, Australia
- Robert Lanfear
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
4. Hannaford NE, Heaps SE, Nye TMW, Williams TA, Embley TM. Incorporating compositional heterogeneity into Lie Markov models for phylogenetic inference. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
5. Jermiin LS, Catullo RA, Holland BR. A new phylogenetic protocol: dealing with model misspecification and confirmation bias in molecular phylogenetics. NAR Genom Bioinform 2020; 2:lqaa041. [PMID: 33575594 PMCID: PMC7671319 DOI: 10.1093/nargab/lqaa041] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/17/2020] [Revised: 05/18/2020] [Accepted: 06/04/2020] [Indexed: 12/15/2022]
Abstract
Molecular phylogenetics plays a key role in comparative genomics and has increasingly significant impacts on science, industry, government, public health and society. In this paper, we posit that the current phylogenetic protocol is missing two critical steps, and that their absence allows model misspecification and confirmation bias to unduly influence phylogenetic estimates. Based on the potential offered by well-established but under-used procedures, such as assessment of phylogenetic assumptions and tests of goodness of fit, we introduce a new phylogenetic protocol that will reduce confirmation bias and increase the accuracy of phylogenetic estimates.
Affiliation(s)
- Lars S Jermiin
  - CSIRO Land & Water, Canberra, ACT 2601, Australia
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
  - School of Biology & Environment Science, University College Dublin, Belfield, Dublin 4, Ireland
  - Earth Institute, University College Dublin, Belfield, Dublin 4, Ireland
- Renee A Catullo
  - CSIRO Land & Water, Canberra, ACT 2601, Australia
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
  - School of Science and Health & Hawkesbury Institute of the Environment, Western Sydney University, Penrith, NSW 2751, Australia
- Barbara R Holland
  - School of Natural Sciences, University of Tasmania, Hobart, TAS 7001, Australia
6. Zanin M, Rodríguez-González A, Menasalvas Ruiz E, Papo D. Assessing Time Series Reversibility through Permutation Patterns. Entropy (Basel, Switzerland) 2018; 20:e20090665. [PMID: 33265754 PMCID: PMC7513188 DOI: 10.3390/e20090665] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/03/2018] [Revised: 08/29/2018] [Accepted: 08/31/2018] [Indexed: 11/16/2022]
Abstract
Time irreversibility, i.e., the lack of invariance of the statistical properties of a system under time reversal, is a fundamental property of all systems operating out of equilibrium. Time reversal symmetry is associated with important statistical and physical properties and is related to the predictability of the system generating the time series. Over the past fifteen years, various methods to quantify time irreversibility in time series have been proposed, but these can be computationally expensive. Here, we propose a new method, based on permutation entropy, which is essentially parameter-free, temporally local, yields straightforward statistical tests, and has fast convergence properties. We apply this method to the study of financial time series, showing that stocks and indices present a rich irreversibility dynamics. We illustrate the comparative methodological advantages of our method with respect to a recently proposed method based on visibility graphs, and discuss the implications of our results for financial data analysis and interpretation.
Affiliation(s)
- Massimiliano Zanin
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, 28223 Pozuelo de Alarcón, 28040 Madrid, Spain
  - Department of Computer Science, Faculty of Science and Technology, Universidade Nova de Lisboa, 2829-516 Lisboa, Portugal
  - Correspondence: Tel.: +34-91-336-4632
- Alejandro Rodríguez-González
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, 28223 Pozuelo de Alarcón, 28040 Madrid, Spain
- Ernestina Menasalvas Ruiz
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, 28223 Pozuelo de Alarcón, 28040 Madrid, Spain
- David Papo
  - SCALab UMR CNRS 9193, University of Lille, 59800 Villeneuve d’Ascq, France
7. Cherlin S, Heaps SE, Nye TMW, Boys RJ, Williams TA, Embley TM. The Effect of Nonreversibility on Inferring Rooted Phylogenies. Mol Biol Evol 2018; 35:984-1002. [PMID: 29149300 PMCID: PMC5889004 DOI: 10.1093/molbev/msx294] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/15/2022]
Abstract
Most phylogenetic models assume that the evolutionary process is stationary and reversible. In addition to being biologically improbable, these assumptions also impair inference by generating models under which the likelihood does not depend on the position of the root. Consequently, the root of the tree cannot be inferred as part of the analysis. Yet identifying the root position is a key component of phylogenetic inference because it provides a point of reference for polarizing ancestor-descendant relationships and therefore interpreting the tree. In this paper, we investigate the effect of relaxing the unrealistic reversibility assumption and allowing the position of the root to be another unknown. We propose two hierarchical models that are centered on a reversible model but perturbed to allow nonreversibility. The models differ in the degree of structure imposed on the perturbations. The analysis is performed in the Bayesian framework using Markov chain Monte Carlo methods for which software is provided. We illustrate the performance of the two nonreversible models in analyses of simulated data using two types of topological priors. We then apply the models to a real biological data set, the radiation of polyploid yeasts, for which there is robust biological opinion about the root position. Finally, we apply the models to a second biological alignment for which the rooted tree is controversial: the ribosomal tree of life. We compare the two nonreversible models and conclude that both are useful in inferring the position of the root from real biological data.
Affiliation(s)
- Svetlana Cherlin
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
- Sarah E Heaps
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Tom M W Nye
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Richard J Boys
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- T Martin Embley
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom
8. Kaehler BD. Full reconstruction of non-stationary strand-symmetric models on rooted phylogenies. J Theor Biol 2017; 420:144-151. [PMID: 28286217 DOI: 10.1016/j.jtbi.2017.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/14/2016] [Revised: 03/06/2017] [Accepted: 03/08/2017] [Indexed: 10/20/2022]
Abstract
Understanding the evolutionary relationship among species is of fundamental importance to the biological sciences. The location of the root in any phylogenetic tree is critical as it gives an order to evolutionary events. None of the popular models of nucleotide evolution currently used in likelihood or Bayesian methods are able to infer the location of the root without exogenous information. It is known that the most general Markov models of nucleotide substitution also cannot identify the location of the root or be fitted to multiple sequence alignments with fewer than three sequences. We prove that the location of the root and the full model can be identified and statistically consistently estimated for a non-stationary, strand-symmetric substitution model given a multiple sequence alignment with two or more sequences. We also generalise earlier work to provide a practical means of overcoming the computationally intractable problem of labelling hidden states in a phylogenetic model.
Affiliation(s)
- Benjamin D Kaehler
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia.
9. Williams TA, Heaps SE, Cherlin S, Nye TMW, Boys RJ, Embley TM. New substitution models for rooting phylogenetic trees. Philos Trans R Soc Lond B Biol Sci 2015; 370:20140336. [PMID: 26323766 PMCID: PMC4571574 DOI: 10.1098/rstb.2014.0336] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Accepted: 06/04/2015] [Indexed: 12/23/2022]
Abstract
The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made.
Affiliation(s)
- Tom A Williams
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Sarah E Heaps
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Svetlana Cherlin
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Tom M W Nye
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Richard J Boys
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- T Martin Embley
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
10. De Maio N, Schlötterer C, Kosiol C. Linking great apes genome evolution across time scales using polymorphism-aware phylogenetic models. Mol Biol Evol 2013; 30:2249-62. [PMID: 23906727 PMCID: PMC3773373 DOI: 10.1093/molbev/mst131] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 12/17/2022]
Abstract
The genomes of related species contain valuable information on the history of the considered taxa. Great apes in particular exhibit variation of evolutionary patterns along their genomes. However, the great ape data also bring new challenges, such as the presence of incomplete lineage sorting and ancestral shared polymorphisms. Previous methods for genome-scale analysis are restricted to very few individuals or cannot disentangle the contribution of mutation rates and fixation biases. This represents a limitation both for the understanding of these forces as well as for the detection of regions affected by selection. Here, we present a new model designed to estimate mutation rates and fixation biases from genetic variation within and between species. We relax the assumption of instantaneous substitutions, modeling substitutions as mutational events followed by a gradual fixation. Hence, we straightforwardly account for shared ancestral polymorphisms and incomplete lineage sorting. We analyze genome-wide synonymous site alignments of human, chimpanzee, and two orangutan species. From each taxon, we include data from several individuals. We estimate mutation rates and GC-biased gene conversion intensity. We find that both mutation rates and biased gene conversion vary with GC content. We also find lineage-specific differences, with weaker fixation biases in orangutan species, suggesting a reduced historical effective population size. Finally, our results are consistent with directional selection acting on coding sequences in relation to exonic splicing enhancers.
Affiliation(s)
- Nicola De Maio
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
11. McCandlish DM. On the findability of genotypes. Evolution 2013; 67:2592-603. [PMID: 24033169 DOI: 10.1111/evo.12128] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 09/19/2012] [Accepted: 03/14/2013] [Indexed: 02/02/2023]
Abstract
Can we define a measure that describes how easy or difficult it is for a population to evolve to a specific genotype? For populations evolving under weak mutation on a time-invariant fitness landscape, I argue that one appropriate measure is the expected waiting time, starting from equilibrium, for a population to become fixed for a given genotype. Under this definition for the "findability" of genotypes, I show that for any pair of genotypes (1) a population at equilibrium is always more likely to fix at the more findable before the less findable genotype and (2) the expected time to evolve from the more findable to the less findable genotype is always greater than the expected time to evolve in the opposite direction. Although increasing the fitness of a genotype always increases its findability, in general there is no simple relationship between the rank ordering of genotypes by fitness and the rank ordering of genotypes by findability. I also present a method for quantifying the relative contributions of mutation, selection, substitution rate, and probability of reversion to a genotype's findability.
Affiliation(s)
- David M McCandlish
  - Biology Department, Duke University, Box 90338, Durham, North Carolina, 27708; Current Address: Lynch Labs, Room 204K, Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, 19104.
12. Altenberg L. The evolution of dispersal in random environments and the principle of partial control. Ecol Monogr 2012. [DOI: 10.1890/11-1136.1] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
13. An Evolutionary Reduction Principle for Mutation Rates at Multiple Loci. Bull Math Biol 2010; 73:1227-70. [DOI: 10.1007/s11538-010-9557-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/13/2009] [Accepted: 06/04/2010] [Indexed: 01/07/2023]
14. Polak P, Arndt PF. Long-range bidirectional strand asymmetries originate at CpG islands in the human genome. Genome Biol Evol 2009; 1:189-97. [PMID: 20333189 PMCID: PMC2817419 DOI: 10.1093/gbe/evp024] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Accepted: 07/22/2009] [Indexed: 12/24/2022]
Abstract
In the human genome, CpG islands (CGIs), which are GC- and CpG-rich sequences, are associated with transcription starting sites (TSSs); in addition, there is evidence that CGIs harbor origins of bidirectional replication (OBRs) and are preferred sites for heteroduplex formation during recombination. Transcription, replication, and recombination processes are known to induce specific mutational patterns in various genomes, and therefore, these patterns are expected to be found around CGIs. We use triple alignments of human, chimp, and macaque to compute the rates of nucleotide substitutions in up to 1 Mbps long intergenic regions on both sides of CGIs. Our analysis revealed that around a CGI there is an asymmetry between complementary substitution rates that is similar to the one found around the OBR in bacteria. We hypothesize that these asymmetries are induced by differences in the replication of the leading and lagging strand and that a significant number of CGIs overlap OBRs. Within CGIs, we observed a mutational signature of GC-biased gene conversion that is associated with recombination. We suggest that recombination has played a major role in the creation of CGIs.
Affiliation(s)
- Paz Polak
  - Max Planck Institute for Molecular Genetics, Berlin, Germany.
15. Schneider A, Cannarozzi GM. Support patterns from different outgroups provide a strong phylogenetic signal. Mol Biol Evol 2009; 26:1259-72. [PMID: 19240194 DOI: 10.1093/molbev/msp034] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
It is known that the accuracy of phylogenetic reconstruction decreases when more distant outgroups are used. We quantify this phenomenon with a novel scoring method, the outgroup score pOG. This score expresses if the support for a particular branch of a tree decreases with increasingly distant outgroups. Large-scale simulations confirmed that the outgroup support follows this expectation and that the pOG score captures this pattern. The score often identifies the correct topology even when the primary reconstruction methods fail, particularly in the presence of model violations. In simulations of problematic phylogenetic scenarios such as rate variation among lineages (which can lead to long-branch attraction artifacts) and quartet-based reconstruction, the pOG analysis outperformed the primary reconstruction methods. Because the pOG method does not make any assumptions about the evolutionary model (besides the decreasing support from increasingly distant outgroups), it can detect cases of violations not treated by a specific model or too strong to be fully corrected. When used as an optimization criterion in the construction of a tree of 23 mammals, the outgroup signal confirmed many well-accepted mammalian orders and superorders. It supports Atlantogenata, a clade of Afrotheria and Xenarthra, and suggests an Artiodactyla-Chiroptera clade.
Affiliation(s)
- Adrian Schneider
  - ETH Zurich, Department of Computer Science, Zurich, Switzerland.