1
Gupta A, Mirarab S, Turakhia Y. Accurate, scalable, and fully automated inference of species trees from raw genome assemblies using ROADIES. Proc Natl Acad Sci U S A 2025; 122:e2500553122. [PMID: 40314967 PMCID: PMC12088440 DOI: 10.1073/pnas.2500553122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2025] [Accepted: 03/31/2025] [Indexed: 05/03/2025]
Abstract
Current genome sequencing initiatives across a wide range of life forms offer significant potential to enhance our understanding of evolutionary relationships and support transformative biological and medical applications. Species trees play a central role in many of these applications; however, despite the widespread availability of genome assemblies, accurate inference of species trees remains challenging due to the limited automation, substantial domain expertise, and computational resources required by conventional methods. To address this limitation, we present ROADIES, a fully automated pipeline to infer species trees starting from raw genome assemblies. In contrast to the prominent approach, ROADIES incorporates a unique strategy of randomly sampling segments of the input genomes to generate gene trees. This eliminates the need for predefining a set of loci, limiting the analyses to a fixed number of genes, and performing the cumbersome gene annotation and/or whole genome alignment steps. ROADIES also eliminates the need to infer orthology by leveraging existing discordance-aware methods that allow multicopy genes. Using the genomic datasets from large-scale sequencing efforts across four diverse life forms (placental mammals, pomace flies, birds, and budding yeasts), we show that ROADIES infers species trees that are comparable in quality to the state-of-the-art studies but in a fraction of the time and effort, including on challenging datasets with rampant gene tree discordance and complex polyploidy. With its speed, accuracy, and automation, ROADIES has the potential to vastly simplify species tree inference, making it accessible to a broader range of scientists and applications.
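The random-sampling idea in this abstract can be illustrated with a minimal sketch. It is not the ROADIES implementation: the function name `sample_segments`, the dictionary input format, and the choice to pick each genome with equal probability are all assumptions made for illustration.

```python
import random


def sample_segments(genomes, num_segments, segment_length, seed=0):
    """Draw fixed-length segments at random from a set of genomes.

    genomes: dict mapping a genome name to its sequence string
             (a hypothetical input format, not the pipeline's own).
    Returns a list of (genome_name, start, subsequence) tuples.
    """
    rng = random.Random(seed)
    # Only genomes long enough to hold one full segment can be sampled.
    eligible = [name for name, seq in genomes.items() if len(seq) >= segment_length]
    if not eligible:
        raise ValueError("no genome is long enough for the requested segment length")
    segments = []
    for _ in range(num_segments):
        # Assumption: every eligible genome is equally likely to be chosen.
        name = rng.choice(eligible)
        seq = genomes[name]
        start = rng.randint(0, len(seq) - segment_length)  # both bounds inclusive
        segments.append((name, start, seq[start:start + segment_length]))
    return segments
```

Each sampled segment would then serve as a candidate locus for building one gene tree; the downstream steps described in the abstract are outside the scope of this sketch.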
Affiliation(s)
- Anshu Gupta
  - Department of Computer Science and Engineering, University of California, San Diego, CA 92093
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, University of California, San Diego, CA 92093
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California, San Diego, CA 92093
2
Bernstein JM, Francioli YZ, Schield DR, Adams RH, Perry BW, Farleigh K, Smith CF, Meik JM, Mackessy SP, Castoe TA. Disentangling a genome-wide mosaic of conflicting phylogenetic signals in Western Rattlesnakes. Mol Phylogenet Evol 2025; 206:108309. [PMID: 39938672 DOI: 10.1016/j.ympev.2025.108309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Revised: 02/04/2025] [Accepted: 02/08/2025] [Indexed: 02/14/2025]
Abstract
Species tree inference is often assumed to be more accurate as datasets increase in size, with whole genomes representing the best-case-scenario for estimating a single, most-likely speciation history with high confidence. However, genomes may harbor a complex mixture of evolutionary histories among loci, which amplifies the opportunity for model misspecification and impacts phylogenetic inference. Accordingly, multiple distinct and well-supported phylogenetic trees are often recovered from genome-scale data, and approaches for biologically interpreting these distinct signatures are a major challenge for evolutionary biology in the age of genomics. Here, we analyze 32 whole genomes of nine taxa and two outgroups from the Western Rattlesnake species complex. Using concordance factors, topology weighting, and concatenated and species tree analyses with a chromosome-level reference genome, we characterize the distribution of phylogenetic signal across the genomic landscape. We find that concatenated and species tree analyses of autosomes, the Z (sex) chromosome, and mitochondrial genome yield distinct, yet strongly supported phylogenies. Analyses of site-specific likelihoods show additional patterns consistent with rampant model misspecification, a likely consequence of several evolutionary processes. Together, our results suggest that a combination of historic and recent introgression, along with natural selection, recombination rate variation, and cytonuclear co-evolution of nuclear-encoded mitochondrial genes, underlie genome-wide variation in phylogenetic signal. Our results highlight both the power and complexity of interpreting whole genomes in a phylogenetic context and illustrate how patterns of phylogenetic discordance can reveal the impacts of different evolutionary processes that contribute to genome-wide variation in phylogenetic signal.
Affiliation(s)
- Justin M Bernstein
  - Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
- Yannick Z Francioli
  - Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
- Drew R Schield
  - Department of Biology, University of Virginia, Charlottesville, VA 22903, USA
- Richard H Adams
  - Department of Entomology and Plant Pathology, University of Arkansas Agricultural Experimental Station, University of Arkansas, Fayetteville, AR 72701, USA
- Blair W Perry
  - Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Keaka Farleigh
  - Department of Biology, University of Virginia, Charlottesville, VA 22903, USA
- Cara F Smith
  - Department of Biochemistry and Molecular Genetics, 12801 East 17th Avenue, University of Colorado Denver, Aurora, CO 80045, USA
- Jesse M Meik
  - Department of Biological Sciences, Tarleton State University, Stephenville, TX 76402, USA
- Stephen P Mackessy
  - School of Biological Sciences, University of Northern Colorado, Greeley, CO 80639, USA
- Todd A Castoe
  - Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
3
Höhna S, Lower SE, Duchen P, Catalán A. Robustness of divergence time estimation despite gene tree estimation error: a case study of fireflies (Coleoptera: Lampyridae). Syst Biol 2025; 74:335-348. [PMID: 39534920 DOI: 10.1093/sysbio/syae065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Revised: 08/28/2024] [Accepted: 11/04/2024] [Indexed: 11/16/2024]
Abstract
Genomic data have become ubiquitous in phylogenomic studies, including divergence time estimation, but provide new challenges. These challenges include, among others, biological gene tree discordance, methodological gene tree estimation error, and computational limitations on performing full Bayesian inference under complex models. In this study, we use a recently published firefly (Coleoptera: Lampyridae) anchored hybrid enrichment data set (AHE; 436 loci for 88 Lampyridae species and 10 outgroup species) as a case study to explore gene tree estimation error and the robustness of divergence time estimation. First, we explored the amount of model violation using posterior predictive simulations because model violations are likely to bias phylogenetic inferences and produce gene tree estimation error. We specifically focused on missing data (either uniformly distributed or systematically) and the distribution of highly variable and conserved sites (either uniformly distributed or clustered). Our assessment of model adequacy showed that standard phylogenetic substitution models are not adequate for any of the 436 AHE loci. We tested if the model violations and alignment errors resulted indeed in gene tree estimation error by comparing the observed gene tree discordance to simulated gene tree discordance under the multispecies coalescent model. Thus, we show that the inferred gene tree discordance is not only due to biological mechanism but primarily due to inference errors. Lastly, we explored if divergence time estimation is robust despite the observed gene tree estimation error. We selected four subsets of the full AHE data set, concatenated each subset and performed a Bayesian relaxed clock divergence estimation in RevBayes. The estimated divergence times overlapped for all nodes that are shared between the topologies. Thus, divergence time estimation is robust using any well selected data subset as long as the topology inference is robust.
Affiliation(s)
- Sebastian Höhna
  - GeoBio-Center, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
  - Department of Earth and Environmental Sciences, Paleontology & Geobiology, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
- Sarah E Lower
  - Department of Biology, Bucknell University, Lewisburg, PA 17837, United States
- Pablo Duchen
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg Universität Mainz, 55128 Mainz, Germany
- Ana Catalán
  - GeoBio-Center, Ludwig-Maximilians-Universität München, 80333 Munich, Germany
  - Division of Evolutionary Biology, Ludwig-Maximilians-Universität München, 82152 Planegg-Martinsried, Germany
4
Zhang C, Nielsen R, Mirarab S. CASTER: Direct species tree inference from whole-genome alignments. Science 2025; 387:eadk9688. [PMID: 39847611 PMCID: PMC12038793 DOI: 10.1126/science.adk9688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 08/05/2024] [Accepted: 12/04/2024] [Indexed: 01/25/2025]
Abstract
Genomes contain mosaics of discordant evolutionary histories, challenging the accurate inference of the tree of life. Although genome-wide data are routinely used for discordance-aware phylogenomic analyses, because of modeling and scalability limitations, the current practice leaves out large chunks of genomes. As more high-quality genomes become available, we urgently need discordance-aware methods to infer the tree directly from a multiple genome alignment. In this study, we introduce Coalescence-Aware Alignment-Based Species Tree Estimator (CASTER), a theoretically justified site-based method that eliminates the need to predefine recombination-free loci. CASTER is scalable to hundreds of mammalian whole genomes. We demonstrate the accuracy and scalability of CASTER in simulations that include recombination and apply CASTER to several biological datasets, showing that its per-site scores can reveal both biological and artifactual patterns of discordance across the genome.
Affiliation(s)
- Chao Zhang
  - Bioinformatics and Systems Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, 92093, CA, USA
  - Integrative Biology Department, University of California Berkeley, 110 Sproul Hall, Berkeley, 94704, CA, USA
  - Globe Institute, University of Copenhagen, Øster Voldgade 5-7, Copenhagen, 1350, Denmark
- Rasmus Nielsen
  - Integrative Biology Department, University of California Berkeley, 110 Sproul Hall, Berkeley, 94704, CA, USA
  - Globe Institute, University of Copenhagen, Øster Voldgade 5-7, Copenhagen, 1350, Denmark
- Siavash Mirarab
  - Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, 92093, CA, USA
5
Li Z, Zhang F. Comparative mitogenomics of Cheiracanthium species (Araneae: Cheiracanthiidae) with phylogenetic implication and evolutionary insights. PeerJ 2025; 13:e18314. [PMID: 39963199 PMCID: PMC11831973 DOI: 10.7717/peerj.18314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 09/24/2024] [Indexed: 02/20/2025]
Abstract
The genus Cheiracanthium C. L. Koch, 1839 is the most species-rich genus of the family Cheiracanthiidae. Given the unavailability of information on the evolutionary biology and molecular taxonomy of this genus, here we sequenced nine mitochondrial genomes (mitogenomes) of Cheiracanthium species, four of which were fully annotated, and conducted comparative analyses with other well-characterized Araneae mitogenomes. We also provide phylogenetic insights on the genus Cheiracanthium. The circular mitogenomes of the Cheiracanthium contain 37 genes, including 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), two ribosomal RNA genes (rRNAs) and one putative control region (CR). All genes show a high A+T bias, characterized by a negative AT skew and positive GC skew, along with numerous overlapped regions and intergenic spacers. Approximately half of the tRNAs lack TΨC and/or dihydrouracil (DHU) arm and are characterized with unpaired amino acid acceptor arms. Most PCGs used the standard ATN start codons and TAR termination codons. The mitochondrial gene order of Cheiracanthium differs significantly from the putative ancestral gene order (Limulus polyphemus). Our novel phylogenetic analyses infer Cheiracanthiidae to be the sister group of Salticidae in BI analysis, but as sister to the node with Miturgidae, Viridasiidae, Corinnidae, Selenopidae, Salticidae, and Philodromidae in ML analysis. We confirm that Cheiracanthium is paraphyletic, for the first time using molecular phylogenetic approaches, with the earliest divergence estimated at 67 Ma. Our findings enhance our understanding of Cheiracanthium taxonomy and evolution.
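The abstract reports AT and GC skews without stating how they are computed. A minimal sketch, assuming the conventional definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) on a single strand, is given below; the function name `nucleotide_skews` is hypothetical and the authors' exact procedure may differ.

```python
def nucleotide_skews(sequence):
    """Return (AT skew, GC skew, A+T content) for one DNA strand.

    Assumes the conventional definitions:
      AT skew = (A - T) / (A + T)
      GC skew = (G - C) / (G + C)
    Characters other than A, T, G, C (for example N) are ignored.
    """
    seq = sequence.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    total = a + t + g + c
    # Guard each ratio against an empty denominator.
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    at_content = (a + t) / total if total else 0.0
    return at_skew, gc_skew, at_content
```

Under these definitions, a negative AT skew means T outnumbers A on the analysed strand, and a positive GC skew means G outnumbers C, which is the pattern the abstract describes.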
Affiliation(s)
- Zhaoyi Li
  - Key Laboratory of Zoological Systematics and Application of Hebei Province, College of Life Sciences, Hebei University, Baoding, Hebei, China
- Feng Zhang
  - Key Laboratory of Zoological Systematics and Application of Hebei Province, College of Life Sciences, Hebei University, Baoding, Hebei, China
  - Hebei Basic Science Center for Biotic Interaction, Hebei University, Baoding, Hebei, China
6
Thomas GWC, Gemmell P, Shakya SB, Hu Z, Liu JS, Sackton TB, Edwards SV. Practical Guidance and Workflows for Identifying Fast Evolving Non-Coding Genomic Elements Using PhyloAcc. Integr Comp Biol 2024; 64:1513-1525. [PMID: 38816211 PMCID: PMC11579529 DOI: 10.1093/icb/icae056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 05/13/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024]
Abstract
Comparative genomics provides ample ways to study genome evolution and its relationship to phenotypic traits. By developing and testing alternate models of evolution throughout a phylogeny, one can estimate rates of molecular evolution along different lineages in a phylogeny and link these rates with observations in extant species, such as convergent phenotypes. Pipelines for such work can help identify when and where genomic changes may be associated with, or possibly influence, phenotypic traits. We recently developed a set of models called PhyloAcc, using a Bayesian framework to estimate rates of nucleotide substitution on different branches of a phylogenetic tree and evaluate their association with pre-defined or estimated phenotypic traits. PhyloAcc-ST and PhyloAcc-GT both allow users to define a priori a set of target lineages and then compare different models to identify loci accelerating in one or more target lineages. Whereas ST considers only one species tree across all input loci, GT considers alternate topologies for every locus. PhyloAcc-C simultaneously models molecular rates and rates of continuous trait evolution, allowing the user to ask whether the two are associated. Here, we describe these models and provide tips and workflows on how to prepare the input data and run PhyloAcc.
Affiliation(s)
- Patrick Gemmell
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
  - Department of Statistics, Harvard University, Cambridge, MA 02138, USA
- Subir B Shakya
  - Informatics Group, Harvard University, Cambridge, MA 02138, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
  - Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
- Zhirui Hu
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA
- Jun S Liu
  - Department of Statistics, Harvard University, Cambridge, MA 02138, USA
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
  - Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
7
Gupta A, Mirarab S, Turakhia Y. Accurate, scalable, and fully automated inference of species trees from raw genome assemblies using ROADIES. bioRxiv 2024:2024.05.27.596098. [PMID: 38854139 PMCID: PMC11160643 DOI: 10.1101/2024.05.27.596098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Inference of species trees plays a crucial role in advancing our understanding of evolutionary relationships and has immense significance for diverse biological and medical applications. Extensive genome sequencing efforts are currently in progress across a broad spectrum of life forms, holding the potential to unravel the intricate branching patterns within the tree of life. However, estimating species trees starting from raw genome sequences is quite challenging, and the current cutting-edge methodologies require a series of error-prone steps that are neither entirely automated nor standardized. In this paper, we present ROADIES, a novel pipeline for species tree inference from raw genome assemblies that is fully automated, easy to use, scalable, free from reference bias, and provides flexibility to adjust the tradeoff between accuracy and runtime. The ROADIES pipeline eliminates the need to align whole genomes, choose a single reference species, or pre-select loci such as functional genes found using cumbersome annotation steps. Moreover, it leverages recent advances in phylogenetic inference to allow multi-copy genes, eliminating the need to detect orthology. Using the genomic datasets released from large-scale sequencing consortia across three diverse life forms (placental mammals, pomace flies, and birds), we show that ROADIES infers species trees that are comparable in quality with the state-of-the-art approaches but in a fraction of the time. By incorporating optimal approaches and automating all steps from assembled genomes to species and gene trees, ROADIES is poised to improve the accuracy, scalability, and reproducibility of phylogenomic analyses.
Affiliation(s)
- Anshu Gupta
  - Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA 92093, USA
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, University of California, San Diego; San Diego, CA 92093, USA
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California, San Diego; San Diego, CA 92093, USA
8
Rick JA, Brock CD, Lewanski AL, Golcher-Benavides J, Wagner CE. Reference Genome Choice and Filtering Thresholds Jointly Influence Phylogenomic Analyses. Syst Biol 2024; 73:76-101. [PMID: 37881861 DOI: 10.1093/sysbio/syad065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 09/20/2023] [Accepted: 10/20/2023] [Indexed: 10/27/2023]
Abstract
Molecular phylogenies are a cornerstone of modern comparative biology and are commonly employed to investigate a range of biological phenomena, such as diversification rates, patterns in trait evolution, biogeography, and community assembly. Recent work has demonstrated that significant biases may be introduced into downstream phylogenetic analyses from processing genomic data; however, it remains unclear whether there are interactions among bioinformatic parameters or biases introduced through the choice of reference genome for sequence alignment and variant calling. We address these knowledge gaps by employing a combination of simulated and empirical data sets to investigate the extent to which the choice of reference genome in upstream bioinformatic processing of genomic data influences phylogenetic inference, as well as the way that reference genome choice interacts with bioinformatic filtering choices and phylogenetic inference method. We demonstrate that more stringent minor allele filters bias inferred trees away from the true species tree topology, and that these biased trees tend to be more imbalanced and have a higher center of gravity than the true trees. We find the greatest topological accuracy when filtering sites for minor allele count (MAC) >3-4 in our 51-taxa data sets, while tree center of gravity was closest to the true value when filtering for sites with MAC >1-2. In contrast, filtering for missing data increased accuracy in the inferred topologies; however, this effect was small in comparison to the effect of minor allele filters and may be undesirable due to a subsequent mutation spectrum distortion. The bias introduced by these filters differs based on the reference genome used in short read alignment, providing further support that choosing a reference genome for alignment is an important bioinformatic decision with implications for downstream analyses. 
These results demonstrate that attributes of the study system and dataset (and their interaction) add important nuance for how best to assemble and filter short-read genomic data for phylogenetic inference.
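The minor allele count (MAC) filter discussed in this abstract can be sketched as follows. This is an illustrative assumption about how such a filter works, not the authors' pipeline: the function names, the list-of-alleles representation of a site, and the use of an inclusive threshold (`>= min_mac`) are all hypothetical.

```python
from collections import Counter


def minor_allele_count(alleles):
    """Count of the second most common allele at one site.

    alleles: list of allele calls across samples; None marks a missing call.
    Returns 0 for sites with fewer than two observed alleles (invariant sites).
    """
    counts = Counter(a for a in alleles if a is not None)
    if len(counts) < 2:
        return 0
    # most_common(2) returns the two most frequent alleles; take the second count.
    return counts.most_common(2)[1][1]


def filter_sites_by_mac(sites, min_mac):
    """Keep only sites whose minor allele count is at least min_mac (assumed inclusive)."""
    return [site for site in sites if minor_allele_count(site) >= min_mac]
```

For example, with `min_mac=2` a site where the rarer allele appears only once is dropped, which is how a stricter threshold removes rare variants from the data set.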
Affiliation(s)
- Jessica A Rick
  - School of Natural Resources & the Environment, University of Arizona, Tucson, AZ 85719, USA
- Chad D Brock
  - Department of Biological Sciences, Tarleton State University, Stephenville, TX 76401, USA
- Alexander L Lewanski
  - Department of Integrative Biology and W.K. Kellogg Biological Station, Michigan State University, East Lansing, MI 48824, USA
- Jimena Golcher-Benavides
  - Department of Natural Resource Ecology and Management, Iowa State University, Ames, IA 50011, USA
- Catherine E Wagner
  - Program in Ecology and Evolution, University of Wyoming, Laramie, WY 82071, USA
  - Department of Botany, University of Wyoming, Laramie, WY 82071, USA
9
Edwards SV, Cloutier A, Cockburn G, Driver R, Grayson P, Katoh K, Baldwin MW, Sackton TB, Baker AJ. A nuclear genome assembly of an extinct flightless bird, the little bush moa. Sci Adv 2024; 10:eadj6823. [PMID: 38781323 PMCID: PMC11809649 DOI: 10.1126/sciadv.adj6823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 04/17/2024] [Indexed: 05/25/2024]
Abstract
We present a draft genome of the little bush moa (Anomalopteryx didiformis), one of approximately nine species of extinct flightless birds from Aotearoa, New Zealand, using ancient DNA recovered from a fossil bone from the South Island. We recover a complete mitochondrial genome at 249.9× depth of coverage and almost 900 megabases of a male moa nuclear genome at ~4 to 5× coverage, with sequence contiguity sufficient to identify more than 85% of avian universal single-copy orthologs. We describe a diverse landscape of transposable elements and satellite repeats, estimate a long-term effective population size of ~240,000, identify a diverse suite of olfactory receptor genes and an opsin repertoire with sensitivity in the ultraviolet range, show that the wingless moa phenotype is likely not attributable to gene loss or pseudogenization, and identify potential function-altering coding sequence variants in moa that could be synthesized for future functional assays. This genomic resource should support further studies of avian evolution and morphological divergence.
Affiliation(s)
- Scott V. Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
  - Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Alison Cloutier
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Glenn Cockburn
  - Evolution of Sensory Systems Research Group, Max Planck Institute for Biological Intelligence, 82319 Seewiesen, Germany
- Robert Driver
  - Department of Biology, East Carolina University, E 5th Street, Greenville, NC 27605, USA
- Phil Grayson
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
  - Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Kazutaka Katoh
  - Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita 565-0871, Japan
- Maude W. Baldwin
  - Evolution of Sensory Systems Research Group, Max Planck Institute for Biological Intelligence, 82319 Seewiesen, Germany
- Timothy B. Sackton
  - Informatics Group, Harvard University, 38 Oxford Street, Cambridge, MA 02138, USA
- Allan J. Baker
  - Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcox Street, Toronto, ON M5S 3B2, Canada
  - Department of Natural History, Royal Ontario Museum, 100 Queen’s Park, Toronto, ON M5S 2C6, Canada
10
Steenwyk JL, King N. The promise and pitfalls of synteny in phylogenomics. PLoS Biol 2024; 22:e3002632. [PMID: 38768403 PMCID: PMC11105162 DOI: 10.1371/journal.pbio.3002632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Reconstructing the tree of life remains a central goal in biology. Early methods, which relied on small numbers of morphological or genetic characters, often yielded conflicting evolutionary histories, undermining confidence in the results. Investigations based on phylogenomics, which use hundreds to thousands of loci for phylogenetic inquiry, have provided a clearer picture of life's history, but certain branches remain problematic. To resolve difficult nodes on the tree of life, 2 recent studies tested the utility of synteny, the conserved collinearity of orthologous genetic loci in 2 or more organisms, for phylogenetics. Synteny exhibits compelling phylogenomic potential while also raising new challenges. This Essay identifies and discusses specific opportunities and challenges that bear on the value of synteny data and other rare genomic changes for phylogenomic studies. Synteny-based analyses of highly contiguous genome assemblies mark a new chapter in the phylogenomic era and the quest to reconstruct the tree of life.
Affiliation(s)
- Jacob L. Steenwyk
  - Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America
  - Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Nicole King
  - Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America
  - Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
11
Stiller J, Feng S, Chowdhury AA, Rivas-González I, Duchêne DA, Fang Q, Deng Y, Kozlov A, Stamatakis A, Claramunt S, Nguyen JMT, Ho SYW, Faircloth BC, Haag J, Houde P, Cracraft J, Balaban M, Mai U, Chen G, Gao R, Zhou C, Xie Y, Huang Z, Cao Z, Yan Z, Ogilvie HA, Nakhleh L, Lindow B, Morel B, Fjeldså J, Hosner PA, da Fonseca RR, Petersen B, Tobias JA, Székely T, Kennedy JD, Reeve AH, Liker A, Stervander M, Antunes A, Tietze DT, Bertelsen MF, Lei F, Rahbek C, Graves GR, Schierup MH, Warnow T, Braun EL, Gilbert MTP, Jarvis ED, Mirarab S, Zhang G. Complexity of avian evolution revealed by family-level genomes. Nature 2024; 629:851-860. [PMID: 38560995 PMCID: PMC11111414 DOI: 10.1038/s41586-024-07323-1] [Citation(s) in RCA: 52] [Impact Index Per Article: 52.0] [Received: 04/25/2023] [Accepted: 03/15/2024] [Indexed: 04/04/2024]
Abstract
Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species sampled, phylogenetic method and the choice of genomic regions [1-3]. Here we address these issues by analysing the genomes of 363 bird species [4] (218 taxonomic families, 92% of total). Using intergenic regions and coalescent methods, we present a well-supported tree but also a marked degree of discordance. The tree confirms that Neoaves experienced rapid radiation at or near the Cretaceous-Palaeogene boundary. Sufficient loci rather than extensive taxon sampling were more effective in resolving difficult nodes. Remaining recalcitrant nodes involve species that are a challenge to model due to either extreme DNA composition, variable substitution rates, incomplete lineage sorting or complex evolutionary events such as ancient hybridization. Assessment of the effects of different genomic partitions showed high heterogeneity across the genome. We discovered sharp increases in effective population size, substitution rates and relative brain size following the Cretaceous-Palaeogene extinction event, supporting the hypothesis that emerging ecological opportunities catalysed the diversification of modern birds. The resulting phylogenetic estimate offers fresh insights into the rapid radiation of modern birds and provides a taxon-rich backbone tree for future comparative studies.
Collapse
Affiliation(s)
- Josefin Stiller
  - Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Shaohong Feng
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Department of General Surgery, Sir Run-Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, China
- Al-Aabid Chowdhury
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- David A Duchêne
  - Center for Evolutionary Hologenomics, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Qi Fang
  - BGI Research, Shenzhen, China
- Yuan Deng
  - BGI Research, Shenzhen, China
  - BGI Research, Wuhan, China
- Alexey Kozlov
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Santiago Claramunt
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
  - Department of Natural History, Royal Ontario Museum, Toronto, Ontario, Canada
- Jacqueline M T Nguyen
  - College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
  - Australian Museum Research Institute, Sydney, New South Wales, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Brant C Faircloth
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
- Julia Haag
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Peter Houde
  - Department of Biology, New Mexico State University, Las Cruces, NM, USA
- Joel Cracraft
  - Department of Ornithology, American Museum of Natural History, New York, NY, USA
- Metin Balaban
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- Uyen Mai
  - Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Guangji Chen
  - BGI Research, Wuhan, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Rongsheng Gao
  - BGI Research, Wuhan, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yulong Xie
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zijian Huang
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zhen Cao
  - Department of Computer Science, Rice University, Houston, TX, USA
- Zhi Yan
  - Department of Computer Science, Rice University, Houston, TX, USA
- Huw A Ogilvie
  - Department of Computer Science, Rice University, Houston, TX, USA
- Luay Nakhleh
  - Department of Computer Science, Rice University, Houston, TX, USA
- Bent Lindow
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
- Benoit Morel
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece
- Jon Fjeldså
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
- Peter A Hosner
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
  - Center for Global Mountain Biodiversity, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Rute R da Fonseca
  - Center for Global Mountain Biodiversity, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Bent Petersen
  - Center for Evolutionary Hologenomics, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Centre of Excellence for Omics-Driven Computational Biodiscovery, Faculty of Applied Sciences, AIMST University, Bedong, Malaysia
- Joseph A Tobias
  - Department of Life Sciences, Imperial College London, Silwood Park, Ascot, UK
- Tamás Székely
  - Milner Centre for Evolution, University of Bath, Bath, UK
  - ELKH-DE Reproductive Strategies Research Group, University of Debrecen, Debrecen, Hungary
- Jonathan David Kennedy
  - Center for Macroecology, Evolution, and Climate, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Andrew Hart Reeve
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
- Andras Liker
  - HUN-REN-PE Evolutionary Ecology Research Group, University of Pannonia, Veszprém, Hungary
  - Behavioural Ecology Research Group, Center for Natural Sciences, University of Pannonia, Veszprém, Hungary
- Agostinho Antunes
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Porto, Portugal
- Mads F Bertelsen
  - Centre for Zoo and Wild Animal Health, Copenhagen Zoo, Frederiksberg, Denmark
- Fumin Lei
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Science, University of Chinese Academy of Sciences, Beijing, China
- Carsten Rahbek
  - Center for Global Mountain Biodiversity, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Center for Macroecology, Evolution, and Climate, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Institute of Ecology, Peking University, Beijing, China
  - Danish Institute for Advanced Study, University of Southern Denmark, Odense, Denmark
- Gary R Graves
  - Center for Macroecology, Evolution, and Climate, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Tandy Warnow
  - University of Illinois Urbana-Champaign, Champaign, IL, USA
- Edward L Braun
  - Department of Biology, University of Florida, Gainesville, FL, USA
- M Thomas P Gilbert
  - Center for Evolutionary Hologenomics, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - University Museum, NTNU, Trondheim, Norway
- Erich D Jarvis
  - Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
  - Howard Hughes Medical Institute, Durham, NC, USA
- Guojie Zhang
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, China
  - BGI Research, Wuhan, China
  - Villum Center for Biodiversity Genomics, Department of Biology, University of Copenhagen, Copenhagen, Denmark
12
Jiang Z, Zang W, Ericson PGP, Song G, Wu S, Feng S, Drovetski SV, Liu G, Zhang D, Saitoh T, Alström P, Edwards SV, Lei F, Qu Y. Gene flow and an anomaly zone complicate phylogenomic inference in a rapidly radiated avian family (Prunellidae). BMC Biol 2024; 22:49. [PMID: 38413944 PMCID: PMC10900574 DOI: 10.1186/s12915-024-01848-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 02/15/2024] [Indexed: 02/29/2024]
Abstract
BACKGROUND: Resolving the phylogeny of rapidly radiating lineages presents a challenge when building the Tree of Life. The Old World avian family Prunellidae (accentors) comprises twelve species that rapidly diversified at the Pliocene-Pleistocene boundary. RESULTS: Here we investigate the phylogenetic relationships of all species of Prunellidae using a chromosome-level de novo assembly of Prunella strophiata and 36 high-coverage resequenced genomes. We use homologous alignments of thousands of exonic and intronic loci to build the coalescent and concatenated phylogenies and recover four different species trees. Topology tests show a large degree of gene tree-species tree discordance, but only 40-54% of intronic gene trees and 36-75% of exonic gene trees can be explained by incomplete lineage sorting and gene tree estimation errors. Estimated branch lengths for three successive internal branches in the inferred species trees suggest the existence of an empirical anomaly zone. The most common topology recovered for species in this anomaly zone was not similar to any of the coalescent or concatenated phylogenies, suggesting the presence of anomalous gene trees. However, this interpretation is complicated by the presence of gene flow because extensive introgression was detected among these species. When exploring tree topology distributions, introgression, and regional variation in recombination rate, we find that many autosomal regions contain signatures of introgression and thus may mislead phylogenetic inference. Conversely, the phylogenetic signal is concentrated in regions with low recombination rates, such as the Z chromosome, which are also more resistant to interspecific introgression. CONCLUSIONS: Collectively, our results suggest that phylogenomic inference should consider the underlying genomic architecture to maximize the consistency of phylogenomic signal.
Affiliation(s)
- Zhiyong Jiang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Wenqing Zang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Per G P Ericson
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, Stockholm, SE-104 05, Sweden
- Gang Song
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Shaoyuan Wu
  - Jiangsu International Joint Center of Genomics, Jiangsu Key Laboratory of Phylogenomics & Comparative Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou, 221116, Jiangsu, China
- Shaohong Feng
  - Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, 310058, China
  - Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
  - Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, 314102, China
- Sergei V Drovetski
  - National Museum of Natural History, Smithsonian Institution, Washington, DC, 20004, USA
  - Present address: U.S. Geological Survey, Eastern Ecological Science Center at Patuxent Research Refuge, Laurel, MD, 20708, USA
- Gang Liu
  - Chinese Academy of Forestry, Institute of Ecological Conservation and Restoration, Beijing, 100091, China
- Dezhi Zhang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Takema Saitoh
  - Yamashina Institute for Ornithology, Abiko, Chiba, Japan
- Per Alström
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - Animal Ecology, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18 D, 752 36, Uppsala, Sweden
- Scott V Edwards
  - Museum of Comparative Zoology and Department of Organismic & Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
- Fumin Lei
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yanhua Qu
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, Stockholm, SE-104 05, Sweden
13
Wu S, Rheindt FE, Zhang J, Wang J, Zhang L, Quan C, Li Z, Wang M, Wu F, Qu Y, Edwards SV, Zhou Z, Liu L. Genomes, fossils, and the concurrent rise of modern birds and flowering plants in the Late Cretaceous. Proc Natl Acad Sci U S A 2024; 121:e2319696121. [PMID: 38346181 PMCID: PMC10895254 DOI: 10.1073/pnas.2319696121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 12/29/2023] [Indexed: 02/15/2024]
Abstract
The phylogeny and divergence timing of the Neoavian radiation remain controversial despite recent progress. We analyzed the genomes of 124 species across all Neoavian orders, using data from 25,460 loci spanning four DNA classes, including 5,756 coding sequences, 12,449 conserved nonexonic elements, 4,871 introns, and 2,384 intergenic segments. We conducted a comprehensive sensitivity analysis to account for the heterogeneity across different DNA classes, leading to an optimal tree of Neoaves with high resolution. This phylogeny features a novel Neoavian dichotomy comprising two monophyletic clades: a previously recognized Telluraves (land birds) and a newly circumscribed Aquaterraves (waterbirds and relatives). Molecular dating analyses with 20 fossil calibrations indicate that the diversification of modern birds began in the Late Cretaceous and underwent a constant and steady radiation across the KPg boundary, concurrent with the rise of angiosperms as well as other major Cenozoic animal groups including placental and multituberculate mammals. The KPg catastrophe had a limited impact on avian evolution compared to the Paleocene-Eocene Thermal Maximum, which triggered a rapid diversification of seabirds. Our findings suggest that the evolution of modern birds followed a slow process of gradualism rather than a rapid process of punctuated equilibrium, with limited interruption by the KPg catastrophe. This study places bird evolution into a new context within vertebrates, with ramifications for the evolution of the Earth's biota.
Affiliation(s)
- Shaoyuan Wu
  - Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu International Joint Center of Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
- Frank E Rheindt
  - Department of Biological Sciences, National University of Singapore, Singapore 117543, Singapore
- Jin Zhang
  - School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China
- Jiajia Wang
  - Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu International Joint Center of Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
- Lei Zhang
  - Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu International Joint Center of Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
- Cheng Quan
  - School of Earth Science and Resources, Chang'an University, Xi'an, Shaanxi 710054, China
- Zhiheng Li
  - Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing 100044, China
- Min Wang
  - Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing 100044, China
- Feixiang Wu
  - Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing 100044, China
- Yanhua Qu
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138
- Zhonghe Zhou
  - Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing 100044, China
- Liang Liu
  - Department of Statistics, Institute of Bioinformatics, University of Georgia, Athens, GA 30606
14
Kiat Y, O’Connor JK. Functional constraints on the number and shape of flight feathers. Proc Natl Acad Sci U S A 2024; 121:e2306639121. [PMID: 38346196 PMCID: PMC10895369 DOI: 10.1073/pnas.2306639121] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/25/2023] [Accepted: 12/30/2023] [Indexed: 02/15/2024]
Abstract
As a fundamental ecological aspect of most organisms, locomotor function significantly constrains morphology. At the same time, the evolution of novel locomotor abilities has produced dramatic morphological transformations, initiating some of the most significant diversifications in life history. Despite significant new fossil evidence, it remains unclear whether volant locomotion had a single or multiple origins in pennaraptoran dinosaurs, and the volant abilities of individual taxa are controversial. The evolution of powered flight in modern birds involved exaptation of feathered surfaces extending off the limbs and tail, yet most studies concerning flight potential in pennaraptorans do not account for the structure and morphology of the wing feathers themselves. Analysis of the number and shape of remex and rectrix feathers across a large dataset of extant birds indicates that the number of remiges and rectrices and the degree of primary vane asymmetry strongly correlate with locomotor ability, revealing important functional constraints. Among these traits, phenotypic flexibility varies, as reflected by the different rates at which morphological changes evolve, such that some traits reflect the ancestral condition, whereas others reflect current locomotor function. While Mesozoic birds and Microraptor have remex morphologies consistent with extant volant birds, those of anchiornithines deviate significantly, providing strong evidence that this clade was not volant. The results of these analyses support a single origin of dinosaurian flight and indicate that the early stages of feathered wing evolution are not sampled by the currently available fossil record.
Affiliation(s)
- Yosef Kiat
  - Negaunee Integrative Research Center, Field Museum of Natural History, Chicago, IL 60605
- Jingmai K. O’Connor
  - Negaunee Integrative Research Center, Field Museum of Natural History, Chicago, IL 60605
15
Rivas-González I, Schierup MH, Wakeley J, Hobolth A. TRAILS: Tree reconstruction of ancestry using incomplete lineage sorting. PLoS Genet 2024; 20:e1010836. [PMID: 38330138 PMCID: PMC10880969 DOI: 10.1371/journal.pgen.1010836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 02/21/2024] [Accepted: 01/22/2024] [Indexed: 02/10/2024]
Abstract
Genome-wide genealogies of multiple species carry detailed information about demographic and selection processes on individual branches of the phylogeny. Here, we introduce TRAILS, a hidden Markov model that accurately infers time-resolved population genetics parameters, such as ancestral effective population sizes and speciation times, for ancestral branches using a multi-species alignment of three species and an outgroup. TRAILS leverages the information contained in incomplete lineage sorting fragments by modelling genealogies along the genome as rooted three-leaved trees, each with a topology and two coalescent events happening in discretized time intervals within the phylogeny. Posterior decoding of the hidden Markov model can be used to infer the ancestral recombination graph for the alignment and details on demographic changes within a branch. Since TRAILS performs posterior decoding at the base-pair level, genome-wide scans based on the posterior probabilities can be devised to detect deviations from neutrality. Using TRAILS on a human-chimp-gorilla-orangutan alignment, we recover speciation parameters and extract information about the topology and coalescent times at high resolution.
Affiliation(s)
- Mikkel H. Schierup
  - Bioinformatics Research Center (BiRC), Aarhus University, Aarhus, Denmark
- John Wakeley
  - Department of Organismic and Evolutionary Biology, Harvard University, Massachusetts, United States of America
- Asger Hobolth
  - Department of Mathematics, Aarhus University, Aarhus, Denmark
16
Dai J, Rubel T, Han Y, Molloy EK. Dollo-CDP: a polynomial-time algorithm for the clade-constrained large Dollo parsimony problem. Algorithms Mol Biol 2024; 19:2. [PMID: 38191515 PMCID: PMC10775561 DOI: 10.1186/s13015-023-00249-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 12/10/2023] [Indexed: 01/10/2024]
Abstract
The last decade of phylogenetics has seen the development of many methods that leverage constraints plus dynamic programming. The goal of this algorithmic technique is to produce a phylogeny that is optimal with respect to some objective function and that lies within a constrained version of tree space. The popular species tree estimation method ASTRAL, for example, returns a tree that (1) maximizes the quartet score computed with respect to the input gene trees and that (2) draws its branches (bipartitions) from the input constraint set. This technique has yet to be used for parsimony problems where the inputs are binary characters, sometimes with missing values. Here, we introduce the clade-constrained character parsimony problem and present an algorithm that solves this problem for the Dollo criterion score in [Formula: see text] time, where n is the number of leaves, k is the number of characters, and [Formula: see text] is the set of clades used as constraints. Dollo parsimony, which requires traits/mutations to be gained at most once but allows them to be lost any number of times, is widely used for tumor phylogenetics as well as species phylogenetics, for example in analyses of low-homoplasy retroelement insertions across the vertebrate tree of life. This motivated us to implement our algorithm in a software package, called Dollo-CDP, and evaluate its utility for analyzing retroelement insertion presence/absence patterns for bats, birds, and toothed whales, as well as simulated data. Our results show that Dollo-CDP can improve upon heuristic search from a single starting tree, often recovering a better scoring tree. Moreover, Dollo-CDP scales to data sets with much larger numbers of taxa than branch-and-bound while still having an optimality guarantee, albeit a more restricted one. Lastly, we show that our algorithm for Dollo parsimony can easily be adapted to Camin-Sokal parsimony but not Fitch parsimony.
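The Dollo criterion described in this abstract (each trait gained at most once, lost any number of times) can be illustrated on a single fixed tree. The following is a minimal sketch and not the Dollo-CDP algorithm itself: it assumes a rooted tree written as nested Python tuples, one complete binary character with no missing values, and a hypothetical helper name `dollo_losses`. It places the single gain at the most recent common ancestor of all leaves carrying the trait and counts the losses needed below that point.

```python
def leaves(tree):
    """List the leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def ones(tree, states):
    """Count the leaves under `tree` whose character state is 1."""
    return sum(states[leaf] for leaf in leaves(tree))


def dollo_losses(tree, states):
    """Number of losses for one binary character under the Dollo criterion.

    Assumption for this sketch: `states` maps every leaf to 0 or 1.
    """
    total = ones(tree, states)
    if total == 0:
        return 0  # the trait never arises, so nothing is lost

    # Walk down to the most recent common ancestor of all 1-leaves:
    # keep descending while a single child still holds every 1-leaf.
    node = tree
    while not isinstance(node, str):
        holders = [child for child in node if ones(child, states) == total]
        if not holders:
            break
        node = holders[0]

    def losses(sub):
        # A maximal subtree without any 1-leaf costs exactly one loss.
        if ones(sub, states) == 0:
            return 1
        if isinstance(sub, str):
            return 0
        return sum(losses(child) for child in sub)

    return losses(node)


tree = ((("A", "B"), "C"), ("D", "E"))
print(dollo_losses(tree, {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0}))
print(dollo_losses(tree, {"A": 1, "B": 0, "C": 0, "D": 1, "E": 0}))
```

In the first call the gain sits on the ancestor of A, B and C, so only B requires a loss; in the second call the gain must sit at the root, and B, C and E each require one.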
Affiliation(s)
- Junyan Dai
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Tobias Rubel
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Yunheng Han
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Erin K Molloy
  - Department of Computer Science, University of Maryland, College Park, MD, USA
  - University of Maryland Institute for Advanced Computer Studies, College Park, MD, USA
17
Kulkarni S, Wood HM, Hormiga G. Advances in the reconstruction of the spider tree of life: A roadmap for spider systematics and comparative studies. Cladistics 2023; 39:479-532. [PMID: 37787157 DOI: 10.1111/cla.12557] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/08/2022] [Revised: 07/27/2023] [Accepted: 08/17/2023] [Indexed: 10/04/2023]
Abstract
In the last decade and a half, advances in genetic sequencing technologies have revolutionized systematics, transforming the field from studying morphological characters or a few genetic markers, to genomic datasets in the phylogenomic era. A plethora of molecular phylogenetic studies on many taxonomic groups have come about, converging on, or refuting prevailing morphology or legacy-marker-based hypotheses about evolutionary affinities. Spider systematics has been no exception to this transformation and the inter-relationships of several groups have now been studied using genomic data. About 51 500 extant spider species have been described, all with a conservative body plan, but innumerable morphological and behavioural peculiarities. Inferring the spider tree of life using morphological data has been a challenging task. Molecular data have corroborated many hypotheses of higher-level relationships, but also resulted in new groups that refute previous hypotheses. In this review, we discuss recent advances in the reconstruction of the spider tree of life and highlight areas where additional effort is needed with potential solutions. We base this review on the most comprehensive spider phylogeny to date, representing 131 of the 132 spider families. To achieve this sampling, we combined six Sanger-based markers with newly generated and publicly available genome-scale datasets. We find that some inferred relationships between major lineages of spiders (such as Austrochiloidea, Palpimanoidea and Synspermiata) are robust across different classes of data. However, several new hypotheses have emerged with different classes of molecular data. We identify and discuss the robust and controversial hypotheses and compile this blueprint to design future studies targeting systematic revisions of these problematic groups. We offer an evolutionary framework to explore comparative questions such as evolution of venoms, silk, webs, morphological traits and reproductive strategies.
Affiliation(s)
- Siddharth Kulkarni
  - Department of Biological Sciences, The George Washington University, 2029 G St. NW, Washington, DC, 20052, USA
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, 1000 Constitution Avenue NW, Washington, DC, 20560, USA
- Hannah M Wood
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, 1000 Constitution Avenue NW, Washington, DC, 20560, USA
- Gustavo Hormiga
  - Department of Biological Sciences, The George Washington University, 2029 G St. NW, Washington, DC, 20052, USA
18
Han Y, Molloy EK. Quartets enable statistically consistent estimation of cell lineage trees under an unbiased error and missingness model. Algorithms Mol Biol 2023; 18:19. [PMID: 38041123 PMCID: PMC10691101 DOI: 10.1186/s13015-023-00248-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 11/19/2023] [Indexed: 12/03/2023]
Abstract
Cancer progression and treatment can be informed by reconstructing its evolutionary history from tumor cells. Although many methods exist to estimate evolutionary trees (called phylogenies) from molecular sequences, traditional approaches assume the input data are error-free and the output tree is fully resolved. These assumptions are challenged in tumor phylogenetics because single-cell sequencing produces sparse, error-ridden data and because tumors evolve clonally. Here, we study the theoretical utility of methods based on quartets (four-leaf, unrooted phylogenetic trees) in light of these barriers. We consider a popular tumor phylogenetics model, in which mutations arise on a (highly unresolved) tree and then (unbiased) errors and missing values are introduced. Quartets are then implied by mutations present in two cells and absent from two cells. Our main result is that the most probable quartet identifies the unrooted model tree on four cells. This motivates seeking a tree such that the number of quartets shared between it and the input mutations is maximized. We prove an optimal solution to this problem is a consistent estimator of the unrooted cell lineage tree; this guarantee includes the case where the model tree is highly unresolved, with error defined as the number of false negative branches. Lastly, we outline how quartet-based methods might be employed when there are copy number aberrations and other challenges specific to tumor phylogenetics.
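The statement that quartets are implied by mutations present in two cells and absent from two cells can be made concrete with a short sketch. It is an illustration only, not the authors' software: the matrix layout (a dict from cell name to a list of 0, 1 or `None` for missing) and the function name `implied_quartets` are assumptions made here. Each mutation contributes one quartet `ab|cd` for every pair of cells carrying it and every pair of cells lacking it; missing entries contribute nothing.

```python
from collections import Counter
from itertools import combinations


def implied_quartets(matrix):
    """Count quartets implied by a cell-by-mutation matrix.

    `matrix` maps each cell name to a list with one entry per mutation:
    1 (present), 0 (absent) or None (missing). This layout is an
    assumption of the sketch. A quartet ab|cd is stored as a frozenset
    holding the two pairs {a, b} and {c, d}.
    """
    cells = sorted(matrix)
    n_mutations = len(matrix[cells[0]]) if cells else 0
    counts = Counter()
    for j in range(n_mutations):
        present = [c for c in cells if matrix[c][j] == 1]
        absent = [c for c in cells if matrix[c][j] == 0]
        for pair_present in combinations(present, 2):
            for pair_absent in combinations(absent, 2):
                quartet = frozenset(
                    [frozenset(pair_present), frozenset(pair_absent)]
                )
                counts[quartet] += 1
    return counts


matrix = {
    "A": [1, 1, 1, 1],
    "B": [1, 1, 0, None],
    "C": [0, 0, 1, 0],
    "D": [0, 0, 0, 0],
}
counts = implied_quartets(matrix)
best, support = counts.most_common(1)[0]
print(sorted(sorted(side) for side in best), support)
```

On this toy matrix the first two mutations both imply `AB|CD`, the third implies `AC|BD`, and the fourth implies nothing because only one cell is observed to carry it, so the most frequent quartet on the four cells is `AB|CD`.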
Affiliation(s)
- Yunheng Han
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Erin K Molloy
  - Department of Computer Science, University of Maryland, College Park, MD, USA
  - University of Maryland Institute for Advanced Computer Studies, College Park, MD, USA
19
Widrig KE, Bhullar BS, Field DJ. 3D atlas of tinamou (Neornithes: Tinamidae) pectoral morphology: Implications for reconstructing the ancestral neornithine flight apparatus. J Anat 2023; 243:729-757. [PMID: 37358291 PMCID: PMC10557402 DOI: 10.1111/joa.13919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Revised: 06/06/2023] [Accepted: 06/12/2023] [Indexed: 06/27/2023]
Abstract
Palaeognathae, the extant avian clade comprising the flightless ratites and flight-capable tinamous (Tinamidae), is the sister group to all other living birds, and recent phylogenetic studies illustrate that tinamous are phylogenetically nested within a paraphyletic assemblage of ratites. As the only extant palaeognaths that have retained the ability to fly, tinamous may provide key information on the nature of the flight apparatus of ancestral crown palaeognaths (and, in turn, crown birds), as well as insight into convergent modifications to the wing apparatus among extant ratite lineages. To reveal new information about the musculoskeletal anatomy of tinamous and facilitate development of computational biomechanical models of tinamou wing function, we generated a three-dimensional musculoskeletal model of the flight apparatus of the extant Andean tinamou (Nothoprocta pentlandii) using diffusible iodine-based contrast-enhanced computed tomography (diceCT). Origins and insertions of the pectoral flight musculature of N. pentlandii are generally consistent with those of other extant volant birds specialized for burst flight, and the entire suite of presumed ancestral neornithine flight muscles is present in N. pentlandii with the exception of the biceps slip. The pectoralis and supracoracoideus muscles are robust, similar to the condition in other extant burst-flying birds such as many extant Galliformes. Contrary to the condition in most extant Neognathae (the sister clade to Palaeognathae), the insertion of the pronator superficialis has a greater distal extent than that of the pronator profundus, although most other anatomical observations are broadly consistent with the conditions observed in extant neognaths.
This work will help form a basis for future comparative studies of the avian musculoskeletal system, with implications for reconstructing the flight apparatus of ancestral crown birds and clarifying musculoskeletal modifications underlying the convergent origins of ratite flightlessness.
Affiliation(s)
- Klara E. Widrig
  - Department of Earth Sciences, University of Cambridge, Cambridge, UK
- Bhart‐Anjan S. Bhullar
  - Department of Earth and Planetary Sciences, Yale University, New Haven, Connecticut, USA
  - Peabody Museum of Natural History, Yale University, New Haven, Connecticut, USA
- Daniel J. Field
  - Department of Earth Sciences, University of Cambridge, Cambridge, UK
  - Museum of Zoology, University of Cambridge, Cambridge, UK
20
Simmons MP, Goloboff PA, Stöver BC, Springer MS, Gatesy J. Quantification of congruence among gene trees with polytomies using overall success of resolution for phylogenomic coalescent analyses. Cladistics 2023; 39:418-436. [PMID: 37096985 DOI: 10.1111/cla.12540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 02/22/2023] [Accepted: 03/24/2023] [Indexed: 04/26/2023]
Abstract
Gene-tree-inference error can cause species-tree-inference artefacts in summary phylogenomic coalescent analyses. Here we integrate two ways of accommodating these inference errors: collapsing arbitrarily or dubiously resolved gene-tree branches, and subsampling gene trees based on their pairwise congruence. We tested the effect of collapsing gene-tree branches with 0% approximate-likelihood-ratio-test (SH-like aLRT) support in likelihood analyses and strict consensus trees for parsimony, and then subsampled those partially resolved trees based on congruence measures that do not penalize polytomies. For this purpose we developed a new TNT script for congruence sorting (congsort), and used it to calculate topological incongruence for eight phylogenomic datasets using three distance measures: standard Robinson-Foulds (RF) distances; overall success of resolution (OSR), which is based on counting both matching and contradicting clades; and RF contradictions, which only counts contradictory clades. As expected, we found that gene-tree incongruence was often concentrated in clades that are arbitrarily or dubiously resolved and that there was greater congruence between the partially collapsed gene trees and the coalescent and concatenation topologies inferred from those genes. Coalescent branch lengths typically increased as the most incongruent gene trees were excluded, although branch supports typically did not. We investigated two successful and complementary approaches to prioritizing genes for investigation of alignment or homology errors. Coalescent-tree clades that contradicted concatenation-tree clades were generally less robust to gene-tree subsampling than congruent clades. Our preferred approach to collapsing likelihood gene-tree clades (0% SH-like aLRT support) and subsampling those trees (OSR) generally outperformed competing approaches for a large fungal dataset with respect to branch lengths, support and congruence. 
We recommend widespread application of this approach (and strict consensus trees for parsimony-based analyses) for improving quantification of gene-tree congruence/conflict, estimating coalescent branch lengths, testing robustness of coalescent analyses to gene-tree-estimation error, and improving topological robustness of summary coalescent analyses. This approach is quick and easy to implement, even for huge datasets.
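The difference between standard RF distances and RF contradictions described in the abstract above can be illustrated with a minimal sketch. This is not the authors' congsort TNT script: it assumes rooted trees represented simply as sets of clades (each clade a frozenset of taxon names), and the function names are hypothetical. OSR is omitted because the abstract does not give its formula.

```python
def rf_distance(clades_a, clades_b):
    """Standard Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades_a ^ clades_b)


def compatible(x, y):
    """Two rooted clades are compatible if they are disjoint or nested."""
    return x.isdisjoint(y) or x <= y or y <= x


def rf_contradictions(clades_a, clades_b):
    """Count clades in either tree that conflict with a clade of the other tree."""
    in_a = sum(1 for x in clades_a if any(not compatible(x, y) for y in clades_b))
    in_b = sum(1 for y in clades_b if any(not compatible(x, y) for x in clades_a))
    return in_a + in_b


# Toy trees on taxa A-D (trivial clades omitted).
resolved = {frozenset("AB"), frozenset("ABC")}      # (((A,B),C),D)
collapsed = {frozenset("ABC")}                      # ((A,B,C),D), a polytomy
conflicting = {frozenset("AC"), frozenset("ABC")}   # (((A,C),B),D)

print(rf_distance(resolved, collapsed), rf_contradictions(resolved, collapsed))
print(rf_distance(resolved, conflicting), rf_contradictions(resolved, conflicting))
```

In this toy case the collapsed tree is penalized by the standard RF distance for lacking the (A,B) clade, but it contributes no contradictions, which is the sense in which a contradiction-based measure does not penalize polytomies.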
Affiliation(s)
- Mark P Simmons
- Department of Biology, Colorado State University, Fort Collins, CO, 80523, USA
- Pablo A Goloboff
- CONICET, INSUE, Fundación Miguel Lillo, Miguel Lillo 251, 4000, S.M. de Tucumán, Argentina
- Ben C Stöver
- Institute for Evolution and Biodiversity, WMU Münster, 48149, Münster, Germany
- Mark S Springer
- Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, 92521, USA
- John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, 10024, USA
21
Yang Z, Ma X, Wang Q, Tian X, Sun J, Zhang Z, Xiao S, De Clerck O, Leliaert F, Zhong B. Phylotranscriptomics unveil a Paleoproterozoic-Mesoproterozoic origin and deep relationships of the Viridiplantae. Nat Commun 2023; 14:5542. [PMID: 37696791 PMCID: PMC10495350 DOI: 10.1038/s41467-023-41137-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/15/2022] [Accepted: 08/23/2023] [Indexed: 09/13/2023]
Abstract
The Viridiplantae comprise two main clades, the Chlorophyta (including a diverse array of marine and freshwater green algae) and the Streptophyta (consisting of the freshwater charophytes and the land plants). Lineages sister to core Chlorophyta, informally referred to as prasinophytes, form a grade of mainly planktonic green algae. Recently, one of these lineages, Prasinodermophyta, which was previously grouped with prasinophytes, has been identified as the sister lineage to both Chlorophyta and Streptophyta. Resolving the deep relationships among green plants is crucial for understanding the historical impact of green algal diversity on marine ecology and geochemistry, but has proven difficult given the ancient timing of the diversification events. Through extensive taxon and gene sampling, we conduct large-scale phylogenomic analyses to resolve deep relationships and reveal the Prasinodermophyta as the lineage sister to Chlorophyta, raising questions about the necessity of classifying the Prasinodermophyta as a distinct phylum. We unveil that incomplete lineage sorting is the main cause of discordance regarding the placement of Prasinodermophyta. Molecular dating analyses suggest that crown-group green plants and crown-group Prasinodermophyta date back to the Paleoproterozoic-Mesoproterozoic. Our study establishes a plausible link between oxygen levels in the Paleoproterozoic-Mesoproterozoic and the origin of Viridiplantae.
Affiliation(s)
- Zhiping Yang
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xiaoya Ma
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Qiuping Wang
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xiaolin Tian
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Jingyan Sun
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Zhenhua Zhang
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Shuhai Xiao
- Department of Geosciences and Global Change Center, Virginia Tech, Blacksburg, VA, USA
- Olivier De Clerck
- Phycology Research Group and Center for Molecular Phylogenetics and Evolution, Ghent University, Ghent, Belgium
- Bojian Zhong
- College of Life Sciences, Nanjing Normal University, Nanjing, China
22
Tan HZ, Jansen JJFJ, Allport GA, Garg KM, Chattopadhyay B, Irestedt M, Pang SEH, Chilton G, Gwee CY, Rheindt FE. Megafaunal extinctions, not climate change, may explain Holocene genetic diversity declines in Numenius shorebirds. eLife 2023; 12:e85422. [PMID: 37549057 PMCID: PMC10406428 DOI: 10.7554/elife.85422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 06/27/2023] [Indexed: 08/09/2023]
Abstract
Understanding the relative contributions of historical and anthropogenic factors to declines in genetic diversity is important for informing conservation action. Using genome-wide DNA of fresh and historic specimens, including that of two species widely thought to be extinct, we investigated fluctuations in genetic diversity and present the first complete phylogenomic tree for all nine species of the threatened shorebird genus Numenius, known as whimbrels and curlews. Most species faced sharp declines in effective population size, a proxy for genetic diversity, soon after the Last Glacial Maximum (around 20,000 years ago). These declines occurred prior to the Anthropocene and in spite of an increase in the breeding area predicted by environmental niche modeling, suggesting that they were not caused by climatic or recent anthropogenic factors. Crucially, these genetic diversity declines coincide with mass extinctions of mammalian megafauna in the Northern Hemisphere. Among other factors, the demise of ecosystem-engineering megafauna which maintained open habitats may have been detrimental for grassland and tundra-breeding Numenius shorebirds. Our work suggests that the impact of historical factors such as megafaunal extinction may have had wider repercussions on present-day population dynamics of open habitat biota than previously appreciated.
Affiliation(s)
- Hui Zhen Tan
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Kritika M Garg
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Balaji Chattopadhyay
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Martin Irestedt
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Sean EH Pang
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Glen Chilton
- Department of Biology, St. Mary's University, Calgary, Canada
- Chyi Yin Gwee
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Frank E Rheindt
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
23
Ma X, Shi X, Wang Q, Zhao M, Zhang Z, Zhong B. A Reinvestigation of Multiple Independent Evolution and Triassic-Jurassic Origins of Multicellular Volvocine Algae. Genome Biol Evol 2023; 15:evad142. [PMID: 37498572 PMCID: PMC10410301 DOI: 10.1093/gbe/evad142] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/21/2022] [Revised: 07/09/2023] [Accepted: 07/22/2023] [Indexed: 07/28/2023]
Abstract
The evolution of multicellular organisms is considered to be a major evolutionary transition, profoundly affecting the ecology and evolution of nearly all life on earth. The volvocine algae, a unique clade of chlorophytes with diverse cell morphology, provide an appealing model for investigating the evolution of multicellularity and development. However, the phylogenetic relationships and timescale of the volvocine algae are not fully resolved. Here, we use extensive taxon and gene sampling to reconstruct the phylogeny of the volvocine algae. Our results support that the colonial volvocine algae are not a monophyletic group and that multicellularity evolved independently at least twice in the volvocine algae, once in Tetrabaenaceae and once in the Goniaceae + Volvocaceae. The simulation analyses suggest that incomplete lineage sorting is a major factor for the tree topology discrepancy, which implies that the multispecies coalescent model better fits the data used in this study. The coalescent-based species tree supports that the Goniaceae is monophyletic and Crucicarteria is the earliest diverging lineage, followed by Hafniomonas and Radicarteria within the Volvocales. By considering the multiple uncertainties in divergence time estimation, the dating analyses indicate that the volvocine algae occurred during the Cryogenian to Ediacaran (696.6-551.1 Ma) and multicellularity in the volvocine algae originated in the Triassic to Jurassic. Our phylogeny and timeline provide an evolutionary framework for studying the evolution of key traits and the origin of multicellularity in the volvocine algae.
Affiliation(s)
- Xiaoya Ma
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xuan Shi
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Qiuping Wang
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Mengru Zhao
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Zhenhua Zhang
- College of Life Sciences, Nanjing Normal University, Nanjing, China
- Bojian Zhong
- College of Life Sciences, Nanjing Normal University, Nanjing, China
24
Pardo-De la Hoz CJ, Magain N, Piatkowski B, Cornet L, Dal Forno M, Carbone I, Miadlikowska J, Lutzoni F. Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-Supported Phylogenomic Trees of Nostocalean Cyanobacteria. Syst Biol 2023; 72:694-712. [PMID: 36827095 DOI: 10.1093/sysbio/syad008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 02/12/2023] [Accepted: 02/22/2023] [Indexed: 02/25/2023]
Abstract
Prokaryotic genomes are often considered to be mosaics of genes that do not necessarily share the same evolutionary history due to widespread horizontal gene transfers (HGTs). Consequently, representing evolutionary relationships of prokaryotes as bifurcating trees has long been controversial. However, studies reporting conflicts among gene trees derived from phylogenomic data sets have shown that these conflicts can be the result of artifacts or evolutionary processes other than HGT, such as incomplete lineage sorting, low phylogenetic signal, and systematic errors due to substitution model misspecification. Here, we present the results of an extensive exploration of phylogenetic conflicts in the cyanobacterial order Nostocales, for which previous studies have inferred strongly supported conflicting relationships when using different concatenated phylogenomic data sets. We found that most of these conflicts are concentrated in deep clusters of short internodes of the Nostocales phylogeny, where the great majority of individual genes have low resolving power. We then inferred phylogenetic networks to detect HGT events while also accounting for incomplete lineage sorting. Our results indicate that most conflicts among gene trees are likely due to incomplete lineage sorting linked to an ancient rapid radiation, rather than to HGTs. Moreover, the short internodes of this radiation fit the expectations of the anomaly zone, i.e., a region of the tree parameter space where a species tree is discordant with its most likely gene tree. We demonstrated that concatenation of different sets of loci can recover up to 17 distinct and well-supported relationships within the putative anomaly zone of Nostocales, corresponding to the observed conflicts among well-supported trees based on concatenated data sets from previous studies. 
Our findings highlight the important role of rapid radiations as a potential cause of strongly conflicting phylogenetic relationships when using phylogenomic data sets of bacteria. We propose that polytomies may be the most appropriate phylogenetic representation of these rapid radiations that are part of anomaly zones, especially when all possible genomic markers have been considered to infer these phylogenies. [Anomaly zone; bacteria; horizontal gene transfer; incomplete lineage sorting; Nostocales; phylogenomic conflict; rapid radiation; Rhizonema.].
Affiliation(s)
- Nicolas Magain
- Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
- Bryan Piatkowski
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- Luc Cornet
- Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
- BCCM/IHEM, Mycology and Aerobiology, Sciensano, Brussels, Belgium
- Ignazio Carbone
- Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27606, USA
25
Nuñez LP, Gray LN, Weisrock DW, Burbrink FT. The Phylogenomic and Biogeographic History of the Gartersnakes, Watersnakes, and Allies (Natricidae: Thamnophiini). Mol Phylogenet Evol 2023:107844. [PMID: 37301486 DOI: 10.1016/j.ympev.2023.107844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 06/01/2023] [Accepted: 06/03/2023] [Indexed: 06/12/2023]
Abstract
North American Thamnophiini (gartersnakes, watersnakes, brownsnakes, and swampsnakes) are an ecologically and phenotypically diverse temperate clade of snakes representing 61 species across 10 genera. In this study, we estimate phylogenetic trees using ∼3,700 ultraconserved elements (UCEs) for 76 specimens representing 75% of all Thamnophiini species. We infer phylogenies using multispecies coalescent methods and time calibrate them using the fossil record. We also conducted ancestral area estimation to identify how major biogeographic boundaries in North America affect broadscale diversification in the group. While most nodes exhibited strong statistical support, analysis of concordant data across gene trees reveals substantial heterogeneity. Ancestral area estimation demonstrated that the genus Thamnophis was the only taxon in this subfamily to cross the Western Continental Divide, even as other taxa dispersed southward toward the tropics. Additionally, levels of gene tree discordance are overall higher in transition zones between bioregions, including the Rocky Mountains. Therefore, the Western Continental Divide may be a significant transition zone structuring the diversification of Thamnophiini during the Neogene and Pleistocene. Here we show that despite high levels of discordance across gene trees, we were able to infer a highly resolved and well-supported phylogeny for Thamnophiini, which allows us to understand broadscale patterns of diversity and biogeography.
Affiliation(s)
- Leroy P Nuñez
- Department of Herpetology, American Museum of Natural History, New York, NY, USA; Richard Gilder Graduate School, American Museum of Natural History, New York, NY, USA
- Levi N Gray
- Fort Collins Science Center, United States Geological Survey, Guam, USA
- David W Weisrock
- Department of Biology, University of Kentucky, Lexington, KY, USA
- Frank T Burbrink
- Department of Herpetology, American Museum of Natural History, New York, NY, USA
26
Choi S, Hauber ME, Legendre LJ, Kim NH, Lee YN, Varricchio DJ. Microstructural and crystallographic evolution of palaeognath (Aves) eggshells. eLife 2023; 12:e81092. [PMID: 36719067 PMCID: PMC9889092 DOI: 10.7554/elife.81092] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/15/2022] [Accepted: 12/11/2022] [Indexed: 02/01/2023]
Abstract
The avian palaeognath phylogeny has recently been revised significantly due to the advancement of genome-wide comparative analyses and provides the opportunity to trace the evolution of the microstructure and crystallography of modern dinosaur eggshells. Here, eggshells of all major clades of Palaeognathae (including extinct taxa) and selected eggshells of Neognathae and non-avian dinosaurs are analysed with electron backscatter diffraction. Our results show the detailed microstructures and crystallographies of (previously) loosely categorized ostrich-, rhea-, and tinamou-style morphotypes of palaeognath eggshells. All rhea-style eggshell appears homologous, while respective ostrich-style and tinamou-style morphotypes are best interpreted as homoplastic morphologies (independently acquired). Ancestral state reconstruction and parsimony analysis additionally show that rhea-style eggshell represents the ancestral state of palaeognath eggshells both in microstructure and crystallography. The ornithological and palaeontological implications of the current study are not only helpful for understanding the evolution of modern and extinct dinosaur eggshells, but also aid other disciplines where palaeognath eggshells provide a useful archive for comparative contrasts (e.g. palaeoenvironmental reconstructions, geochronology, and zooarchaeology).
Affiliation(s)
- Seung Choi
- Department of Earth Sciences, Montana State University, Bozeman, United States
- Key Laboratory of Vertebrate Evolution and Human Origins of Chinese Academy of Sciences, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, China
- Mark E Hauber
- Department of Evolution, Ecology, and Behavior, School of Integrative Biology, University of Illinois Urbana-Champaign, Urbana, United States
- Lucas J Legendre
- Department of Geological Sciences, University of Texas at Austin, Austin, United States
- Noe-Heon Kim
- School of Earth and Environmental Sciences, Seoul National University, Seoul, Republic of Korea
- Department of Geosciences, Princeton University, Princeton, United States
- Yuong-Nam Lee
- School of Earth and Environmental Sciences, Seoul National University, Seoul, Republic of Korea
- David J Varricchio
- Department of Earth Sciences, Montana State University, Bozeman, United States
27
Ancient proteins resolve controversy over the identity of Genyornis eggshell. Proc Natl Acad Sci U S A 2022; 119:e2109326119. [PMID: 35609205 PMCID: PMC9995833 DOI: 10.1073/pnas.2109326119] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/28/2023]
Abstract
The realization that ancient biomolecules are preserved in "fossil" samples has revolutionized archaeological science. Protein sequences survive longer than DNA, but their phylogenetic resolution is inferior; therefore, careful assessment of the research questions is required. Here, we show the potential of ancient proteins preserved in Pleistocene eggshell in addressing a longstanding controversy in human and animal evolution: the identity of the extinct bird that laid large eggs which were exploited by Australia's indigenous people. The eggs had been originally attributed to the iconic extinct flightless bird Genyornis newtoni (†Dromornithidae, Galloanseres) and were subsequently dated to before 50 ± 5 ka by Miller et al. [Nat. Commun. 7, 10496 (2016)]. This was taken to represent the likely extinction date for this endemic megafaunal species and thus implied a role of humans in its demise. A contrasting hypothesis, according to which the eggs were laid by a large mound-builder megapode (Megapodiidae, Galliformes), would therefore acquit humans of their responsibility in the extinction of Genyornis. Ancient protein sequences were reconstructed and used to assess the evolutionary proximity of the undetermined eggshell to extant birds, rejecting the megapode hypothesis. Authentic ancient DNA could not be confirmed from these highly degraded samples, but morphometric data also support the attribution of the eggshell to Genyornis. When used in triangulation to address well-defined hypotheses, paleoproteomics is a powerful tool for reconstructing the evolutionary history in ancient samples. In addition to the clarification of phylogenetic placement, these data provide a more nuanced understanding of the modes of interactions between humans and their environment.
28
Maderspacher F. Flightless birds. Curr Biol 2022; 32:R1155-R1162. [DOI: 10.1016/j.cub.2022.09.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
29
Simmons MP, Maurin O, Bailey P, Brewer GE, Roy S, Lombardi JA, Forest F, Baker WJ. Benefits of alignment quality-control processing steps and an Angiosperms353 phylogenomics pipeline applied to the Celastrales. Cladistics 2022; 38:595-611. [PMID: 35569142 DOI: 10.1111/cla.12507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/24/2022] [Indexed: 01/31/2023]
Abstract
We examined the impact of successive alignment quality-control steps on downstream phylogenomic analyses. We applied a recently published phylogenomics pipeline that was developed for the Angiosperms353 target-sequence-capture probe set to the flowering plant order Celastrales. Our final dataset consists of 158 species, including at least one exemplar from all 109 currently recognized Celastrales genera. We performed nine quality-control steps and compared the inferred resolution, branch support, and topological congruence of the inferred gene and species trees with those generated after each of the first six steps. We describe and justify each of our quality-control steps, including manual masking, in detail so that they may be readily applied to other lineages. We found that highly supported clades could generally be relied upon even if stringent orthology and alignment quality-control measures had not been applied. But separate instances were identified, for both concatenation and coalescence, wherein a clade was highly supported before manual masking but then subsequently contradicted. These results are generally reassuring for broad-scale analyses that use phylogenomics pipelines, but also indicate that we cannot rely exclusively on these analyses to conclude how challenging phylogenetic problems are best resolved.
Affiliation(s)
- Mark P Simmons
- Department of Biology, Colorado State University, Fort Collins, Colorado, 80523-1878, USA
- Olivier Maurin
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
- Paul Bailey
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
- Grace E Brewer
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
- Shyamali Roy
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
- Julio A Lombardi
- Departamento de Botânica, Instituto de Biociências de Rio Claro, Universidade Estadual Paulista - UNESP, Av. 24-A 1515 - Bela Vista, Caixa Postal 199, São Paulo, Brazil
- Félix Forest
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
30
Flouri T, Huang J, Jiao X, Kapli P, Rannala B, Yang Z. Bayesian phylogenetic inference using relaxed-clocks and the multispecies coalescent. Mol Biol Evol 2022; 39:6652437. [PMID: 35907248 PMCID: PMC9366188 DOI: 10.1093/molbev/msac161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
The multispecies coalescent (MSC) model accommodates both species divergences and within-species coalescent and provides a natural framework for phylogenetic analysis of genomic data when the gene trees vary across the genome. The MSC model implemented in the program bpp assumes a molecular clock and the Jukes–Cantor model, and is suitable for analyzing genomic data from closely related species. Here we extend our implementation to more general substitution models and relaxed clocks to allow the rate to vary among species. The MSC-with-relaxed-clock model allows the estimation of species divergence times and ancestral population sizes using genomic sequences sampled from contemporary species when the strict clock assumption is violated, and provides a simulation framework for evaluating species tree estimation methods. We conducted simulations and analyzed two real datasets to evaluate the utility of the new models. We confirm that the clock-JC model is adequate for inference of shallow trees with closely related species, but it is important to account for clock violation for distant species. Our simulation suggests that there is valuable phylogenetic information in the gene-tree branch lengths even if the molecular clock assumption is seriously violated, and the relaxed-clock models implemented in bpp are able to extract such information. Our Markov chain Monte Carlo algorithms suffer from mixing problems when used for species tree estimation under the relaxed clock and we discuss possible improvements. We conclude that the new models are currently most effective for estimating population parameters such as species divergence times when the species tree is fixed.
Affiliation(s)
- Tomáš Flouri
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Jun Huang
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK; School of Biomedical Engineering, Capital Medical University, Beijing, 100069, China
- Xiyun Jiao
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK; Department of Statistics and Data Science, China Southern University of Science and Technology, Shenzhen, Guangdong 518055, China
- Paschalia Kapli
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Bruce Rannala
- Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Ziheng Yang
- Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK
31
Mehta RS, Steel M, Rosenberg NA. The Probability of Joint Monophyly of Samples of Gene Lineages for All Species in an Arbitrary Species Tree. J Comput Biol 2022; 29:679-703. [PMID: 35544237 DOI: 10.1089/cmb.2021.0647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Monophyly is a feature of a set of genetic lineages in which every lineage in the set is more closely related to all other members of the set than it is to any lineage outside the set. Multiple sets of lineages that are separately monophyletic are said to be reciprocally monophyletic, or jointly monophyletic. The prevalence of reciprocal monophyly, or joint monophyly (JM), has been used to evaluate phylogenetic and phylogeographic hypotheses, as well as to delimit species. These applications often make use of a probability of JM under models of gene lineage evolution. Studies in coalescent theory have computed this JM probability for small numbers of separate groups in arbitrary species trees and for arbitrary numbers of separate groups in trivial species trees. In this study, generalizing existing results on monophyly probabilities under the multispecies coalescent, we derive the probability of JM for arbitrary numbers of separate groups in arbitrary species trees. We illustrate how our result collapses to previously examined cases. We also study the effect of tree height, sample size, and number of species on the probability of JM. We obtain relatively simple lower and upper bounds on the JM probability. Our results expand the scope of JM calculations beyond small numbers of species, subsuming past formulas that have been used in simpler cases.
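The definition of joint monophyly given in the abstract above can be made concrete with a small sketch that checks whether several groups of lineages each form a clade in one rooted gene tree. It does not compute the JM probability derived in the paper. The nested-tuple tree encoding and the function names are assumptions made for illustration only.

```python
def leaf_sets(tree, out):
    """Collect the leaf set of every subtree of a nested-tuple rooted tree."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.add(leaves)
    return leaves


def jointly_monophyletic(tree, groups):
    """True if every group of lineages forms its own clade in the gene tree."""
    clades = set()
    leaf_sets(tree, clades)
    return all(frozenset(group) in clades for group in groups)


# Lineages a1, a2 sampled from species A; b1, b2 from species B; c1 from species C.
groups = [{"a1", "a2"}, {"b1", "b2"}, {"c1"}]
sorted_tree = ((("a1", "a2"), ("b1", "b2")), "c1")
unsorted_tree = ((("a1", "b1"), "a2"), ("b2", "c1"))

print(jointly_monophyletic(sorted_tree, groups))
print(jointly_monophyletic(unsorted_tree, groups))
```

In the second tree, lineage a1 groups with b1 before a2, so species A is not monophyletic and the groups are not jointly monophyletic.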
Affiliation(s)
- Rohan S Mehta
- Department of Physics, Emory University, Atlanta, Georgia, USA
- Mike Steel
- Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand
- Noah A Rosenberg
- Department of Biology, Stanford University, Stanford, California, USA
32
Pozzi L, Penna A. Rocks and clocks revised: New promises and challenges in dating the primate tree of life. Evol Anthropol 2022; 31:138-153. [PMID: 35102633 DOI: 10.1002/evan.21940] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2020] [Revised: 10/04/2021] [Accepted: 01/12/2022] [Indexed: 01/14/2023]
Abstract
In recent years, multiple technological and methodological advances have increased our ability to estimate phylogenies, leading to more accurate dating of the primate tree of life. Here we provide an overview of the limitations and potentials of some of these advancements and discuss how dated phylogenies provide the crucial temporal scale required to understand primate evolution. First, we review new methods, such as the total-evidence dating approach, that promise a better integration between the fossil record and molecular data. We then explore how the ever-increasing availability of genomic-level data for more primate species can impact our ability to accurately estimate timetrees. Finally, we discuss more recent applications of mutation rates to date divergence times. We highlight example studies that have applied these approaches to estimate divergence dates within primates. Our goal is to provide a critical overview of these new developments and explore the promises and challenges of their application in evolutionary anthropology.
Affiliation(s)
- Luca Pozzi: Department of Anthropology, The University of Texas at San Antonio, San Antonio, Texas, USA
- Anna Penna: Department of Anthropology, The University of Texas at San Antonio, San Antonio, Texas, USA
|
33
|
Lum D, Rheindt FE, Chisholm RA. Tracking scientific discovery of avian phylogenetic diversity over 250 years. Proc Biol Sci 2022; 289:20220088. [PMID: 35440208 PMCID: PMC9019523 DOI: 10.1098/rspb.2022.0088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
Estimating the total number of species on Earth has been a longstanding pursuit. Models project anywhere between 2 and 10 million species, and discovery of new species continues to the present day. Despite this, we hypothesized that our current knowledge of phylogenetic diversity (PD) may be almost complete because new discoveries may be less phylogenetically distinct than past discoveries. Focusing on birds, which are well studied, we generated a robust phylogenetic tree for most extant species by combining existing published trees and calculated each discovery's marginal contribution to known PD since the first formal species descriptions in 1758. We found that PD contributions began to plateau in the early 1900s, about half a century earlier than species richness. Relative contributions of each phylogenetic order to known PD shifted over the first 150 years, with a growing contribution of the hyper-diverse perching birds (Passeriformes) in particular, but after the early 1900s this has remained relatively stable. Altogether, this suggests that our knowledge of the evolutionary history of extant birds is mostly complete, with few discoveries of high evolutionary novelty left to be made, and that conclusions of studies using avian phylogenies are likely to be robust to future species discoveries.
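The idea of each discovery's marginal contribution to known PD can be illustrated with a short calculation. This is a minimal sketch, assuming PD is measured in the rooted sense (total length of the branches connecting the known species to the root) and assuming the tree is stored as a hypothetical `parent` dictionary mapping each node to its parent and branch length. The abstract does not specify the exact PD definition or data structures the authors used, so both are assumptions here, and the toy tree is invented for illustration.

```python
def root_path_edges(parent, leaf):
    """Edges (named by their child node) on the path from a leaf to the root."""
    edges = set()
    node = leaf
    while node in parent:
        edges.add(node)
        node = parent[node][0]
    return edges


def marginal_pd(parent, discovery_order):
    """PD added by each species, in the order the species were described."""
    covered = set()
    contributions = []
    for species in discovery_order:
        new_edges = root_path_edges(parent, species) - covered
        contributions.append(sum(parent[edge][1] for edge in new_edges))
        covered |= new_edges
    return contributions


# Hypothetical tree: node -> (parent node, branch length).
parent = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}

print(marginal_pd(parent, ["A", "B", "C"]))
```

Species `B` adds little PD because its close relative `A` was already known, which is the kind of diminishing contribution the abstract describes for later discoveries.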
Affiliation(s)
- Deon Lum: Department of Earth and Environmental Sciences, University of Manchester, Oxford Road, Manchester M13 9PT, UK; Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore 117558, Singapore
- Frank E Rheindt: Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore 117558, Singapore
- Ryan A Chisholm: Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore 117558, Singapore
|
34
|
Hou Z, Ma X, Shi X, Li X, Yang L, Xiao S, De Clerck O, Leliaert F, Zhong B. Phylotranscriptomic insights into a Mesoproterozoic-Neoproterozoic origin and early radiation of green seaweeds (Ulvophyceae). Nat Commun 2022; 13:1610. [PMID: 35318329 PMCID: PMC8941102 DOI: 10.1038/s41467-022-29282-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 07/28/2021] [Accepted: 03/09/2022] [Indexed: 01/09/2023] Open
Abstract
The Ulvophyceae, a major group of green algae, is of particular evolutionary interest because of its remarkable morphological and ecological diversity. Its phylogenetic relationships and diversification timeline, however, are still not fully resolved. In this study, using an extensive nuclear gene dataset, we apply coalescent- and concatenation-based approaches to reconstruct the phylogeny of the Ulvophyceae and to explore the sources of conflict in previous phylogenomic studies. The Ulvophyceae is recovered as a paraphyletic group, with the Bryopsidales being a sister group to the Chlorophyceae, and the remaining taxa forming a clade (Ulvophyceae sensu stricto). Molecular clock analyses with different calibration strategies emphasize the large impact of fossil calibrations, and indicate a Meso-Neoproterozoic origin of the Ulvophyceae (sensu stricto), earlier than previous estimates. The results imply that ulvophyceans may have had a profound influence on oceanic redox structures and global biogeochemical cycles at the Mesoproterozoic-Neoproterozoic transition. “Ulvophyceae is a remarkably morphologically and ecologically diverse clade of green algae. Here, the authors reconstruct the Ulvophyceae phylogeny, showing that these algae originated earlier than expected and may have influenced biogeochemical cycles at the Mesoproterozoic-Neoproterozoic transition.”
Affiliation(s)
- Zheng Hou: College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xiaoya Ma: College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xuan Shi: College of Life Sciences, Nanjing Normal University, Nanjing, China
- Xi Li: College of Life Sciences, Nanjing Normal University, Nanjing, China
- Lingxiao Yang: College of Life Sciences, Nanjing Normal University, Nanjing, China
- Shuhai Xiao: Department of Geosciences and Global Change Center, Virginia Tech, Blacksburg, VA, USA
- Olivier De Clerck: Phycology Research Group and Center for Molecular Phylogenetics and Evolution, Ghent University, Ghent, Belgium
- Frederik Leliaert: Phycology Research Group and Center for Molecular Phylogenetics and Evolution, Ghent University, Ghent, Belgium; Meise Botanic Garden, Meise, Belgium
- Bojian Zhong: College of Life Sciences, Nanjing Normal University, Nanjing, China
|
35
|
Mayr G. A survey of the uncinate bone and other poorly known ossicles associated with the lacrimal/ectethmoid complex of the avian skull. Anat Rec (Hoboken) 2022; 305:2312-2330. [PMID: 35068074 DOI: 10.1002/ar.24869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Revised: 12/04/2021] [Accepted: 12/07/2021] [Indexed: 11/06/2022]
Abstract
In several taxa of Neornithes (crown group birds), the lacrimal/ectethmoid complex exhibits small bones, the comparative osteology of which is poorly studied. Some of these ossicles - which are commonly known as uncinate bones (ossa uncinata or ossa lacrimopalatina) - were already described two centuries ago, but knowledge of their distribution and morphological variability in higher-level clades is incomplete. In the present study, a detailed survey of the occurrence of uncinate bones is given, and these ossicles are for the first time reported in the gruiform Psophiidae, some Rallidae, and in the Otidiformes. Their presence in the latter taxon is of particular interest, because current molecular analyses recover the Otidiformes as close relatives of the Musophagiformes, in which the uncinate bone is particularly large. The uncinate bones of most other neornithine clades, however, appear to have evolved multiple times independently through parallel evolution from the same ligamentous structures. A few earlier authors assumed that the uncinate bone is homologous to the ectopterygoid of non-avian theropods. Although this remains a viable hypothesis, more data on the occurrence of the ectopterygoid in Mesozoic birds are needed for well-supported conclusions. Here it is noted that the ontogenetic development of the uncinate bone appears to be correlated with that of the ectethmoid, which is another bone in the skull of neornithine birds that is of unknown origin.
Affiliation(s)
- Gerald Mayr: Ornithological Section, Senckenberg Research Institute and Natural History Museum Frankfurt, Senckenberganlage 25, Frankfurt am Main, Germany
|
36
|
Jiao X, Flouri T, Yang Z. Multispecies coalescent and its applications to infer species phylogenies and cross-species gene flow. Natl Sci Rev 2022; 8:nwab127. [PMID: 34987842 PMCID: PMC8692950 DOI: 10.1093/nsr/nwab127] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/02/2021] [Revised: 07/10/2021] [Accepted: 07/11/2021] [Indexed: 02/06/2023] Open
Abstract
Multispecies coalescent (MSC) is the extension of the single-population coalescent model to multiple species. It integrates the phylogenetic process of species divergences and the population genetic process of coalescent, and provides a powerful framework for a number of inference problems using genomic sequence data from multiple species, including estimation of species divergence times and population sizes, estimation of species trees accommodating discordant gene trees, inference of cross-species gene flow and species delimitation. In this review, we introduce the major features of the MSC model, discuss full-likelihood and heuristic methods of species tree estimation and summarize recent methodological advances in inference of cross-species gene flow. We discuss the statistical and computational challenges in the field and research directions where breakthroughs may be likely in the next few years.
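How the MSC produces discordant gene trees can be shown with the standard three-species case. The sketch below assumes a species tree ((A,B),C), one sampled lineage per species, and an internal branch of length `t` in coalescent units; under those assumptions the gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The function name is hypothetical, and the code is an illustration of this textbook result rather than anything taken from the review.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    `t` is the internal branch length in coalescent units.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability that A and B fail to coalesce on the internal branch,
    # split evenly over the three possible topologies in the root population.
    discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


for t in (0.0, 0.5, 2.0):
    probs = three_taxon_gene_tree_probs(t)
    print(t, {topology: round(p, 4) for topology, p in probs.items()})
```

As `t` shrinks toward zero the three topologies approach equal probability, which is why short internal branches produce extensive gene tree discordance.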
Affiliation(s)
- Xiyun Jiao: Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Tomáš Flouri: Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Ziheng Yang: Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
|
37
|
Singhal S, Derryberry GE, Bravo GA, Derryberry EP, Brumfield RT, Harvey MG. The dynamics of introgression across an avian radiation. Evol Lett 2021; 5:568-581. [PMID: 34917397 PMCID: PMC8645201 DOI: 10.1002/evl3.256] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/06/2021] [Revised: 07/11/2021] [Accepted: 08/31/2021] [Indexed: 01/20/2023] Open
Abstract
Hybridization and resulting introgression can play both a destructive and a creative role in the evolution of diversity. Thus, characterizing when and where introgression is most likely to occur can help us understand the causes of diversification dynamics. Here, we examine the prevalence of and variation in introgression using phylogenomic data from a large (1300+ species), geographically widespread avian group, the suboscine birds. We first examine patterns of gene tree discordance across the geographic distribution of the entire clade. We then evaluate the signal of introgression in a subset of 206 species triads using Patterson's D‐statistic and test for associations between introgression signal and evolutionary, geographic, and environmental variables. We find that gene tree discordance varies across lineages and geographic regions. The signal of introgression is highest in cases where species occur in close geographic proximity and in regions with more dynamic climates since the Pleistocene. Our results highlight the potential of phylogenomic datasets for examining broad patterns of hybridization and suggest that the degree of introgression between diverging lineages might be predictable based on the setting in which they occur.
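Patterson's D-statistic, mentioned in this abstract, compares counts of two site patterns in a species triad plus outgroup: D = (ABBA - BABA) / (ABBA + BABA), where A is the outgroup allele and B the derived allele. The following is a minimal sketch, assuming four pre-aligned sequences of equal length labelled P1, P2, P3, and outgroup; the function name and the toy sequences are hypothetical. It omits the significance testing (for example, a block jackknife) that an empirical analysis would normally include, and it is not the authors' pipeline.

```python
def patterson_d(p1, p2, p3, outgroup):
    """Patterson's D from four aligned sequences (P1, P2, P3, outgroup)."""
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (abba - baba) / (abba + baba)


# Toy alignment: three ABBA sites, one BABA site, one invariant site.
p1 = "AGACA"
p2 = "TCGCT"
p3 = "TGGCT"
og = "ACACA"

print(patterson_d(p1, p2, p3, og))
```

A positive value indicates an excess of allele sharing between P2 and P3, a negative value an excess between P1 and P3; values near zero are expected when discordance arises from incomplete lineage sorting alone.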
Affiliation(s)
- Sonal Singhal: Department of Biology, California State University, Dominguez Hills, Carson, California 90747
- Graham E Derryberry: Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee 37996
- Gustavo A Bravo: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138; Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138
- Elizabeth P Derryberry: Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee 37996
- Robb T Brumfield: Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana 70803; Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803
- Michael G Harvey: Department of Biological Sciences, The University of Texas at El Paso, El Paso, Texas 79968; Biodiversity Collections, The University of Texas at El Paso, El Paso, Texas 79968
|
38
|
Galbraith JD, Kortschak RD, Suh A, Adelson DL. Genome Stability Is in the Eye of the Beholder: CR1 Retrotransposon Activity Varies Significantly across Avian Diversity. Genome Biol Evol 2021; 13:6433158. [PMID: 34894225 PMCID: PMC8665684 DOI: 10.1093/gbe/evab259] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 11/12/2021] [Indexed: 12/20/2022] Open
Abstract
Since the sequencing of the zebra finch genome it has become clear that avian genomes, while largely stable in terms of chromosome number and gene synteny, are more dynamic at an intrachromosomal level. A multitude of intrachromosomal rearrangements and significant variation in transposable element (TE) content have been noted across the avian tree. TEs are a source of genome plasticity, because their high similarity enables chromosomal rearrangements through nonallelic homologous recombination, and they have potential for exaptation as regulatory and coding sequences. Previous studies have investigated the activity of the dominant TE in birds, chicken repeat 1 (CR1) retrotransposons, either focusing on their expansion within single orders, or comparing passerines with nonpasserines. Here, we comprehensively investigate and compare the activity of CR1 expansion across orders of birds, finding levels of CR1 activity vary significantly both between and within orders. We describe high levels of TE expansion in genera which have speciated in the last 10 Myr including kiwis, geese, and Amazon parrots; low levels of TE expansion in songbirds across their diversification, and near inactivity of TEs in the cassowary and emu for millions of years. CR1s have remained active over long periods of time across most orders of neognaths, with activity at any one time dominated by one or two families of CR1s. Our findings of higher TE activity in species-rich clades and dominant families of TEs within lineages mirror past findings in mammals and indicate that genome evolution in amniotes relies on universal TE-driven processes.
Affiliation(s)
- James D Galbraith: School of Biological Sciences, The University of Adelaide, South Australia, Australia
- Alexander Suh: School of Biological Sciences, University of East Anglia, Norwich, United Kingdom; Department of Organismal Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, Sweden
- David L Adelson: School of Biological Sciences, The University of Adelaide, South Australia, Australia
|
39
|
Simmons MP, Springer MS, Gatesy J. Gene-tree misrooting drives conflicts in phylogenomic coalescent analyses of palaeognath birds. Mol Phylogenet Evol 2021; 167:107344. [PMID: 34748873 DOI: 10.1016/j.ympev.2021.107344] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/13/2021] [Revised: 10/08/2021] [Accepted: 11/02/2021] [Indexed: 10/19/2022]
Abstract
Phylogenomic analyses of ancient rapid radiations can produce conflicting results that are driven by differential sampling of taxa and characters as well as the limitations of alternative analytical methods. We re-examine basal relationships of palaeognath birds (ratites and tinamous) using recently published datasets of nucleotide characters from 20,850 loci as well as 4301 retroelement insertions. The original studies attributed conflicting resolutions of rheas in their inferred coalescent and concatenation trees to concatenation failing in the anomaly zone. By contrast, we find that the coalescent-based resolution of rheas is premised upon extensive gene-tree estimation errors. Furthermore, retroelement insertions contain much more conflict than originally reported and multiple insertion loci support the basal position of rheas found in concatenation trees, while none were reported in the original publication. We demonstrate how even remarkable congruence in phylogenomic studies may be driven by long-branch misplacement of a divergent outgroup, highly incongruent gene trees, differential taxon sampling that can result in gene-tree misrooting errors that bias species-tree inference, and gross homology errors. What was previously interpreted as broad, robustly supported corroboration for a single resolution in coalescent analyses may instead indicate a common bias that taints phylogenomic results across multiple genome-scale datasets. The updated retroelement dataset now supports a species tree with branch lengths that suggest an ancient anomaly zone, and both concatenation and coalescent analyses of the huge nucleotide datasets fail to yield coherent, reliable results in this challenging phylogenetic context.
Affiliation(s)
- Mark P Simmons: Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
- Mark S Springer: Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA 92521, USA
- John Gatesy: Division of Vertebrate Zoology and Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
|
40
|
Bravo GA, Schmitt CJ, Edwards SV. What Have We Learned from the First 500 Avian Genomes? Annual Review of Ecology, Evolution, and Systematics 2021. [DOI: 10.1146/annurev-ecolsys-012121-085928] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/14/2022]
Abstract
The increased capacity of DNA sequencing has significantly advanced our understanding of the phylogeny of birds and the proximate and ultimate mechanisms molding their genomic diversity. In less than a decade, the number of available avian reference genomes has increased to over 500—approximately 5% of bird diversity—placing birds in a privileged position to advance the fields of phylogenomics and comparative, functional, and population genomics. Whole-genome sequence data, as well as indels and rare genomic changes, are further resolving the avian tree of life. The accumulation of bird genomes, increasingly with long-read sequence data, greatly improves the resolution of genomic features such as germline-restricted chromosomes and the W chromosome, and is facilitating the comparative integration of genotypes and phenotypes. Community-based initiatives such as the Bird 10,000 Genomes Project and Vertebrate Genome Project are playing a fundamental role in amplifying and coalescing a vibrant international program in avian comparative genomics.
Affiliation(s)
- Gustavo A. Bravo: Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
- C. Jonathan Schmitt: Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
- Scott V. Edwards: Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
|
41
|
Davis SN, Clarke JA. Estimating the distribution of carotenoid coloration in skin and integumentary structures of birds and extinct dinosaurs. Evolution 2021; 76:42-57. [PMID: 34719783 DOI: 10.1111/evo.14393] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/23/2021] [Revised: 09/22/2021] [Accepted: 09/24/2021] [Indexed: 11/27/2022]
Abstract
Carotenoids are pigments responsible for most bright yellow, red, and orange hues in birds. Their distribution has been investigated in avian plumage, but the evolution of their expression in skin and other integumentary structures has not been approached in detail. Here, we investigate the expression of carotenoid-consistent coloration across tissue types in all extant, nonpasserine species (n = 4022) and archelosaur outgroups in a phylogenetic framework. We collect dietary data for a subset of birds and investigate how dietary carotenoid intake may relate to carotenoid expression in various tissues. We find that carotenoid-consistent expression in skin or nonplumage keratin has a 50% probability of being present in the most recent common ancestor of Archosauria. Skin expression has a similar probability at the base of the avian crown clade, but plumage expression is unambiguously absent in that ancestor and shows hundreds of independent gains within nonpasserine neognaths, consistent with previous studies. Although our data do not support a strict sequence of tissue expression in nonpasserine birds, we find support that expression of carotenoid-consistent color in nonplumage integument structures might evolve in a correlated manner and feathers are rarely the only region of expression. Taxa with diets high in carotenoid content also show expression in more body regions and tissue types. Our results may inform targeted assays for carotenoids in tissues other than feathers, and expectations of these pigments in nonavian dinosaurs. In extinct groups, bare-skin regions and the rhamphotheca, especially in species with diets rich in plants, may express these pigments, which are not expected in feathers or feather homologues.
Affiliation(s)
- Sarah N Davis: Department of Geological Sciences, Jackson School of Geosciences, The University of Texas at Austin, Austin, Texas, 78712
- Julia A Clarke: Department of Geological Sciences, Jackson School of Geosciences, The University of Texas at Austin, Austin, Texas, 78712; Department of Integrative Biology, The University of Texas at Austin, Austin, Texas, 78712
|
42
|
Molloy EK, Gatesy J, Springer MS. Theoretical and practical considerations when using retroelement insertions to estimate species trees in the anomaly zone. Syst Biol 2021; 71:721-740. [PMID: 34677617 DOI: 10.1093/sysbio/syab086] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/02/2020] [Accepted: 10/11/2021] [Indexed: 11/13/2022] Open
Abstract
A potential shortcoming of concatenation methods for species tree estimation is their failure to account for incomplete lineage sorting. Coalescent methods address this problem but make various assumptions that, if violated, can result in worse performance than concatenation. Given the challenges of analyzing DNA sequences with both concatenation and coalescent methods, retroelement insertions (RIs) have emerged as powerful phylogenomic markers for species tree estimation. Here, we show that two recently proposed quartet-based methods, SDPquartets and ASTRAL_BP, are statistically consistent estimators of the unrooted species tree topology under the coalescent when RIs follow a neutral infinite-sites model of mutation and the expected number of new RIs per generation is constant across the species tree. The accuracy of these (and other) methods for inferring species trees from RIs has yet to be assessed on simulated data sets, where the true species tree topology is known. Therefore, we evaluated eight methods given RIs simulated from four model species trees, all of which have short branches and at least three of which are in the anomaly zone. In our simulation study, ASTRAL_BP and SDPquartets always recovered the correct species tree topology when given a sufficiently large number of RIs, as predicted. A distance-based method (ASTRID_BP) and Dollo parsimony also performed well in recovering the species tree topology. In contrast, unordered, polymorphism, and Camin-Sokal parsimony typically fail to recover the correct species tree topology in anomaly zone situations with more than four ingroup taxa. Of the methods studied, only ASTRAL_BP automatically estimates internal branch lengths (in coalescent units) and support values (i.e. local posterior probabilities). We examined the accuracy of branch length estimation, finding that estimated lengths were accurate for short branches but upwardly biased otherwise. 
This led us to derive the maximum likelihood (branch length) estimate for when RIs are given as input instead of binary gene trees; this corrected formula produced accurate estimates of branch lengths in our simulation study, provided that a sufficiently large number of RIs were given as input. Lastly, we evaluated the impact of data quantity on species tree estimation by repeating the above experiments with input sizes varying from 100 to 100 000 parsimony-informative RIs. We found that, when given just 1 000 parsimony-informative RIs as input, ASTRAL_BP successfully reconstructed major clades (i.e., clades separated by branches > 0.3 CUs) with high support and identified rapid radiations (i.e., shorter connected branches), although not their precise branching order. The local posterior probability was effective for controlling false positive branches in these scenarios.
Affiliation(s)
- Erin K Molloy: Department of Computer Science, University of Maryland, College Park, College Park, 20742, USA
- John Gatesy: Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, 10024, USA
- Mark S Springer: Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, Riverside, 92521, USA
|
43
|
Almeida FC, Porzecanski AL, Cracraft JL, Bertelli S. The evolution of tinamous (Palaeognathae: Tinamidae) in light of molecular and combined analyses. Zool J Linn Soc 2021. [DOI: 10.1093/zoolinnean/zlab080] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Abstract
The Neotropical tinamous are of particular interest in our efforts to understand the evolution of modern birds. They inhabit both forested and open environments and, although volant, have limited flight capabilities. Numerous studies have recognized the monophyly of tinamous and their relationships either as sister to the flightless ratites (ostriches, emus and their relatives) or within the ratites themselves. Despite the numerous bird phylogenies published recently, modern investigations of relationships within the tinamous themselves have been limited. Here, we present the first detailed phylogenetic analysis and divergence-date estimation including a significant number of tinamou species, both extant and fossil. The monophyly of most currently recognized polytypic genera is recovered with high support, with the exception of the paraphyletic Nothura and Nothoprocta. The traditional subdivision between those tinamous inhabiting open areas (Nothurinae) and forest environments (Tinaminae) is also confirmed. A temporal calibration of the resultant phylogeny estimates that the basal divergence of crown Tinamidae took place between 31 and 40 Mya.
Affiliation(s)
- Francisca C Almeida: Instituto de Ecología, Genética y Evolución (IEGEBA), Consejo Nacional de Investigaciones Científicas y Tecnológicas (CONICET)/ Universidad de Buenos Aires (UBA), Ciudad Autónoma de Buenos Aires, Argentina
- Ana L Porzecanski: American Museum of Natural History, 200 Central Park West, New York, NY, 10024-5102, USA
- Joel L Cracraft: American Museum of Natural History, 200 Central Park West, New York, NY, 10024-5102, USA
- Sara Bertelli: American Museum of Natural History, 200 Central Park West, New York, NY, 10024-5102, USA; Fundación Miguel Lillo (FML), Miguel Lillo 251, 4000 San Miguel de Tucumán, Argentina; Unidad Ejecutora Lillo (UEL) - Consejo Nacional de Investigaciones Científicas y Tecnológicas (CONICET), San Miguel de Tucumán, Tucumán, Argentina
|
44
|
Forthman M, Braun EL, Kimball RT. Gene tree quality affects empirical coalescent branch length estimation. Zool Scr 2021. [DOI: 10.1111/zsc.12512] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Affiliation(s)
- Michael Forthman: Department of Entomology & Nematology, University of Florida, Gainesville, FL, USA; California State Collection of Arthropods, Plant Pest Diagnostics Branch, California Department of Food & Agriculture, Sacramento, CA, USA
- Edward L. Braun: Department of Biology, University of Florida, Gainesville, FL, USA
|
45
|
Vázquez-Miranda H, Barker FK. Autosomal, sex-linked and mitochondrial loci resolve evolutionary relationships among wrens in the genus Campylorhynchus. Mol Phylogenet Evol 2021; 163:107242. [PMID: 34224849 DOI: 10.1016/j.ympev.2021.107242] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/27/2020] [Revised: 06/14/2021] [Accepted: 06/29/2021] [Indexed: 01/18/2023]
Abstract
Although there is general consensus that sampling of multiple genetic loci is critical in accurate reconstruction of species trees, the exact numbers and the best types of molecular markers remain an open question. In particular, the phylogenetic utility of sex-linked loci is underexplored. Here, we sample all species and 70% of the named diversity of the New World wren genus Campylorhynchus using sequences from 23 loci, to evaluate the effects of linkage on efficiency in recovering a well-supported tree for the group. At a tree-wide level, we found that most loci supported fewer than half the possible clades and that sex-linked loci produced similar resolution to slower-coalescing autosomal markers, controlling for locus length. By contrast, we did find evidence that linkage affected the efficiency of recovery of individual relationships; as few as two sex-linked loci were necessary to resolve a selection of clades with long to medium subtending branches, whereas 4-6 autosomal loci were necessary to achieve comparable results. These results support an expanded role for sampling of the avian Z chromosome in phylogenetic studies, including target enrichment approaches. Our concatenated and species tree analyses represent significant improvements in our understanding of diversification in Campylorhynchus, and suggest a relatively complex scenario for its radiation across the Miocene/Pliocene boundary, with multiple invasions of South America.
Affiliation(s)
- Hernán Vázquez-Miranda: Departamento de Zoología, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad de México C.P. 04510, Mexico
- F Keith Barker: Department of Ecology, Evolution and Behavior, Bell Museum of Natural History, University of Minnesota, 40 Gortner Laboratory, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
|
46
|
Kim A, Degnan JH. Heuristics for unrooted, unranked, and ranked anomaly zones under birth-death models. Mol Phylogenet Evol 2021; 161:107162. [PMID: 33831548 DOI: 10.1016/j.ympev.2021.107162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2019] [Revised: 10/21/2020] [Accepted: 03/23/2021] [Indexed: 10/21/2022]
Abstract
Species trees that can generate a nonmatching gene tree topology that is more probable than the topology matching the species tree are said to be in an anomaly zone. We introduce some heuristic approaches to infer whether species trees are in anomaly zones when it is difficult or impossible to compute the entire distribution of gene tree topologies. Here, probabilities of unrooted, unranked, and ranked gene tree topologies under the multispecies coalescent are used. A ranked tree can be viewed as an unranked tree with a temporal ordering of its internal nodes. Overall, considering probabilities of unrooted or unranked gene tree topologies within one nearest neighbor interchange from the species tree topology is a reasonable heuristic to infer the existence of anomalous unrooted or unranked gene trees, respectively. We investigated a test proposed by Linkem et al. (2016) which classifies a species tree as being in an unranked anomaly zone if there is a subset of four taxa in an unranked anomaly zone. We find this test to have high true positive rates, but it can also have high false positive rates. For ranked trees, because at least one of the most probable ranked gene tree topologies must have the same unranked topology as the species tree, we propose to use only those ranked gene trees that have topologies that match the unranked species tree topology. We find that the probability that the species tree is in unrooted and unranked anomaly zones tends to increase with the speciation rate, and the probability of all three types of anomaly zones increases rapidly with the number of taxa. We find that probabilities that species trees are in an anomaly zone can be quite high for moderately high speciation rates.
Affiliation(s)
- Anastasiia Kim
  - Department of Mathematics and Statistics, University of New Mexico, United States
- James H Degnan
  - Department of Mathematics and Statistics, University of New Mexico, United States
47
Arcila D, Hughes LC, Meléndez-Vazquez F, Baldwin CC, White W, Carpenter K, Williams JT, Santos MD, Pogonoski J, Miya M, Ortí G, Betancur-R R. Testing the utility of alternative metrics of branch support to address the ancient evolutionary radiation of tunas, stromateoids, and allies (Teleostei: Pelagiaria). Syst Biol 2021; 70:1123-1144. [PMID: 33783539 DOI: 10.1093/sysbio/syab018] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/23/2021] [Accepted: 03/13/2021] [Indexed: 12/19/2022] Open
Abstract
The use of high-throughput sequencing technologies to produce genome-scale datasets was expected to settle some long-standing controversies across the Tree of Life, particularly in areas where short branches occur at deep timescales. Instead, these datasets have often yielded many well-supported but conflicting topologies, and highly variable gene-tree distributions. A variety of branch-support metrics beyond the nonparametric bootstrap are now available to assess how robust a phylogenetic hypothesis may be, as well as new methods to quantify gene-tree discordance. We applied multiple branch support metrics to an ancient group of marine fishes (Teleostei: Pelagiaria) whose interfamilial relationships have proven difficult to resolve due to a rapid accumulation of lineages very early in its history. We analyzed hundreds of loci including published UCE data and newly generated exonic data along with their flanking regions to represent all 16 extant families for more than 150 out of 284 valid species in the group. Branch support was lower for interfamilial relationships (except the SH-like aLRT and aBayes methods) regardless of the type of marker used. Several nodes that were highly supported with bootstrap had very low site and gene-tree concordance, revealing underlying conflict. Despite this conflict, we were able to identify four consistent interfamilial clades, each comprised of two or three families. Combining exons with their flanking regions also produced increased branch lengths in the deep branches of the pelagiarian tree. Our results demonstrate the limitations of employing current metrics of branch support and species-tree estimation when assessing the confidence of ancient evolutionary radiations and emphasize the necessity to embrace alternative measurements to explore phylogenetic uncertainty and discordance in phylogenomic datasets.
Affiliation(s)
- Dahiana Arcila
  - Department of Ichthyology, Sam Noble Oklahoma Museum of Natural History, Norman, Oklahoma, U.S.A.
  - Department of Biology, University of Oklahoma, Norman, Oklahoma, U.S.A.
- Lily C Hughes
  - Department of Biological Sciences, The George Washington University, Washington, District of Columbia, U.S.A.
  - Department of Organismal Biology and Anatomy, The University of Chicago, Illinois, Chicago, U.S.A.
  - Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- Fernando Meléndez-Vazquez
  - Department of Ichthyology, Sam Noble Oklahoma Museum of Natural History, Norman, Oklahoma, U.S.A.
  - Department of Biology, University of Oklahoma, Norman, Oklahoma, U.S.A.
- Carole C Baldwin
  - Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- William White
  - CSIRO Australian National Fish Collection, National Research Collections Australia, Hobart, Hobart, Tasmania, Australia
- Kent Carpenter
  - Department of Biological Sciences, Old Dominion University, Norfolk, Virginia, U.S.A.
- Jeffrey T Williams
  - Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- John Pogonoski
  - CSIRO Australian National Fish Collection, National Research Collections Australia, Hobart, Hobart, Tasmania, Australia
- Masaki Miya
  - Natural History Museum and Institute, Chiba, Aoba-cho, Chuo-ku, Chiba, Japan
- Guillermo Ortí
  - Department of Biological Sciences, The George Washington University, Washington, District of Columbia, U.S.A.
  - Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
48
Kulkarni S, Kallal RJ, Wood H, Dimitrov D, Giribet G, Hormiga G. Interrogating Genomic-Scale Data to Resolve Recalcitrant Nodes in the Spider Tree of Life. Mol Biol Evol 2021; 38:891-903. [PMID: 32986823 PMCID: PMC7947752 DOI: 10.1093/molbev/msaa251] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/14/2022] Open
Abstract
Genome-scale data sets are converging on robust, stable phylogenetic hypotheses for many lineages; however, some nodes have shown disagreement across classes of data. We use spiders (Araneae) as a system to identify the causes of incongruence in phylogenetic signal between three classes of data: exons (as in phylotranscriptomics), noncoding regions (included in ultraconserved elements [UCE] analyses), and a combination of both (as in UCE analyses). Gene orthologs, coded as amino acids and nucleotides (with and without third codon positions), were generated by querying published transcriptomes for UCEs, recovering 1,931 UCE loci (codingUCEs). We expected that congeners represented in the codingUCE and UCEs data would form clades in the presence of phylogenetic signal. Noncoding regions derived from UCE sequences were recovered to test the stability of relationships. Phylogenetic relationships resulting from all analyses were largely congruent. All nucleotide data sets from transcriptomes, UCEs, or a combination of both recovered similar topologies in contrast with results from transcriptomes analyzed as amino acids. Most relationships inferred from low-occupancy data sets, containing several hundreds of loci, were congruent across Araneae, as opposed to high occupancy data matrices with fewer loci, which showed more variation. Furthermore, we found that low-occupancy data sets analyzed as nucleotides (as is typical of UCE data sets) can result in more congruent relationships than high occupancy data sets analyzed as amino acids (as in phylotranscriptomics). Thus, omitting data, through amino acid translation or via retention of only high occupancy loci, may have a deleterious effect in phylogenetic reconstruction.
Affiliation(s)
- Siddharth Kulkarni
  - Department of Biological Sciences, The George Washington University, Washington, DC
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC
- Robert J Kallal
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC
- Hannah Wood
  - Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC
- Dimitar Dimitrov
  - Department of Natural History, University Museum of Bergen, University of Bergen, Bergen, Norway
- Gonzalo Giribet
  - Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA
- Gustavo Hormiga
  - Department of Biological Sciences, The George Washington University, Washington, DC
49
Collapsing dubiously resolved gene-tree branches in phylogenomic coalescent analyses. Mol Phylogenet Evol 2021; 158:107092. [PMID: 33545272 DOI: 10.1016/j.ympev.2021.107092] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/10/2020] [Revised: 12/30/2020] [Accepted: 01/28/2021] [Indexed: 01/15/2023]
Abstract
In two-step coalescent analyses of phylogenomic data, gene-tree topologies are treated as fixed prior to species-tree inference. Although all gene-tree conflict is assumed to be caused by lineage sorting when applying these methods, in empirical datasets much of the conflict can be caused by estimation error. Weakly supported and even arbitrarily resolved clades are important sources of this estimation error for gene trees inferred from few informative characters relative to the number of sampled terminals, and the resulting extraneous conflict among gene trees can negatively impact species-tree inference. In this study, we quantified the relative severity of alternative methods for collapsing gene-tree branches for seven empirical datasets and quantified their effects on species-tree inference. The branch-collapsing methods that we employed were based on the strict consensus of optimal topologies, various bootstrap thresholds, and 0% approximate likelihood ratio test (SH-like aLRT) support. Up to 86% of internal gene-tree branches are dubiously or arbitrarily resolved in reanalyses of these published phylogenomic datasets, and collapsing these branches increased inferred species-tree coalescent branch lengths by up to 455%. For two datasets, the longer inferred branch lengths sometimes impacted inference of anomaly-zone conditions. Although branch-collapsing methods did not consistently affect the species-tree topology, they often increased branch support. The more severe and clearly justified gene-tree branch-collapsing methods, which we recommend be broadly applied for two-step coalescent analyses, are the use of the strict consensus in parsimony analyses and the collapse of clades with 0% SH-like aLRT support in likelihood analyses. Collapsing dubiously or arbitrarily resolved branches in gene trees sometimes improved congruence between coalescent-based results and concatenation trees. In such cases, we contend that the resolution provided by concatenation should be preferred and that incomplete lineage sorting is a poor explanation for the initial conflict between phylogenetic approaches.
50
Chan KO, Hutter CR, Wood PL, Grismer LL, Brown RM. Target-capture phylogenomics provide insights on gene and species tree discordances in Old World treefrogs (Anura: Rhacophoridae). Proc Biol Sci 2020; 287:20202102. [PMID: 33290680 PMCID: PMC7739936 DOI: 10.1098/rspb.2020.2102] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/27/2020] [Accepted: 11/13/2020] [Indexed: 11/12/2022] Open
Abstract
Genome-scale data have greatly facilitated the resolution of recalcitrant nodes that Sanger-based datasets have been unable to resolve. However, phylogenomic studies continue to use traditional methods such as bootstrapping to estimate branch support, and high bootstrap values are still interpreted as providing strong support for the correct topology. Furthermore, relatively little attention has been given to assessing discordances between gene and species trees, and the underlying processes that produce phylogenetic conflict. We generated novel genomic datasets to characterize and determine the causes of discordance in Old World treefrogs (Family: Rhacophoridae), a group that is fraught with conflicting and poorly supported topologies among major clades. Additionally, a suite of data filtering strategies and analytical methods were applied to assess their impact on phylogenetic inference. We showed that incomplete lineage sorting was detected at all nodes that exhibited high levels of discordance. Those nodes were also associated with extremely short internal branches. We also clearly demonstrate that bootstrap values do not reflect uncertainty or confidence for the correct topology and, hence, should not be used as a measure of branch support in phylogenomic datasets. Overall, we showed that phylogenetic discordances in Old World treefrogs resulted from incomplete lineage sorting and that species tree inference can be improved using a multi-faceted, total-evidence approach, which uses the greatest amount of data and considers results from different analytical methods and datasets.
Affiliation(s)
- Kin Onn Chan
  - Lee Kong Chian Natural History Museum, National University of Singapore, 2 Conservatory Drive, Singapore 117377, Republic of Singapore
- Carl R. Hutter
  - Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Perry L. Wood
  - Department of Biological Sciences and Museum of Natural History, Auburn University, Auburn, AL 36849, USA
- L. Lee Grismer
  - Herpetology Laboratory, Department of Biology, La Sierra University, Riverside, CA 92505, USA
- Rafe M. Brown
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA