1
De Cahsan B, Sandoval Velasco M, Westbury MV, Duchêne DA, Strander Sinding MH, Morales HE, Kalthoff DC, Barnes I, Brace S, Portela Miguez R, Roca AL, Greenwood AD, Johnson RN, Lott MJ, Gilbert MTP. Road to Extinction? Past and Present Population Structure and Genomic Diversity in the Koala. Mol Biol Evol 2025; 42:msaf057. [PMID: 40129172 PMCID: PMC12014528 DOI: 10.1093/molbev/msaf057] [Received: 07/07/2024] [Revised: 01/28/2025] [Accepted: 02/24/2025] [Indexed: 03/26/2025]
Abstract
Koalas are arboreal herbivorous marsupials, endemic to Australia. During the late 1800s and early 1900s, the number of koalas declined dramatically due to hunting for their furs. In addition, anthropogenic activities have further decimated their available habitat, and decreased population numbers. Here, we utilize 37 historic and 25 modern genomes sampled from across their historic and present geographic range, to gain insights into how their population structure and genetic diversity have changed across time; assess the genetic consequences of the period of intense hunting, and the current genetic status of this iconic Australian species. Our analyses reveal how genome-wide heterozygosity has decreased through time and unveil previously uncharacterized mitochondrial haplotypes and nuclear genotypes in the historic dataset, which are absent from today's koala populations.
Affiliation(s)
- Binia De Cahsan
  - Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark
- Marcela Sandoval Velasco
  - Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark
  - Center for Genome Sciences (CCG), National Autonomous University of Mexico (UNAM), Cuernavaca, Mexico
- David A Duchêne
  - Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark
- Hernán E Morales
  - Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark
- Daniela C Kalthoff
  - Department of Zoology, Swedish Museum of Natural History, SE-104 05 Stockholm, Sweden
- Ian Barnes
  - Department of Earth Sciences, Natural History Museum, London SW7 5BD, England, UK
- Selina Brace
  - Department of Earth Sciences, Natural History Museum, London SW7 5BD, England, UK
- Alfred L Roca
  - Department of Animal Sciences, University of Illinois, Urbana, IL 61801, USA
- Alex D Greenwood
  - Department of Wildlife Diseases, Leibniz Institute for Zoo and Wildlife Research, 10315 Berlin, Germany
  - Department of Veterinary Medicine, Freie Universität Berlin, 14163 Berlin, Germany
- Rebecca N Johnson
  - Smithsonian National Museum of Natural History, Washington, D.C. 20560, USA
- Matthew J Lott
  - Australian Centre for Wildlife Genomics, Australian Museum, Sydney, NSW 2010, Australia
- M Thomas P Gilbert
  - Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark
  - Norwegian University of Science and Technology, University Museum, 7491 Trondheim, Norway
2
Tabatabaee Y, Roch S, Warnow T. QR-STAR: A Polynomial-Time Statistically Consistent Method for Rooting Species Trees Under the Coalescent. J Comput Biol 2023; 30:1146-1181. [PMID: 37902986 DOI: 10.1089/cmb.2023.0185] [Indexed: 11/01/2023]
Abstract
We address the problem of rooting an unrooted species tree given a set of unrooted gene trees, under the assumption that gene trees evolve within the model species tree under the multispecies coalescent (MSC) model. Quintet Rooting (QR) is a polynomial time algorithm that was recently proposed for this problem, which is based on the theory developed by Allman, Degnan, and Rhodes that proves the identifiability of rooted 5-taxon trees from unrooted gene trees under the MSC. However, although QR had good accuracy in simulations, its statistical consistency was left as an open problem. We present QR-STAR, a variant of QR with an additional step and a different cost function, and prove that it is statistically consistent under the MSC. Moreover, we derive sample complexity bounds for QR-STAR and show that a particular variant of it based on "short quintets" has polynomial sample complexity. Finally, our simulation study under a variety of model conditions shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open-source form on GitHub.
Affiliation(s)
- Yasamin Tabatabaee
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Sebastien Roch
  - Department of Mathematics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
3
Tabatabaee Y, Zhang C, Warnow T, Mirarab S. Phylogenomic branch length estimation using quartets. Bioinformatics 2023; 39:i185-i193. [PMID: 37387151 PMCID: PMC10311336 DOI: 10.1093/bioinformatics/btad221] [Indexed: 07/01/2023]
Abstract
MOTIVATION: Branch lengths and topology of a species tree are essential in most downstream analyses, including estimation of diversification dates, characterization of selection, understanding adaptation, and comparative genomics. Modern phylogenomic analyses often use methods that account for the heterogeneity of evolutionary histories across the genome due to processes such as incomplete lineage sorting. However, these methods typically do not generate branch lengths in units that are usable by downstream applications, forcing phylogenomic analyses to resort to alternative shortcuts such as estimating branch lengths by concatenating gene alignments into a supermatrix. Yet, concatenation and other available approaches for estimating branch lengths fail to address heterogeneity across the genome.

RESULTS: In this article, we derive expected values of gene tree branch lengths in substitution units under an extension of the multispecies coalescent (MSC) model that allows substitutions with varying rates across the species tree. We present CASTLES, a new technique for estimating branch lengths on the species tree from estimated gene trees that uses these expected values, and our study shows that CASTLES improves on the most accurate prior methods with respect to both speed and accuracy.

AVAILABILITY AND IMPLEMENTATION: CASTLES is available at https://github.com/ytabatabaee/CASTLES.
Affiliation(s)
- Yasamin Tabatabaee
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States
- Chao Zhang
  - Department of Integrative Biology, University of California at Berkeley, Berkeley, CA 94720, United States
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA 92093, United States
4
Willson J, Tabatabaee Y, Liu B, Warnow T. DISCO+QR: rooting species trees in the presence of GDL and ILS. Bioinformatics Advances 2023; 3:vbad015. [PMID: 36789293 PMCID: PMC9923442 DOI: 10.1093/bioadv/vbad015] [Received: 12/12/2022] [Revised: 01/21/2023] [Accepted: 02/06/2023] [Indexed: 02/10/2023]
Abstract
Motivation: Genes evolve under processes such as gene duplication and loss (GDL), so that gene family trees are multi-copy, as well as incomplete lineage sorting (ILS); both processes produce gene trees that differ from the species tree. The estimation of species trees from sets of gene family trees is challenging, and the estimation of rooted species trees presents additional analytical challenges. Two of the methods developed for this problem are STRIDE, which roots species trees by considering GDL events, and Quintet Rooting (QR), which roots species trees by considering ILS.

Results: We present DISCO+QR, a new approach to rooting species trees that first uses DISCO to address GDL and then uses QR to perform rooting in the presence of ILS. DISCO+QR operates by taking the input gene family trees and decomposing them into single-copy trees using DISCO and then roots the given species tree using the information in the single-copy gene trees using QR. We show that the relative accuracy of STRIDE and DISCO+QR depends on the properties of the dataset (number of species, genes, rate of gene duplication, degree of ILS and gene tree estimation error), and that each provides advantages over the other under some conditions.

Availability and implementation: DISCO and QR are available on GitHub.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- James Willson
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Yasamin Tabatabaee
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Baqiao Liu
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA