1
Hibbins MS, Rifkin JL, Choudhury BI, Voznesenska O, Sacchi B, Yuan M, Gong Y, Barrett SCH, Wright SI. Phylogenomics resolves key relationships in Rumex and uncovers a dynamic history of independently evolving sex chromosomes. Evol Lett 2025; 9:221-235. [PMID: 40191415 PMCID: PMC11968192 DOI: 10.1093/evlett/qrae060] [Received: 12/15/2023] [Revised: 09/13/2024] [Accepted: 10/22/2024] [Indexed: 04/09/2025] Open
Abstract
Sex chromosomes have evolved independently many times across eukaryotes. Despite a considerable body of literature on sex chromosome evolution, the causes and consequences of variation in their formation, degeneration, and turnover remain poorly understood. Chromosomal rearrangements are thought to play an important role in these processes by promoting or extending the suppression of recombination on sex chromosomes. Sex chromosome variation may also contribute to barriers to gene flow, limiting introgression among species. Comparative approaches in groups with sexual system variation can be valuable for understanding these questions. Rumex is a diverse genus of flowering plants harboring significant sexual system and karyotypic variation, including hermaphroditic and dioecious clades with XY (and XYY) sex chromosomes. Previous disagreement in the phylogenetic relationships among key species has rendered the history of sex chromosome evolution uncertain. Resolving this history is important for investigating the interplay of chromosomal rearrangements, introgression, and sex chromosome evolution in the genus. Here, we use new transcriptome assemblies from 11 species representing major clades in the genus, along with a whole-genome assembly generated for a key hermaphroditic species. Using phylogenomic approaches, we find evidence for the independent evolution of sex chromosomes across two major clades, and introgression from unsampled lineages likely predating the formation of sex chromosomes in the genus. Comparative genomic approaches revealed high rates of chromosomal rearrangement, especially in dioecious species, with evidence for a complex origin of the sex chromosomes through multiple chromosomal fusions. However, we found no evidence of elevated rates of fusion on the sex chromosomes in comparison with autosomes, providing no support for an adaptive hypothesis of sex chromosome expansion due to sexually antagonistic selection. 
Overall, our results highlight a complex history of karyotypic evolution in Rumex, raising questions about the role that chromosomal rearrangements might play in the evolution of large heteromorphic sex chromosomes.
Affiliation(s)
- Mark S Hibbins
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Joanna L Rifkin
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
  - Genome Sequencing Center, HudsonAlpha Institute for Biotechnology, 601 Genome Way Northwest, Huntsville, AL 35806, USA
- Baharul I Choudhury
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Olena Voznesenska
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Bianca Sacchi
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Meng Yuan
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Yunchen Gong
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Spencer C H Barrett
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
- Stephen I Wright
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada
2
Holtgrefe N, Huber KT, van Iersel L, Jones M, Martin S, Moulton V. Squirrel: Reconstructing Semi-directed Phylogenetic Level-1 Networks from Four-Leaved Networks or Sequence Alignments. Mol Biol Evol 2025; 42:msaf067. [PMID: 40152498 PMCID: PMC11979102 DOI: 10.1093/molbev/msaf067] [Received: 11/07/2024] [Revised: 01/21/2025] [Accepted: 03/04/2025] [Indexed: 03/29/2025] Open
Abstract
With the increasing availability of genomic data, biologists aim to find more accurate descriptions of evolutionary histories influenced by secondary contact, where diverging lineages reconnect before diverging again. Such reticulate evolutionary events can be more accurately represented in phylogenetic networks than in phylogenetic trees. Since the root location of phylogenetic networks cannot be inferred from biological data under several evolutionary models, we consider semi-directed (phylogenetic) networks: partially directed graphs without a root in which the directed edges represent reticulate evolutionary events. By specifying a known outgroup, the rooted topology can be recovered from such networks. We introduce the algorithm Squirrel (Semi-directed Quarnet-based Inference to Reconstruct Level-1 Networks) which constructs a semi-directed level-1 network from a full set of quarnets (four-leaf semi-directed networks). Our method also includes a heuristic to construct such a quarnet set directly from sequence alignments. We demonstrate Squirrel's performance through simulations and on real sequence data sets, the largest of which contains 29 aligned sequences close to 1.7 Mb long. The resulting networks are obtained on a standard laptop within a few minutes. Lastly, we prove that Squirrel is combinatorially consistent: given a full set of quarnets coming from a triangle-free semi-directed level-1 network, it is guaranteed to reconstruct the original network. Squirrel is implemented in Python, has an easy-to-use graphical user interface that takes sequence alignments or quarnets as input, and is freely available at https://github.com/nholtgrefe/squirrel.
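As a rough sense of the input size described above, and assuming a "full" quarnet set means one four-leaf network per four-taxon subset, n taxa would need C(n, 4) quarnets. The sketch below only enumerates those subsets; `four_leaf_subsets` is a hypothetical helper written for illustration and is not part of the Squirrel package.

```python
from itertools import combinations
from math import comb


def four_leaf_subsets(taxa):
    """Return an iterator over every unordered set of four taxa (sorted tuples)."""
    return combinations(sorted(taxa), 4)


# 29 placeholder taxon names, matching the size of the largest dataset mentioned.
taxa = [f"t{i}" for i in range(29)]
n_quartets = sum(1 for _ in four_leaf_subsets(taxa))
print(n_quartets)  # 23751, i.e. comb(29, 4)
```

The count grows as n^4, which is one reason a fast heuristic for building each quarnet from the alignment matters.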
Affiliation(s)
- Niels Holtgrefe
  - Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, Delft 2628 CD, The Netherlands
- Katharina T Huber
  - School of Computing Sciences, University of East Anglia, Norwich NR4 7TJ, UK
- Leo van Iersel
  - Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, Delft 2628 CD, The Netherlands
- Mark Jones
  - Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, Delft 2628 CD, The Netherlands
- Samuel Martin
  - European Bioinformatics Institute, Hinxton CB10 1SD, UK
- Vincent Moulton
  - School of Computing Sciences, University of East Anglia, Norwich NR4 7TJ, UK
3
Zhang C, Nielsen R, Mirarab S. CASTER: Direct species tree inference from whole-genome alignments. Science 2025; 387:eadk9688. [PMID: 39847611 PMCID: PMC12038793 DOI: 10.1126/science.adk9688] [Received: 09/25/2023] [Revised: 08/05/2024] [Accepted: 12/04/2024] [Indexed: 01/25/2025]
Abstract
Genomes contain mosaics of discordant evolutionary histories, challenging the accurate inference of the tree of life. Although genome-wide data are routinely used for discordance-aware phylogenomic analyses, because of modeling and scalability limitations, the current practice leaves out large chunks of genomes. As more high-quality genomes become available, we urgently need discordance-aware methods to infer the tree directly from a multiple genome alignment. In this study, we introduce Coalescence-Aware Alignment-Based Species Tree Estimator (CASTER), a theoretically justified site-based method that eliminates the need to predefine recombination-free loci. CASTER is scalable to hundreds of mammalian whole genomes. We demonstrate the accuracy and scalability of CASTER in simulations that include recombination and apply CASTER to several biological datasets, showing that its per-site scores can reveal both biological and artifactual patterns of discordance across the genome.
Affiliation(s)
- Chao Zhang
  - Bioinformatics and Systems Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, 92093, CA, USA
  - Integrative Biology Department, University of California Berkeley, 110 Sproul Hall, Berkeley, 94704, CA, USA
  - Globe Institute, University of Copenhagen, Øster Voldgade 5-7, Copenhagen, 1350, Denmark
- Rasmus Nielsen
  - Integrative Biology Department, University of California Berkeley, 110 Sproul Hall, Berkeley, 94704, CA, USA
  - Globe Institute, University of Copenhagen, Øster Voldgade 5-7, Copenhagen, 1350, Denmark
- Siavash Mirarab
  - Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, 92093, CA, USA
4
Kong S, Swofford DL, Kubatko LS. Inference of Phylogenetic Networks From Sequence Data Using Composite Likelihood. Syst Biol 2025; 74:53-69. [PMID: 39387633 DOI: 10.1093/sysbio/syae054] [Received: 11/10/2022] [Revised: 09/13/2024] [Accepted: 10/08/2024] [Indexed: 10/12/2024] Open
Abstract
While phylogenies have been essential in understanding how species evolve, they do not adequately describe some evolutionary processes. For instance, hybridization, a common phenomenon where interbreeding between 2 species leads to formation of a new species, must be depicted by a phylogenetic network, a structure that modifies a phylogenetic tree by allowing 2 branches to merge into 1, resulting in reticulation. However, existing methods for estimating networks become computationally expensive as the dataset size and/or topological complexity increase. The lack of methods for scalable inference hampers phylogenetic networks from being widely used in practice, despite accumulating evidence that hybridization occurs frequently in nature. Here, we propose a novel method, PhyNEST (Phylogenetic Network Estimation using SiTe patterns), that estimates binary, level-1 phylogenetic networks with a fixed, user-specified number of reticulations directly from sequence data. By using the composite likelihood as the basis for inference, PhyNEST is able to use the full genomic data in a computationally tractable manner, eliminating the need to summarize the data as a set of gene trees prior to network estimation. To search network space, PhyNEST implements both hill climbing and simulated annealing algorithms. PhyNEST assumes that the data are composed of coalescent independent sites that evolve according to the Jukes-Cantor substitution model and that the network has a constant effective population size. Simulation studies demonstrate that PhyNEST is often more accurate than 2 existing composite likelihood summary methods (SNaQ and PhyloNet) and that it is robust to at least one form of model misspecification (assuming a less complex nucleotide substitution model than the true generating model).
We applied PhyNEST to reconstruct the evolutionary relationships among Heliconius butterflies and Papionini primates, characterized by hybrid speciation and widespread introgression, respectively. PhyNEST is implemented in an open-source Julia package and is publicly available at https://github.com/sungsik-kong/PhyNEST.jl.
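The Jukes-Cantor assumption mentioned above has a simple closed form, sketched here for a single branch. This is a generic textbook illustration, not PhyNEST code; the function name `jc69_probs` and the choice to measure branch length `d` in expected substitutions per site are assumptions made for the example.

```python
import math


def jc69_probs(d):
    """Return (p_same, p_diff) under Jukes-Cantor for branch length d.

    d is in expected substitutions per site. p_same is the probability
    that a site shows the same nucleotide at both ends of the branch;
    p_diff is the probability of ending in one specific other nucleotide.
    """
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return p_same, p_diff


p_same, p_diff = jc69_probs(0.1)
# The four outcomes (one "same", three "different") always sum to 1.
print(round(p_same + 3 * p_diff, 12))  # 1.0
```

Because all substitutions are equally likely under this model, many site patterns share the same probability, which helps keep site-pattern likelihoods tractable.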
Affiliation(s)
- Sungsik Kong
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- David L Swofford
  - Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
- Laura S Kubatko
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
5
Thomas GWC, Hughes JJ, Kumon T, Berv JS, Nordgren CE, Lampson M, Levine M, Searle JB, Good JM. The Genomic Landscape, Causes, and Consequences of Extensive Phylogenomic Discordance in Murine Rodents. Genome Biol Evol 2025; 17:evaf017. [PMID: 39903560 PMCID: PMC11837218 DOI: 10.1093/gbe/evaf017] [Received: 10/16/2024] [Revised: 01/08/2025] [Accepted: 01/23/2025] [Indexed: 02/06/2025] Open
Abstract
A species tree is a central concept in evolutionary biology whereby a single branching phylogeny reflects relationships among species. However, the phylogenies of different genomic regions often differ from the species tree. Although tree discordance is widespread in phylogenomic studies, we still lack a clear understanding of how variation in phylogenetic patterns is shaped by genome biology or the extent to which discordance may compromise comparative studies. We characterized patterns of phylogenomic discordance across the murine rodents, a large and ecologically diverse group that gave rise to the laboratory mouse and rat model systems. Combining recently published linked-read genome assemblies for seven murine species with other available rodent genomes, we first used ultraconserved elements (UCEs) to infer a robust time-calibrated species tree. We then used whole genomes to examine finer-scale patterns of discordance across ∼12 million years of divergence. We found that proximate chromosomal regions tended to have more similar phylogenetic histories. There was no clear relationship between local tree similarity and recombination rates in house mice, but we did observe a correlation between recombination rates and average similarity to the species tree. We also detected a strong influence of linked selection whereby purifying selection at UCEs led to appreciably less discordance. Finally, we show that assuming a single species tree can result in substantial deviation from the results with gene trees when testing for positive selection under different models. Collectively, our results highlight the complex relationship between phylogenetic inference and genome biology and underscore how failure to account for this complexity can mislead comparative genomic studies.
Affiliation(s)
- Gregg W C Thomas
  - Division of Biological Sciences, University of Montana, Missoula, MT 59801, USA
  - Informatics Group, Harvard University, Cambridge, MA 02138, USA
- Jonathan J Hughes
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
  - Department of Evolution, Ecology, and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
- Tomohiro Kumon
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jacob S Berv
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- C Erik Nordgren
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Michael Lampson
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Mia Levine
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jeremy B Searle
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
- Jeffrey M Good
  - Division of Biological Sciences, University of Montana, Missoula, MT 59801, USA
6
Booth TJ, Shaw S, Cruz-Morales P, Weber T. getphylo: rapid and automatic generation of multi-locus phylogenetic trees. BMC Bioinformatics 2025; 26:21. [PMID: 39827349 PMCID: PMC11748604 DOI: 10.1186/s12859-025-06035-1] [Received: 05/17/2024] [Accepted: 01/03/2025] [Indexed: 01/22/2025] Open
Abstract
BACKGROUND The increasing amount of genomic data calls for tools that can create genome-scale phylogenies quickly and efficiently. Existing tools rely on large reference databases or require lengthy de novo calculations to identify orthologues, meaning that they have long run times and are limited in their taxonomic scope. To address this, we created getphylo, a python tool for the rapid generation of phylogenetic trees de novo from annotated sequences. RESULTS We present getphylo (Genbank to Phylogeny), a tool that automatically builds phylogenetic trees from annotated genomes alone. Orthologues are identified heuristically by searching for singletons (single copy genes) across all input genomes and the phylogeny is inferred from a concatenated alignment of all coding sequences by maximum likelihood. We performed a thorough benchmarking of getphylo against two existing tools, autoMLST and GTDB-tk, to show that it can produce trees of comparable quality in a fraction of the time. We also demonstrate the flexibility of getphylo across four case studies including bacterial and eukaryotic genomes, and biosynthetic gene clusters. CONCLUSIONS getphylo is a quick and reliable tool for the automated generation of genome-scale phylogenetic trees. getphylo can produce phylogenies comparable to other software in a fraction of the time, without the need for large local databases or intense computation. getphylo can rapidly identify orthologues from a wide variety of datasets regardless of taxonomic or genomic scope. The usability, speed, and flexibility of getphylo make it a valuable addition to the phylogenetics toolkit.
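The singleton idea in the abstract can be illustrated with a simplified sketch. It assumes each coding sequence has already been assigned a gene-family label, which is a hypothetical shortcut; getphylo's actual search procedure is not described here and may differ. `find_singletons` and the toy genome contents are invented for the example.

```python
from collections import Counter


def find_singletons(genomes):
    """Return gene families present exactly once in every genome.

    genomes: dict mapping genome name -> list of gene-family labels
    (hypothetical input; one label per annotated coding sequence).
    """
    shared = None
    for labels in genomes.values():
        counts = Counter(labels)
        single = {fam for fam, n in counts.items() if n == 1}
        # Keep only families that are single-copy in every genome seen so far.
        shared = single if shared is None else shared & single
    return sorted(shared or [])


genomes = {
    "genome_A": ["rpoB", "gyrA", "recA", "tuf", "tuf"],  # tuf is duplicated
    "genome_B": ["rpoB", "gyrA", "recA", "tuf"],
    "genome_C": ["rpoB", "recA", "gyrA", "dnaK"],        # tuf is absent
}
print(find_singletons(genomes))  # ['gyrA', 'recA', 'rpoB']
```

Families that are duplicated or missing in any genome drop out, so the surviving loci can be aligned and concatenated without resolving paralogy.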
Affiliation(s)
- T J Booth
  - The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark
- S Shaw
  - The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark
- P Cruz-Morales
  - The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark
- T Weber
  - The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet, Kongens Lyngby, Denmark
7
Hayakawa T, Kishida T, Go Y, Inoue E, Kawaguchi E, Aizu T, Ishizaki H, Toyoda A, Fujiyama A, Matsuzawa T, Hashimoto C, Furuichi T, Agata K. Genome-scale evolution in local populations of wild chimpanzees. Sci Rep 2025; 15:548. [PMID: 39747985 PMCID: PMC11696052 DOI: 10.1038/s41598-024-84163-z] [Received: 07/25/2024] [Accepted: 12/20/2024] [Indexed: 01/04/2025] Open
Abstract
Analysis of genome-scale evolution has been difficult in large, endangered animals because opportunities to collect high-quality genetic samples are limited. There is a need for novel field-friendly, cost-effective genetic techniques. This study conducted an exome-wide analysis of a total of 42 chimpanzees (Pan troglodytes) across six African regions, providing insights into population discrimination techniques. Wild chimpanzee DNA was extracted noninvasively from collected fecal samples using the lysis-buffer storage method. To target genome-scale regions of host DNA, exome-capture sequencing was performed using cost-effective baits originally designed for humans (closely related to chimpanzees). Multivariate analysis effectively discriminated differences in local populations, aiding in the identification of samples' geographical origins. Exome-wide heterozygosity was significantly negatively correlated with genome-wide nonsynonymous-synonymous substitution ratios, suggesting that mutation loads exist at the local population level. Exome sequences revealed functional diversity and protein-coding gene divergence. Segregating pseudogenes were comprehensively annotated, with many being population-specific and others shared among populations. Focusing on multicopy chemosensory receptor genes, the segregating pseudogenes OR7D4 (an olfactory receptor) and TAS2R42 (a bitter taste receptor) were shared among western and eastern chimpanzees. Overall, our analytical framework offers ecological insights into chimpanzees and may be applicable to other organisms.
Grants
- 12J04270, 25257409, 16K18630, 19K16241, 21H04919, 22770240, 24113511, 25711027, 25304019 Japan Society for the Promotion of Science
Affiliation(s)
- Takashi Hayakawa
  - Faculty of Environmental Earth Science, Hokkaido University, Sapporo, Hokkaido, Japan
- Takushi Kishida
  - College of Bioresource Sciences, Nihon University, Fujisawa, Kanagawa, Japan
  - Wildlife Research Center, Kyoto University, Kyoto, Japan
- Yasuhiro Go
  - Graduate School of Information Science, University of Hyogo, Kobe, Hyogo, Japan
  - Department of System Neuroscience, Division of Behavioral Development, National Institute for Physiological Sciences, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
  - Cognitive Genomics Research Group, Exploratory Research Center on Life and Living Systems (ExCELLS), National Institutes of Natural Sciences, Okazaki, Aichi, Japan
- Eiji Inoue
  - Faculty of Science, Toho University, Funabashi, Chiba, Japan
- Eri Kawaguchi
  - Center for iPS Cell Research and Application, Kyoto University, Kyoto, Japan
- Tomoyuki Aizu
  - Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka, Japan
  - Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka, Japan
- Hinako Ishizaki
  - Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka, Japan
  - Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka, Japan
- Atsushi Toyoda
  - Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka, Japan
  - Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka, Japan
- Asao Fujiyama
  - Advanced Genomics Center, National Institute of Genetics, Mishima, Shizuoka, Japan
- Tetsuro Matsuzawa
  - Department of Pedagogy, Chubu Gakuin University, Gifu, Japan
  - College of Life Sciences, Northwest University, Xi'an, China
- Chie Hashimoto
  - Wildlife Research Center, Kyoto University, Kyoto, Japan
- Kiyokazu Agata
  - Laboratory for Regenerative Biology, National Institute for Basic Biology, Okazaki, Aichi, Japan
8
Pereira AB, Marano M, Bathala R, Zaragoza RA, Neira A, Samano A, Owoyemi A, Casola C. Orphan genes are not a distinct biological entity. Bioessays 2025; 47:e2400146. [PMID: 39491810 DOI: 10.1002/bies.202400146] [Received: 06/16/2024] [Revised: 10/06/2024] [Accepted: 10/11/2024] [Indexed: 11/05/2024]
Abstract
The genome sequencing revolution has revealed that all species possess a large number of unique genes critical for trait variation, adaptation, and evolutionary innovation. One widely used approach to identify such genes consists of detecting protein-coding sequences with no homology in other genomes, termed orphan genes. These genes have been extensively studied, under the assumption that they represent valid proxies for species-specific genes. Here, we critically evaluate taxonomic, phylogenetic, and sequence evolution evidence showing that orphan genes belong to a range of evolutionary ages and thus cannot be assigned to a single lineage. Furthermore, we show that the processes generating orphan genes are substantially more diverse than generally thought and include horizontal gene transfer, transposable element domestication, and overprinting. Thus, orphan genes represent a heterogeneous collection of genes rather than a single biological entity, making them unsuitable as a subject for meaningful investigation of gene evolution and phenotypic innovation.
Affiliation(s)
- Andres Barboza Pereira
  - Interdisciplinary Graduate Program in Genetics & Genomics, Texas A&M University, College Station, Texas, USA
  - Interdisciplinary Doctoral Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Matthew Marano
  - Interdisciplinary Doctoral Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
- Ramya Bathala
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas, USA
- Andres Neira
  - School of Pharmacy, Texas A&M University, College Station, Texas, USA
- Alex Samano
  - Department of Biology, Texas A&M University, College Station, Texas, USA
- Adekola Owoyemi
  - Department of Ecology and Conservation Biology, Texas A&M University, College Station, Texas, USA
- Claudio Casola
  - Interdisciplinary Graduate Program in Genetics & Genomics, Texas A&M University, College Station, Texas, USA
  - Interdisciplinary Doctoral Program in Ecology and Evolutionary Biology, Texas A&M University, College Station, Texas, USA
  - Department of Ecology and Conservation Biology, Texas A&M University, College Station, Texas, USA
9
Li J, Ai Q, Xie S, Huang C, Qiu F, Fu C, Zhao M, Fu J, Wu H. Contrast and Genomic Characterisation of Ancient and Recent Interspecific Introgression Between Deeply Diverged Moustache Toads (Leptobrachium). Mol Ecol 2024; 33:e17569. [PMID: 39465507 DOI: 10.1111/mec.17569] [Received: 07/04/2024] [Revised: 10/03/2024] [Accepted: 10/14/2024] [Indexed: 10/29/2024]
Abstract
Recent genomic analyses have provided new insights into the process of interspecific introgression and its consequences on species evolution. Most recent studies, however, focused on hybridization between recently radiated species, with few examining the genomic outcomes of ancient hybridization across deeply diverged species. Using whole genome data of moustache toads (Leptobrachium), we identified signals of three hybridization events among nine species that diverged in the Eocene. An ancient introgression from L. leishanense to the ancestral branch (C1) of L. liui introduced adaptive variants. The highly introgressed regions include genes with important functions in odorant detection and immune responses. These genes are preserved in all three descendant populations of L. liui_C1, and these regions likely have been positively selected over a long filtering process. A recent introgression occurred from L. huashen to L. tengchongense, with the introgressed regions being mostly neutral. Furthermore, one F1 hybrid individual was detected between sympatric L. ailaonicum and L. promustache. The signals of introgression largely disappeared after removing the hybrid individual, indicating an occasional hybridization but minimal introgression. Further examination of highly divergent but low introgressed genomic regions revealed both pre-mating isolation and genetic incompatibility as potential mechanisms of resisting introgression and maintaining species boundaries. Additionally, no large X-effect was found in these introgression events. Hybridization between deeply diverged amphibian species may be common, but detectable introgressions are likely less so, with recent introgression being mostly neutral and the rare ancient one potentially adaptive. Our findings complement recent genomic work, and together they provide a better understanding of the genomic characteristics of interspecific introgression and its significance in species adaptation and evolution.
Affiliation(s)
- Jun Li
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Qingbo Ai
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Siyu Xie
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Chunhua Huang
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Fuyuan Qiu
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Chao Fu
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Mian Zhao
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
- Jinzhong Fu
  - Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada
- Hua Wu
  - Hubei Key Laboratory of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, Wuhan, Hubei, People's Republic of China
10
Allman ES, Baños H, Mitchell JD, Rhodes JA. TINNiK: inference of the tree of blobs of a species network under the coalescent model. Algorithms Mol Biol 2024; 19:23. [PMID: 39501362 PMCID: PMC11539473 DOI: 10.1186/s13015-024-00266-2] [Received: 04/20/2024] [Accepted: 08/22/2024] [Indexed: 11/08/2024] Open
Abstract
The tree of blobs of a species network shows only the tree-like aspects of relationships of taxa on a network, omitting information on network substructures where hybridization or other types of lateral transfer of genetic information occur. By isolating such regions of a network, inference of the tree of blobs can serve as a starting point for a more detailed investigation, or indicate the limit of what may be inferrable without additional assumptions. Building on our theoretical work on the identifiability of the tree of blobs from gene quartet distributions under the Network Multispecies Coalescent model, we develop an algorithm, TINNiK, for statistically consistent tree of blobs inference. We provide examples of its application to both simulated and empirical datasets, utilizing an implementation in the MSCquartets 2.0 R package.
Affiliation(s)
- Elizabeth S Allman
- Department of Mathematics and Statistics, University of Alaska, Fairbanks, AK, USA.
| | - Hector Baños
- Department of Mathematics, California State University San Bernardino, San Bernardino, CA, USA
| | - Jonathan D Mitchell
- School of Natural Sciences (Mathematics), University of Tasmania, Hobart, TAS, Australia
- ARC Centre of Excellence for Plant Success in Nature and Agriculture, University of Tasmania, Hobart, TAS, Australia
| | - John A Rhodes
- Department of Mathematics and Statistics, University of Alaska, Fairbanks, AK, USA
| |
|
11
|
Lanfear R, Hahn MW. The Meaning and Measure of Concordance Factors in Phylogenomics. Mol Biol Evol 2024; 41:msae214. [PMID: 39418118 PMCID: PMC11532913 DOI: 10.1093/molbev/msae214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 09/25/2024] [Accepted: 10/04/2024] [Indexed: 10/19/2024] Open
Abstract
As phylogenomic datasets have grown in size, researchers have developed new ways to measure biological variation and to assess statistical support for specific branches. Larger datasets have more sites and loci and therefore less sampling variance. While we can more accurately measure the mean signal in these datasets, lower sampling variance is often reflected in uniformly high measures of branch support (such as the bootstrap and posterior probability), limiting their utility. Larger datasets have also revealed substantial biological variation in the topologies found across individual loci, such that the single species tree inferred by most phylogenetic methods represents a limited summary of the data for many purposes. In contrast to measures of statistical support, the degree of underlying topological variation among loci should be approximately constant regardless of the size of the dataset. "Concordance factors" (CFs) and similar statistics have therefore become increasingly important tools in phylogenetics. In this review, we explain why CFs should be thought of as descriptors of topological variation rather than as measures of statistical support, and argue that they provide important information about the predictive power of the species tree not contained in measures of support. We review a growing suite of statistics for measuring concordance, compare them in a common framework that reveals their interrelationships, and demonstrate how to calculate them using an example from birds. We also discuss how measures of topological variation might change in the future as we move beyond estimating a single "tree of life" toward estimating the myriad evolutionary histories underlying genomic variation.
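As an illustration of the idea that a concordance factor describes topological variation among loci, the following minimal sketch computes a gene concordance factor as the percentage of gene trees that contain a given species-tree branch. It is not the procedure used in the review: the function names, the representation of each gene tree as a list of bipartitions, and the toy data are all assumptions made for this example, and every gene tree is assumed to contain all taxa.

```python
def canonical_split(side, taxa):
    """Return one fixed side of a bipartition so a split and its complement compare equal."""
    side = frozenset(side)
    anchor = min(taxa)  # arbitrary but fixed reference taxon
    return frozenset(taxa) - side if anchor in side else side


def gene_concordance_factor(branch, gene_trees, taxa):
    """Percentage of gene trees whose bipartitions include `branch`.

    branch     -- set of taxa on one side of the species-tree branch
    gene_trees -- list of gene trees, each given as a list of bipartitions (sets of taxa)
    taxa       -- set of all taxa (assumed present in every gene tree)
    """
    if not gene_trees:
        raise ValueError("need at least one gene tree")
    target = canonical_split(branch, taxa)
    hits = 0
    for splits in gene_trees:
        if target in {canonical_split(s, taxa) for s in splits}:
            hits += 1
    return 100.0 * hits / len(gene_trees)


# Hypothetical toy data: four gene trees on five taxa, each listed by its internal splits.
taxa = {"A", "B", "C", "D", "E"}
gene_trees = [
    [{"A", "B"}, {"A", "B", "C"}],
    [{"A", "B"}, {"D", "E"}],
    [{"A", "C"}, {"D", "E"}],
    [{"B", "C"}, {"A", "D"}],
]

print(gene_concordance_factor({"A", "B"}, gene_trees, taxa))
print(gene_concordance_factor({"D", "E"}, gene_trees, taxa))
```

Unlike a bootstrap value, this percentage should not approach 100 simply because more loci are added; it reflects how often individual loci agree with the branch.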
Affiliation(s)
- Robert Lanfear
- Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australia
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN, USA
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| |
|
12
|
Koppetsch T, Malinsky M, Matschiner M. Towards Reliable Detection of Introgression in the Presence of Among-Species Rate Variation. Syst Biol 2024; 73:769-788. [PMID: 38912803 PMCID: PMC11639170 DOI: 10.1093/sysbio/syae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 05/23/2024] [Accepted: 06/19/2024] [Indexed: 06/25/2024] Open
Abstract
The role of interspecific hybridization has recently seen increasing attention, especially in the context of diversification dynamics. Genomic research has now made it abundantly clear that both hybridization and introgression (the exchange of genetic material through hybridization and backcrossing) are far more common than previously thought. Besides cases of ongoing or recent genetic exchange between taxa, an increasing number of studies report "ancient introgression", referring to results of hybridization that took place in the distant past. However, it is not clear whether commonly used methods for the detection of introgression are applicable to such old systems, given that most of these methods were originally developed for analyses at the level of populations and recently diverged species, affected by recent or ongoing genetic exchange. In particular, the assumption of constant evolutionary rates, which is implicit in many commonly used approaches, is more likely to be violated as evolutionary divergence increases. To test the limitations of introgression detection methods when applied to old systems, we simulated thousands of genomic datasets under a wide range of settings, with varying degrees of among-species rate variation and introgression. Using these simulated datasets, we showed that some commonly applied statistical methods, including the D-statistic and certain tests based on sets of local phylogenetic trees, can produce false-positive signals of introgression between divergent taxa that have different rates of evolution. These misleading signals are caused by the presence of homoplasies occurring at different rates in different lineages. To distinguish between the patterns caused by rate variation and genuine introgression, we developed a new test that is based on the expected clustering of introgressed sites along the genome and implemented this test in the program Dsuite.
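For readers unfamiliar with the D-statistic mentioned in this abstract, the sketch below shows the standard ABBA-BABA calculation, D = (ABBA - BABA) / (ABBA + BABA), for one aligned sequence per taxon. It is a simplified illustration and not the Dsuite implementation or the new test described by the authors: the function name and toy sequences are hypothetical, the outgroup allele is assumed to be ancestral, and no significance test (such as a block jackknife) is included.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns and return (D, n_abba, n_baba).

    All four arguments are aligned sequences of equal length. Only biallelic
    sites with unambiguous bases are counted; the outgroup base is treated as
    the ancestral allele "A" of the pattern.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if any(base not in "ACGT" for base in (a, b, c, o)):
            continue  # skip gaps and ambiguity codes
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return d, abba, baba


# Hypothetical toy alignment: three ABBA sites and one BABA site.
d, n_abba, n_baba = d_statistic("ACAG", "GTCT", "GTCG", "ACAT")
print(d, n_abba, n_baba)
```

Because the statistic only counts shared derived alleles, homoplasies that accumulate faster in one lineage can shift the ABBA and BABA counts without any gene flow, which is the false-positive mechanism the abstract describes.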
Affiliation(s)
- Thore Koppetsch
- Natural History Museum, University of Oslo, 0318 Oslo, Norway
| | - Milan Malinsky
- Institute of Ecology and Evolution, Department of Biology, University of Bern, 3012 Bern, Switzerland
- Department of Fish Ecology and Evolution, EAWAG Swiss Federal Institute of Aquatic Science and Technology, Kastanienbaum, Switzerland
| | | |
|
13
|
Bessa MH, Gottschalk MS, Robe LJ. Whole genome phylogenomics helps to resolve the phylogenetic position of the Zygothrica genus group (Diptera, Drosophilidae) and the causes of previous incongruences. Mol Phylogenet Evol 2024; 199:108158. [PMID: 39025321 DOI: 10.1016/j.ympev.2024.108158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 06/28/2024] [Accepted: 07/14/2024] [Indexed: 07/20/2024]
Abstract
Incomplete Lineage Sorting (ILS) and introgression are two of the main factors causing incongruence between gene and species trees. Advances in phylogenomic studies have allowed us to overcome most of these issues, providing reliable phylogenetic hypotheses while revealing the underlying evolutionary scenario. Across the last century, many incongruent phylogenetic reconstructions were recovered for Drosophilidae, employing a limited sampling of genetic markers or species. In these studies, the monophyly and the phylogenetic positioning of the Zygothrica genus group stood out as one of the most controversial questions. Thus, here, we addressed these issues using a phylogenomic approach, while assessing the influence of ILS and introgression on the diversification of these species and addressing the spatio-temporal scenario associated with their evolution. For this task, the genomes of nine specimens from six Neotropical species belonging to the Zygothrica genus group were sequenced and evaluated in a phylogenetic framework encompassing 39 other species of Drosophilidae. Nucleotide and amino acid sequences recovered for a set of 2,534 single-copy genes by BUSCO were employed to reconstruct maximum likelihood (ML) concatenated and multi-species coalescent (MSC) trees. Likelihood mapping, quartet sampling, and reticulation tests were employed to infer the level and causes of incongruence. Lastly, a penalized-likelihood molecular clock strategy with fossil calibrations was performed to infer divergence times. Taken together, our results recovered the subdivision of Drosophila into six different lineages, one of which clusters species of the Zygothrica genus group (except for H. duncani). The divergence of this lineage was dated to the Oligocene, ∼ 31 Mya, and seems to have occurred in the same timeframe as other key diversifications within Drosophila.
According to the concatenated and MSC strategies, this lineage is sister to the clade joining Drosophila (Siphlodora) with the Hawaiian Drosophila and Scaptomyza. Likelihood mapping, quartet sampling, reticulation reconstructions as well as introgression tests revealed that this lineage was the target of several hybridization events involving the ancestors of different Drosophila lineages. Thus, our results generally show introgression as a major source of previous incongruence. Nevertheless, the similar diversification times recovered for several of the Neotropical Drosophila lineages also support the scenario of multiple and simultaneous diversifications taking place at the base of Drosophilidae phylogeny, at least in the Neotropics.
Affiliation(s)
- Maiara Hartwig Bessa
- Programa de Pós-Graduação em Biodiversidade Animal (PPGBA), Universidade Federal de Santa Maria (UFSM), Brazil
| | - Marco Silva Gottschalk
- Programa de Pós-Graduação em Biodiversidade Animal (PPGBDiv), Instituto de Biologia, Universidade Federal de Pelotas (UFPel), Brazil
| | - Lizandra Jaqueline Robe
- Programa de Pós-Graduação em Biodiversidade Animal (PPGBA), Universidade Federal de Santa Maria (UFSM), Brazil.
| |
|
14
|
Jofre GI, Dagilis AJ, Sepúlveda VE, Anspach T, Singh A, Chowdhary A, Matute DR. Admixture in the fungal pathogen Blastomyces. Genetics 2024; 228:iyae155. [PMID: 39315610 PMCID: PMC11631411 DOI: 10.1093/genetics/iyae155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 08/19/2024] [Accepted: 09/20/2024] [Indexed: 09/25/2024] Open
Abstract
Blastomyces is an emerging primary fungal pathogen that affects patients worldwide. The evolutionary processes that have resulted in the current diversity in the genus remain largely unexplored. We used whole genome sequences from 99 Blastomyces isolates, including two sequenced in this study using long-read technologies, to infer the phylogenetic relationships between Blastomyces species. We find that five different methods infer five different phylogenetic trees. Additionally, we find gene tree discordance along the genome with differences in the relative phylogenetic placement of several species of Blastomyces, which we hypothesize is caused by introgression. Our results suggest the urgent need to systematically collect Blastomyces samples around the world and study the evolutionary processes that govern intra- and interspecific variation in these medically important fungi.
Affiliation(s)
- Gaston I Jofre
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Andrius J Dagilis
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Ecology, Evolution and Behavior, University of Connecticut, Storrs, CT 06269, USA
| | | | - Tayte Anspach
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Ashutosh Singh
- National Reference Laboratory for Antimicrobial Resistance in Fungal Pathogens, Medical Mycology Unit, Department of Microbiology, Vallabhbhai Patel Chest Institute, University of Delhi, Delhi 110021, India
| | - Anuradha Chowdhary
- National Reference Laboratory for Antimicrobial Resistance in Fungal Pathogens, Medical Mycology Unit, Department of Microbiology, Vallabhbhai Patel Chest Institute, University of Delhi, Delhi 110021, India
| | - Daniel R Matute
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
| |
|
15
|
Wong TKF, Cherryh C, Rodrigo AG, Hahn MW, Minh BQ, Lanfear R. MAST: Phylogenetic Inference with Mixtures Across Sites and Trees. Syst Biol 2024; 73:375-391. [PMID: 38421146 PMCID: PMC11282360 DOI: 10.1093/sysbio/syae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Revised: 12/18/2023] [Accepted: 02/27/2024] [Indexed: 03/02/2024] Open
Abstract
Hundreds or thousands of loci are now routinely used in modern phylogenomic studies. Concatenation approaches to tree inference assume that there is a single topology for the entire dataset, but different loci may have different evolutionary histories due to incomplete lineage sorting (ILS), introgression, and/or horizontal gene transfer; even single loci may not be treelike due to recombination. To overcome this shortcoming, we introduce an implementation of a multi-tree mixture model that we call mixtures across sites and trees (MAST). This model extends a prior implementation by Boussau et al. (2009) by allowing users to estimate the weight of each of a set of pre-specified bifurcating trees in a single alignment. The MAST model allows each tree to have its own weight, topology, branch lengths, substitution model, nucleotide or amino acid frequencies, and model of rate heterogeneity across sites. We implemented the MAST model in a maximum-likelihood framework in the popular phylogenetic software, IQ-TREE. Simulations show that we can accurately recover the true model parameters, including branch lengths and tree weights for a given set of tree topologies, under a wide range of biologically realistic scenarios. We also show that we can use standard statistical inference approaches to reject a single-tree model when data are simulated under multiple trees (and vice versa). We applied the MAST model to multiple primate datasets and found that it can recover the signal of ILS in the Great Apes, as well as the asymmetry in minor trees caused by introgression among several macaque species. When applied to a dataset of 4 Platyrrhine species for which standard concatenated maximum likelihood (ML) and gene tree approaches disagree, we observe that MAST gives the highest weight (i.e., the largest proportion of sites) to the tree also supported by gene tree approaches. 
These results suggest that the MAST model is able to analyze a concatenated alignment using ML while avoiding some of the biases that come with assuming there is only a single tree. We discuss how the MAST model can be extended in the future.
Affiliation(s)
- Thomas K F Wong
- School of Computing, Australian National University, Canberra, ACT 2601, Australia
| | - Caitlin Cherryh
- Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
| | - Allen G Rodrigo
- School of Biological Sciences, University of Auckland, Auckland 1142, New Zealand
| | - Matthew W Hahn
- Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana 47405, USA
| | - Bui Quang Minh
- School of Computing, Australian National University, Canberra, ACT 2601, Australia
| | - Robert Lanfear
- Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
| |
|
16
|
Fan Z, Zhang R, Zhou A, Hey J, Song Y, Osada N, Hamada Y, Yue B, Xing J, Li J. Genomic Evidence for the Complex Evolutionary History of Macaques (Genus Macaca). J Mol Evol 2024; 92:286-299. [PMID: 38634872 DOI: 10.1007/s00239-024-10166-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 03/20/2024] [Indexed: 04/19/2024]
Abstract
The genus Macaca is widely distributed, occupies a variety of habitats, shows diverse phenotypic characteristics, and is one of the best-studied genera of nonhuman primates. Here, we report five resequenced Macaca genomes, including one M. cyclopis, one M. fuscata, one M. thibetana, one M. silenus, and one M. sylvanus. Together with published genomes of other macaque species, we combined 20 genome sequences of 10 macaque species to investigate gene introgression and genetic differences among the species. The network analysis of the SNV-fragment trees indicates a reticular phylogeny of macaque species. Combining the results from various analytical methods, we identified extensive ancient introgression events among macaque species. Multiple introgression signals between different species groups were also observed, such as between fascicularis group species and silenus group species. However, gene flow signals between the fascicularis and sinica groups were not as strong as those between the fascicularis group and the silenus group. On the other hand, unidirectional gene flow in M. arctoides probably occurred between the progenitor of M. arctoides and the common ancestor of the fascicularis group. Our study also shows that the genetic backgrounds and genetic diversity of different macaques vary dramatically among species, even among populations of the same species. In conclusion, using whole genome sequences and multiple methods, we have studied the evolutionary history of the genus Macaca and provided evidence for extensive introgression among the species.
Affiliation(s)
- Zhenxin Fan
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, Sichuan, People's Republic of China
| | - Rusong Zhang
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, Sichuan, People's Republic of China
| | - Anbo Zhou
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ, 08854, USA
| | - Jody Hey
- Department of Biology, Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, USA
| | - Yang Song
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, Sichuan, People's Republic of China
| | - Naoki Osada
- Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Hokkaido, 060-0814, Japan
| | - Yuzuru Hamada
- National Primate Research Center of Thailand, Chulalongkorn University, Bangkok, Thailand
| | - Bisong Yue
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, Sichuan, People's Republic of China
| | - Jinchuan Xing
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ, 08854, USA
| | - Jing Li
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, Sichuan, People's Republic of China.
| |
|
17
|
Zhang R, Drummond AJ, Mendes FK. Fast Bayesian Inference of Phylogenies from Multiple Continuous Characters. Syst Biol 2024; 73:102-124. [PMID: 38085256 PMCID: PMC11129596 DOI: 10.1093/sysbio/syad067] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/06/2023] [Revised: 03/23/2023] [Accepted: 11/07/2023] [Indexed: 05/28/2024] Open
Abstract
Time-scaled phylogenetic trees are an ultimate goal of evolutionary biology and a necessary ingredient in comparative studies. The accumulation of genomic data has resolved the tree of life to a great extent, yet timing evolutionary events remains challenging if not impossible without external information such as fossil ages and morphological characters. Methods for incorporating morphology in tree estimation have lagged behind their molecular counterparts, especially in the case of continuous characters. Despite recent advances, such tools are still direly needed as we approach the limits of what molecules can teach us. Here, we implement a suite of state-of-the-art methods for leveraging continuous morphology in phylogenetics, and by conducting extensive simulation studies we thoroughly validate and explore our methods' properties. While retaining model generality and scalability, we make it possible to estimate absolute and relative divergence times from multiple continuous characters while accounting for uncertainty. We compile and analyze one of the most data-type diverse data sets to date, comprised of contemporaneous and ancient molecular sequences, and discrete and continuous morphological characters from living and extinct Carnivora taxa. We conclude by synthesizing lessons about our method's behavior, and suggest future research venues.
Affiliation(s)
- Rong Zhang
- Programme in Emerging Infectious Diseases, Duke-NUS Medical School 169857, Singapore
| | - Alexei J Drummond
- Centre for Computational Evolution, The University of Auckland, Auckland 1010, New Zealand
- School of Biological Sciences, The University of Auckland, Auckland 1010, New Zealand
| | - Fábio K Mendes
- Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
| |
|
18
|
Das S, Greenbaum E, Brecko J, Pauwels OSG, Ruane S, Pirro S, Merilä J. Phylogenomics of Psammodynastes and Buhoma (Elapoidea: Serpentes), with the description of a new Asian snake family. Sci Rep 2024; 14:9489. [PMID: 38664489 PMCID: PMC11045840 DOI: 10.1038/s41598-024-60215-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Accepted: 04/19/2024] [Indexed: 04/28/2024] Open
Abstract
Asian mock vipers of the genus Psammodynastes and African forest snakes of the genus Buhoma are two genera belonging to the snake superfamily Elapoidea. The phylogenetic placement of Psammodynastes and Buhoma within Elapoidea has been extremely unstable, which has resulted in their uncertain and debated taxonomy. We used ultraconserved elements and traditional nuclear and mitochondrial markers to infer the phylogenetic relationships of these two genera with other elapoids. Psammodynastes, for which a reference genome has been sequenced, was found, with strong branch support, to be a relatively early diverging lineage within Elapoidea that is sister to a clade consisting of Elapidae, Micrelapidae and Lamprophiidae. Hence, we allocate Psammodynastes to its own family, Psammodynastidae new family. However, the phylogenetic position of Buhoma could not be resolved with a high degree of confidence. Attempts to identify the possible sources of conflict in the rapid radiation of elapoid snakes suggest that both hybridisation/introgression during the rapid diversification, including possible ghost introgression, and incomplete lineage sorting have likely had a confounding role. The usual practice of combining mitochondrial loci with nuclear genomic data appears to mislead phylogeny reconstructions in rapid radiation scenarios, especially in the absence of genome scale data.
Affiliation(s)
- Sunandan Das
- Ecological Genetics Research Unit, Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, 00014, Helsinki, Finland.
| | - Eli Greenbaum
- Department of Biological Sciences, University of Texas at El Paso, 500 W. University Avenue, El Paso, TX, 79968, USA
| | - Jonathan Brecko
- Royal Belgian Institute of Natural Sciences, Rue Vautier 29, 1000, Brussels, Belgium
- Royal Museum for Central Africa, Tervuren, Belgium
| | - Olivier S G Pauwels
- Royal Belgian Institute of Natural Sciences, Rue Vautier 29, 1000, Brussels, Belgium
| | - Sara Ruane
- Life Sciences Section, Negaunee Integrative Research Center, Field Museum, Chicago, IL, USA
| | - Stacy Pirro
- Iridian Genomes Inc., Bethesda, MD, 20817, USA
| | - Juha Merilä
- Ecological Genetics Research Unit, Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, 00014, Helsinki, Finland
- Area of Ecology and Biodiversity, School of Biological Sciences, Kadoorie Biological Sciences Building, The University of Hong Kong, Pokfulam Road, Hong Kong, SAR, China
| |
|
19
|
Mao Y, Harvey WT, Porubsky D, Munson KM, Hoekzema K, Lewis AP, Audano PA, Rozanski A, Yang X, Zhang S, Yoo D, Gordon DS, Fair T, Wei X, Logsdon GA, Haukness M, Dishuck PC, Jeong H, Del Rosario R, Bauer VL, Fattor WT, Wilkerson GK, Mao Y, Shi Y, Sun Q, Lu Q, Paten B, Bakken TE, Pollen AA, Feng G, Sawyer SL, Warren WC, Carbone L, Eichler EE. Structurally divergent and recurrently mutated regions of primate genomes. Cell 2024; 187:1547-1562.e13. [PMID: 38428424 PMCID: PMC10947866 DOI: 10.1016/j.cell.2024.01.052] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Received: 03/28/2023] [Revised: 11/26/2023] [Accepted: 01/31/2024] [Indexed: 03/03/2024]
Abstract
We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.
Affiliation(s)
- Yafei Mao
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Allison Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - David S Gordon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Tyler Fair
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
| | - Xiaoxi Wei
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Ricardo Del Rosario
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Vanessa L Bauer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Will T Fattor
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Gregory K Wilkerson
- Department of Veterinary Sciences, Michale E. Keeling Center for Comparative Medicine and Research, The University of Texas MD Anderson Cancer Center, Bastrop, TX, USA; Department of Clinical Sciences, North Carolina State University, Raleigh, NC, USA
| | - Yuxiang Mao
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China; Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Qiang Sun
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Qing Lu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | | | - Alex A Pollen
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Guoping Feng
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Sara L Sawyer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Wesley C Warren
- Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA; Department of Surgery, School of Medicine, University of Missouri, Columbia, MO, USA; Institute of Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Lucia Carbone
- Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA; Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA; Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, USA; Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
|
20
|
Rivas-González I, Schierup MH, Wakeley J, Hobolth A. TRAILS: Tree reconstruction of ancestry using incomplete lineage sorting. PLoS Genet 2024; 20:e1010836. [PMID: 38330138 PMCID: PMC10880969 DOI: 10.1371/journal.pgen.1010836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 02/21/2024] [Accepted: 01/22/2024] [Indexed: 02/10/2024] Open
Abstract
Genome-wide genealogies of multiple species carry detailed information about demographic and selection processes on individual branches of the phylogeny. Here, we introduce TRAILS, a hidden Markov model that accurately infers time-resolved population genetics parameters, such as ancestral effective population sizes and speciation times, for ancestral branches using a multi-species alignment of three species and an outgroup. TRAILS leverages the information contained in incomplete lineage sorting fragments by modelling genealogies along the genome as rooted three-leaved trees, each with a topology and two coalescent events happening in discretized time intervals within the phylogeny. Posterior decoding of the hidden Markov model can be used to infer the ancestral recombination graph for the alignment and details on demographic changes within a branch. Since TRAILS performs posterior decoding at the base-pair level, genome-wide scans based on the posterior probabilities can be devised to detect deviations from neutrality. Using TRAILS on a human-chimp-gorilla-orangutan alignment, we recover speciation parameters and extract information about the topology and coalescent times at high resolution.
Affiliation(s)
| | - Mikkel H. Schierup
- Bioinformatics Research Center (BiRC), Aarhus University, Aarhus, Denmark
| | - John Wakeley
- Department of Organismic and Evolutionary Biology, Harvard University, Massachusetts, United States of America
| | - Asger Hobolth
- Department of Mathematics, Aarhus University, Aarhus, Denmark
|
21
|
Ray DD, Flagel L, Schrider DR. IntroUNET: Identifying introgressed alleles via semantic segmentation. PLoS Genet 2024; 20:e1010657. [PMID: 38377104 PMCID: PMC10906877 DOI: 10.1371/journal.pgen.1010657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 03/01/2024] [Accepted: 01/29/2024] [Indexed: 02/22/2024] Open
Abstract
A growing body of evidence suggests that gene flow between closely related species is a widespread phenomenon. Alleles that introgress from one species into a close relative are typically neutral or deleterious, but sometimes confer a significant fitness advantage. Given the potential relevance to speciation and adaptation, numerous methods have therefore been devised to identify regions of the genome that have experienced introgression. Recently, supervised machine learning approaches have been shown to be highly effective for detecting introgression. One especially promising approach is to treat population genetic inference as an image classification problem, and feed an image representation of a population genetic alignment as input to a deep neural network that distinguishes among evolutionary models (i.e. introgression or no introgression). However, if we wish to investigate the full extent and fitness effects of introgression, merely identifying genomic regions in a population genetic alignment that harbor introgressed loci is insufficient; ideally we would be able to infer precisely which individuals have introgressed material and at which positions in the genome. Here we adapt a deep learning algorithm for semantic segmentation, the task of correctly identifying the type of object to which each individual pixel in an image belongs, to the task of identifying introgressed alleles. Our trained neural network is thus able to infer, for each individual in a two-population alignment, which of those individual's alleles were introgressed from the other population. We use simulated data to show that this approach is highly accurate, and that it can be readily extended to identify alleles that are introgressed from an unsampled "ghost" population, performing comparably to a supervised learning method tailored specifically to that task. Finally, we apply this method to data from Drosophila, showing that it is able to accurately recover introgressed haplotypes from real data. This analysis reveals that introgressed alleles are typically confined to lower frequencies within genic regions, suggestive of purifying selection, but are found at much higher frequencies in a region previously shown to be affected by adaptive introgression. Our method's success in recovering introgressed haplotypes in challenging real-world scenarios underscores the utility of deep learning approaches for making richer evolutionary inferences from genomic data.
Affiliation(s)
- Dylan D. Ray
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Lex Flagel
- Division of Data Science, Gencove Inc., New York, New York, United States of America
- Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
| | - Daniel R. Schrider
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
|
22
|
Ray DD, Flagel L, Schrider DR. IntroUNET: identifying introgressed alleles via semantic segmentation. bioRxiv 2024:2023.02.07.527435. [PMID: 36865105 PMCID: PMC9979274 DOI: 10.1101/2023.02.07.527435] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 06/18/2023]
Abstract
A growing body of evidence suggests that gene flow between closely related species is a widespread phenomenon. Alleles that introgress from one species into a close relative are typically neutral or deleterious, but sometimes confer a significant fitness advantage. Given the potential relevance to speciation and adaptation, numerous methods have therefore been devised to identify regions of the genome that have experienced introgression. Recently, supervised machine learning approaches have been shown to be highly effective for detecting introgression. One especially promising approach is to treat population genetic inference as an image classification problem, and feed an image representation of a population genetic alignment as input to a deep neural network that distinguishes among evolutionary models (i.e. introgression or no introgression). However, if we wish to investigate the full extent and fitness effects of introgression, merely identifying genomic regions in a population genetic alignment that harbor introgressed loci is insufficient; ideally we would be able to infer precisely which individuals have introgressed material and at which positions in the genome. Here we adapt a deep learning algorithm for semantic segmentation, the task of correctly identifying the type of object to which each individual pixel in an image belongs, to the task of identifying introgressed alleles. Our trained neural network is thus able to infer, for each individual in a two-population alignment, which of those individual's alleles were introgressed from the other population. We use simulated data to show that this approach is highly accurate, and that it can be readily extended to identify alleles that are introgressed from an unsampled "ghost" population, performing comparably to a supervised learning method tailored specifically to that task. Finally, we apply this method to data from Drosophila, showing that it is able to accurately recover introgressed haplotypes from real data. This analysis reveals that introgressed alleles are typically confined to lower frequencies within genic regions, suggestive of purifying selection, but are found at much higher frequencies in a region previously shown to be affected by adaptive introgression. Our method's success in recovering introgressed haplotypes in challenging real-world scenarios underscores the utility of deep learning approaches for making richer evolutionary inferences from genomic data.
Affiliation(s)
- Dylan D. Ray
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Lex Flagel
- Division of Data Science, Gencove Inc., New York, NY 11101, USA
- Department of Plant and Microbial Biology, University of Minnesota, St Paul, MN 55108, USA
| | - Daniel R. Schrider
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
23
|
Bowman J, Enard D, Lynch VJ. Phylogenomics reveals an almost perfect polytomy among the almost ungulates (Paenungulata). bioRxiv 2023:2023.12.07.570590. [PMID: 38106080 PMCID: PMC10723481 DOI: 10.1101/2023.12.07.570590] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/19/2023]
Abstract
Phylogenetic studies have resolved most relationships among Eutherian Orders. However, the branching order of elephants (Proboscidea), hyraxes (Hyracoidea), and sea cows (Sirenia) (i.e., the Paenungulata) has remained uncertain since at least 1758, when Linnaeus grouped elephants and manatees into a single Order (Bruta) to the exclusion of hyraxes. Subsequent morphological, molecular, and large-scale phylogenomic datasets have reached conflicting conclusions on the branching order within Paenungulates. We use a phylogenomic dataset of alignments from 13,388 protein-coding genes across 261 Eutherian mammals to infer phylogenetic relationships within Paenungulates. We find that gene trees almost equally support the three alternative resolutions of Paenungulate relationships and that despite strong support for a Proboscidea+Hyracoidea split in the multispecies coalescent (MSC) tree, there is significant evidence for gene tree uncertainty, incomplete lineage sorting, and introgression among Proboscidea, Hyracoidea, and Sirenia. Indeed, only 8-10% of genes have statistically significant phylogenetic signal to reject the hypothesis of a Paenungulate polytomy. These data indicate little support for any resolution of the branching order of Proboscidea, Hyracoidea, and Sirenia within Paenungulata and suggest that Paenungulata may be as close to a real, or at least unresolvable, polytomy as possible.
Affiliation(s)
- Jacob Bowman
- Department of Biological Sciences, University at Buffalo, SUNY, 551 Cooke Hall, Buffalo, NY, USA
| | - David Enard
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
| | - Vincent J. Lynch
- Department of Biological Sciences, University at Buffalo, SUNY, 551 Cooke Hall, Buffalo, NY, USA
|
24
|
Lescroart J, Bonilla-Sánchez A, Napolitano C, Buitrago-Torres DL, Ramírez-Chaves HE, Pulido-Santacruz P, Murphy WJ, Svardal H, Eizirik E. Extensive Phylogenomic Discordance and the Complex Evolutionary History of the Neotropical Cat Genus Leopardus. Mol Biol Evol 2023; 40:msad255. [PMID: 37987559 PMCID: PMC10701098 DOI: 10.1093/molbev/msad255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/07/2023] [Accepted: 11/13/2023] [Indexed: 11/22/2023] Open
Abstract
Even in the genomics era, the phylogeny of Neotropical small felids comprised in the genus Leopardus remains contentious. We used whole-genome resequencing data to construct a time-calibrated consensus phylogeny of this group, quantify phylogenomic discordance, test for interspecies introgression, and assess patterns of genetic diversity and demographic history. We infer that the Leopardus radiation started in the Early Pliocene as an initial speciation burst, followed by another in its subgenus Oncifelis during the Early Pleistocene. Our findings challenge the long-held notion that ocelot (Leopardus pardalis) and margay (L. wiedii) are sister species and instead indicate that margay is most closely related to the enigmatic Andean cat (L. jacobita), whose whole-genome data are reported here for the first time. In addition, we found that the newly sampled Andean tiger cat (L. tigrinus pardinoides) population from Colombia associates closely with Central American tiger cats (L. tigrinus oncilla). Genealogical discordance was largely attributable to incomplete lineage sorting, yet was augmented by strong gene flow between ocelot and the ancestral branch of Oncifelis, as well as between Geoffroy's cat (L. geoffroyi) and southern tiger cat (L. guttulus). Contrasting demographic trajectories have led to disparate levels of current genomic diversity, with a nearly tenfold difference in heterozygosity between Andean cat and ocelot, spanning the entire range of variability found in extant felids. Our analyses improved our understanding of the speciation history and diversity patterns in this felid radiation, and highlight the benefits to phylogenomic inference of embracing the many heterogeneous signals scattered across the genome.
Affiliation(s)
- Jonas Lescroart
- Department of Biology, University of Antwerp, Antwerp, Belgium
- School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
| | - Alejandra Bonilla-Sánchez
- School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
- Faculty of Exact and Natural Sciences, University of Antioquia, Medellín, Colombia
| | - Constanza Napolitano
- Department of Biological Sciences and Biodiversity, University of Los Lagos, Osorno, Chile
- Institute of Ecology and Biodiversity, Concepción, Chile
- Cape Horn International Center, Puerto Williams, Chile
- Andean Cat Alliance, Villa Carlos Paz, Argentina
| | - Diana L Buitrago-Torres
- School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
| | - Héctor E Ramírez-Chaves
- Department of Biological Sciences, University of Caldas, Manizales, Colombia
- Centro de Museos, Museo de Historia Natural, University of Caldas, Manizales, Colombia
| | | | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
| | - Hannes Svardal
- Department of Biology, University of Antwerp, Antwerp, Belgium
- Naturalis Biodiversity Center, Leiden, Netherlands
| | - Eduardo Eizirik
- School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
- Instituto Pró-Carnívoros, Atibaia, Brazil
|
25
|
Tan X, Qi J, Liu Z, Fan P, Liu G, Zhang L, Shen Y, Li J, Roos C, Zhou X, Li M. Phylogenomics Reveals High Levels of Incomplete Lineage Sorting at the Ancestral Nodes of the Macaque Radiation. Mol Biol Evol 2023; 40:msad229. [PMID: 37823401 PMCID: PMC10638670 DOI: 10.1093/molbev/msad229] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/23/2023] [Revised: 09/06/2023] [Accepted: 10/08/2023] [Indexed: 10/13/2023] Open
Abstract
The genus Macaca includes 23 species assigned into 4 to 7 groups. It exhibits the largest geographic range and represents the most successful example of adaptive radiation of nonhuman primates. However, intrageneric phylogenetic relationships among species remain controversial and have not been resolved so far. In this study, we conducted a phylogenomic analysis on 16 newly generated and 8 published macaque genomes. We found strong evidence supporting the division of this genus into 7 species groups. Incomplete lineage sorting (ILS) was the primary factor contributing to the discordance observed among gene trees; however, we also found evidence of hybridization events, specifically between the ancestral arctoides/sinica and silenus/nigra lineages that resulted in the hybrid formation of the fascicularis/mulatta group. Combined with fossil data, our phylogenomic data were used to establish a scenario for macaque radiation. These findings provide insights into ILS and potential ancient introgression events that were involved in the radiation of macaques, which will lead to a better understanding of the rapid speciation occurring in nonhuman primates.
Affiliation(s)
- Xinxin Tan
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Geneplus-Beijing Institute, Beijing 102206, China
| | - Jiwei Qi
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Zhijin Liu
- College of Life Sciences, Capital Normal University, Beijing 100049, China
| | - Pengfei Fan
- School of Life Sciences, Sun Yat-Sen University, Guangzhou 510275, China
| | - Gaoming Liu
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Liye Zhang
- Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen 37077, Germany
| | - Ying Shen
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Jing Li
- Key Laboratory of Bio-resources and Eco-environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610064, China
| | - Christian Roos
- Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen 37077, Germany
- Gene Bank of Primates, German Primate Center, Leibniz Institute for Primate Research, Göttingen 37077, Germany
| | - Xuming Zhou
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Ming Li
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
|
26
|
Waters JM, Campbell CSM, Dutoit L. Fish biogeography and hybridization: do contemporary distributions predict introgression history? Evolution 2023; 77:2409-2419. [PMID: 37587034 DOI: 10.1093/evolut/qpad147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 07/26/2023] [Accepted: 08/14/2023] [Indexed: 08/18/2023]
Abstract
Freshwater ecosystems frequently house diverse assemblages of closely related fish taxa, which can be particularly prone to hybridization and introgression. While extensive introgression may be expected among biogeographically proximate lineages, recent analyses imply that contemporary distributions do not always accurately predict hybridization history. Here, we use the ABBA-BABA approach to test biogeographic hypotheses regarding the extent of hybridization in the recent evolution of New Zealand's species-rich freshwater Galaxias vulgaris fish complex. Genome-wide comparisons reveal significant increases in introgression associated with increasing geographic overlap of taxa. The estimator DP, which assesses the net proportion of a genome originating from introgression, shows a particularly strong relationship with biogeographic overlap (R² = .43; p = .005). Our analyses nevertheless reveal surprisingly substantial signatures of introgression among taxa that currently have disjunct distributions within drainages (e.g., separate subcatchments). These "anomalies" imply that current biogeography is not always an accurate predictor of introgression history. Our study suggests that both modern and ancient biogeographic shifts, including recent anthropogenic range fragmentation and tectonically driven river capture events, have influenced introgression histories in this dynamic freshwater fish radiation.
Affiliation(s)
| | | | - Ludovic Dutoit
- Department of Zoology, University of Otago, Dunedin, New Zealand
|
27
|
Roberts WR, Ruck EC, Downey KM, Pinseel E, Alverson AJ. Resolving Marine-Freshwater Transitions by Diatoms Through a Fog of Gene Tree Discordance. Syst Biol 2023; 72:984-997. [PMID: 37335140 DOI: 10.1093/sysbio/syad038] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/12/2022] [Revised: 06/02/2023] [Accepted: 06/16/2023] [Indexed: 06/21/2023] Open
Abstract
Despite the obstacles facing marine colonists, most lineages of aquatic organisms have colonized and diversified in freshwaters repeatedly. These transitions can trigger rapid morphological or physiological change and, on longer timescales, lead to increased rates of speciation and extinction. Diatoms are a lineage of ancestrally marine microalgae that have diversified throughout freshwater habitats worldwide. We generated a phylogenomic data set of genomes and transcriptomes for 59 diatom taxa to resolve freshwater transitions in one lineage, the Thalassiosirales. Although most parts of the species tree were consistently resolved with strong support, we had difficulties resolving a Paleocene radiation, which affected the placement of one freshwater lineage. This and other parts of the tree were characterized by high levels of gene tree discordance caused by incomplete lineage sorting and low phylogenetic signal. Despite differences in species trees inferred from concatenation versus summary methods and codons versus amino acids, traditional methods of ancestral state reconstruction supported six transitions into freshwaters, two of which led to subsequent species diversification. Evidence from gene trees, protein alignments, and diatom life history together suggest that habitat transitions were largely the product of homoplasy rather than hemiplasy, a condition where transitions occur on branches in gene trees not shared with the species tree. Nevertheless, we identified a set of putatively hemiplasious genes, many of which have been associated with shifts to low salinity, indicating that hemiplasy played a small but potentially important role in freshwater adaptation. Accounting for differences in evolutionary outcomes, in which some taxa became locked into freshwaters while others were able to return to the ocean or become salinity generalists, might help further distinguish different sources of adaptive mutation in freshwater diatoms.
Affiliation(s)
- Wade R Roberts
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
| | - Elizabeth C Ruck
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
| | - Kala M Downey
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
| | - Eveline Pinseel
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
| | - Andrew J Alverson
- Department of Biological Sciences, University of Arkansas, 1 University of Arkansas, Fayetteville, AR, 72701, USA
|
28
|
Thomas GWC, Hughes JJ, Kumon T, Berv JS, Nordgren CE, Lampson M, Levine M, Searle JB, Good JM. The genomic landscape, causes, and consequences of extensive phylogenomic discordance in Old World mice and rats. bioRxiv 2023:2023.08.28.555178. [PMID: 37693498 PMCID: PMC10491188 DOI: 10.1101/2023.08.28.555178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
A species tree is a central concept in evolutionary biology whereby a single branching phylogeny reflects relationships among species. However, the phylogenies of different genomic regions often differ from the species tree. Although tree discordance is often widespread in phylogenomic studies, we still lack a clear understanding of how variation in phylogenetic patterns is shaped by genome biology or the extent to which discordance may compromise comparative studies. We characterized patterns of phylogenomic discordance across the murine rodents (Old World mice and rats) - a large and ecologically diverse group that gave rise to the mouse and rat model systems. Combining new linked-read genome assemblies for seven murine species with eleven published rodent genomes, we first used ultra-conserved elements (UCEs) to infer a robust species tree. We then used whole genomes to examine finer-scale patterns of discordance and found that phylogenies built from proximate chromosomal regions had similar phylogenies. However, there was no relationship between tree similarity and local recombination rates in house mice, suggesting that genetic linkage influences phylogenetic patterns over deeper timescales. This signal may be independent of contemporary recombination landscapes. We also detected a strong influence of linked selection whereby purifying selection at UCEs led to less discordance, while genes experiencing positive selection showed more discordant and variable phylogenetic signals. Finally, we show that assuming a single species tree can result in high error rates when testing for positive selection under different models. Collectively, our results highlight the complex relationship between phylogenetic inference and genome biology and underscore how failure to account for this complexity can mislead comparative genomic studies.
Affiliation(s)
- Gregg W. C. Thomas
- Division of Biological Sciences, University of Montana, Missoula, MT, 59801
- Informatics Group, Harvard University, Cambridge, MA, 02138
| | - Jonathan J. Hughes
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
- Department of Evolution, Ecology, and Organismal Biology, University of California Riverside, Riverside, CA, 92521
| | - Tomohiro Kumon
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
| | - Jacob S. Berv
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109
| | - C. Erik Nordgren
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
| | - Michael Lampson
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
| | - Mia Levine
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
| | - Jeremy B. Searle
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
| | - Jeffrey M. Good
- Division of Biological Sciences, University of Montana, Missoula, MT, 59801
|
29
|
Glasenapp MR, Pogson GH. Extensive introgression among strongylocentrotid sea urchins revealed by phylogenomics. Ecol Evol 2023; 13:e10446. [PMID: 37636863 PMCID: PMC10451471 DOI: 10.1002/ece3.10446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 08/01/2023] [Accepted: 08/07/2023] [Indexed: 08/29/2023] Open
Abstract
Gametic isolation is thought to play an important role in the evolution of reproductive isolation in broadcast-spawning marine invertebrates. However, it is unclear whether gametic isolation commonly evolves early in the speciation process or only accumulates after other reproductive barriers are already in place. It is also unknown whether gametic isolation is an effective barrier to introgression following speciation. Here, we used whole-genome sequencing data and multiple complementary phylogenomic approaches to test whether the well-documented gametic incompatibilities among the strongylocentrotid sea urchins have limited introgression. We quantified phylogenetic discordance, inferred reticulate phylogenetic networks, and applied the Δ statistic using gene tree topologies reconstructed from multiple sequence alignments of protein-coding single-copy orthologs. In addition, we conducted ABBA-BABA tests on genome-wide single nucleotide variants and reconstructed a phylogeny of mitochondrial genomes. Our results revealed strong mito-nuclear discordance and considerable nonrandom gene tree discordance that cannot be explained by incomplete lineage sorting alone. Eight of the nine species examined demonstrated a history of introgression with at least one other species or ancestral lineage, indicating that introgression was common during the diversification of the strongylocentrotid urchins. There was strong support for introgression between four extant species pairs (Strongylocentrotus pallidus ⇔ S. droebachiensis, S. intermedius ⇔ S. pallidus, S. purpuratus ⇔ S. fragilis, and Mesocentrotus franciscanus ⇔ Pseudocentrotus depressus) and additional evidence for introgression on internal branches of the phylogeny. Our results suggest that the existing gametic incompatibilities among the strongylocentrotid urchin species have not been a complete barrier to hybridization and introgression following speciation. Their continued divergence in the face of widespread introgression indicates that other reproductive isolating barriers likely exist and may have been more critical in establishing reproductive isolation early in speciation.
Affiliation(s)
- Matthew R. Glasenapp
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, California, USA
| | - Grant H. Pogson
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, California, USA
|
30
|
Ilík V, Kreisinger J, Modrý D, Schwarz EM, Tagg N, Mbohli D, Nkombou IC, Petrželková KJ, Pafčo B. High diversity and sharing of strongylid nematodes in humans and great apes co-habiting an unprotected area in Cameroon. PLoS Negl Trop Dis 2023; 17:e0011499. [PMID: 37624869 PMCID: PMC10484444 DOI: 10.1371/journal.pntd.0011499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 09/07/2023] [Accepted: 07/03/2023] [Indexed: 08/27/2023] Open
Abstract
Rapid increases in human populations and environmental changes of past decades have led to changes in rates of contact and spatial overlap with wildlife. Together with other historical, social and environmental processes, this has significantly contributed to pathogen transmission in both directions, especially between humans and non-human primates, whose close phylogenetic relationship facilitates cross-infections. Using high-throughput amplicon sequencing, we studied strongylid communities in sympatric western lowland gorillas, central chimpanzees and humans co-occurring in an unprotected area in the northern periphery of the Dja Faunal Reserve, Cameroon. At the genus level, we classified 65 strongylid ITS-2 amplicon sequencing variants (ASVs) in humans and great apes. Great apes exhibited higher strongylid diversity than humans. Necator and Oesophagostomum were the most prevalent genera, and we commonly observed mixed infections of more than one strongylid species. Human strongylid communities were dominated by the human hookworm N. americanus, while great apes were mainly infected with N. gorillae, O. stephanostomum and trichostrongylids. We were also able to detect rare strongylid taxa (such as Ancylostoma and Ternidens). We detected eight ASVs shared between humans and great apes (four N. americanus variants, two N. gorillae variants, one O. stephanostomum type I and one Trichostrongylus sp. type II variant). Our results show that knowledge of strongylid communities in primates, including humans, is still limited. Sharing the same habitat, especially outside protected areas (where access to the forest is not restricted), can enable mutual parasite exchange and can even override host phylogeny or conserved patterns. Such studies are critical for assessing the threats posed to all hosts by increasing human-wildlife spatial overlap. In this study, the term "contact" refers to physical contact, while "spatial overlap" refers to environmental contact.
Affiliation(s)
- Vladislav Ilík
- Department of Botany and Zoology, Faculty of Science, Masaryk University, Brno, Czech Republic
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
- Jakub Kreisinger
- Department of Zoology, Faculty of Science, Charles University, Praha, Czech Republic
- David Modrý
- Department of Botany and Zoology, Faculty of Science, Masaryk University, Brno, Czech Republic
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, Ceske Budejovice, Czech Republic
- Department of Veterinary Sciences, Faculty of Agrobiology, Food and Natural Resources/CINeZ, Czech University of Life Sciences Prague, Prague, Czech Republic
- Erich Marquard Schwarz
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Nikki Tagg
- Centre for Research and Conservation/KMDA, Antwerp, Belgium
- Donald Mbohli
- Association de la Protection des Grands Singes, Yaoundé, Cameroon
- Klára Judita Petrželková
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, Ceske Budejovice, Czech Republic
- Barbora Pafčo
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
31
Langschied F, Leisegang MS, Brandes RP, Ebersberger I. ncOrtho: efficient and reliable identification of miRNA orthologs. Nucleic Acids Res 2023; 51:e71. [PMID: 37260093 PMCID: PMC10359484 DOI: 10.1093/nar/gkad467] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/06/2022] [Revised: 05/04/2023] [Accepted: 05/30/2023] [Indexed: 06/02/2023]
Abstract
MicroRNAs (miRNAs) are post-transcriptional regulators that fine-tune gene expression via translational repression or degradation of their target mRNAs. Despite their functional relevance, frameworks for the scalable and accurate detection of miRNA orthologs are missing. Consequently, there is still no comprehensive picture of how miRNAs and their associated regulatory networks have evolved. Here we present ncOrtho, a synteny-informed pipeline for the targeted search of miRNA orthologs in unannotated genome sequences. ncOrtho matches the precision of miRNA annotations derived from multi-tissue transcriptomes, while scaling to the analysis of hundreds of custom-selected species. The presence-absence pattern of orthologs to 266 human miRNA families across 402 vertebrate species reveals four bursts of miRNA acquisition, of which the most recent event occurred in the last common ancestor of higher primates. miRNA families are rarely modified or lost, but notable exceptions for both events exist. miRNA co-ortholog numbers faithfully indicate lineage-specific whole genome duplications, and miRNAs are powerful markers for phylogenomic analyses. Their exceptionally low genetic diversity makes them suitable for resolving clades where the phylogenetic signal is blurred by incomplete lineage sorting of ancestral alleles. In summary, ncOrtho makes it possible to routinely consider miRNAs in evolutionary analyses that were thus far reserved for protein-coding genes.
Affiliation(s)
- Felix Langschied
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt, Germany
- Matthias S Leisegang
- Institute for Cardiovascular Physiology, Goethe University, Frankfurt, Germany
- German Center of Cardiovascular Research (DZHK), Partner site RheinMain, Frankfurt, Germany
- Ralf P Brandes
- Institute for Cardiovascular Physiology, Goethe University, Frankfurt, Germany
- German Center of Cardiovascular Research (DZHK), Partner site RheinMain, Frankfurt, Germany
- Ingo Ebersberger
- Applied Bioinformatics Group, Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt, Germany
- Senckenberg Biodiversity and Climate Research Centre (S-BIK-F), Frankfurt am Main, Germany
- LOEWE Centre for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
32
Rivas-González I, Rousselle M, Li F, Zhou L, Dutheil JY, Munch K, Shao Y, Wu D, Schierup MH, Zhang G. Pervasive incomplete lineage sorting illuminates speciation and selection in primates. Science 2023; 380:eabn4409. [PMID: 37262154 DOI: 10.1126/science.abn4409] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 11/28/2021] [Accepted: 01/19/2023] [Indexed: 06/03/2023]
Abstract
Incomplete lineage sorting (ILS) causes the phylogeny of some parts of the genome to differ from the species tree. In this work, we investigate the frequencies and determinants of ILS in 29 major ancestral nodes across the entire primate phylogeny. We find up to 64% of the genome affected by ILS at individual nodes. We exploit ILS to reconstruct speciation times and ancestral population sizes. Estimated speciation times are much more recent than genomic divergence times and are in good agreement with the fossil record. We show extensive variation of ILS along the genome, mainly driven by recombination but also by the distance to genes, highlighting a major impact of selection on variation along the genome. In many nodes, ILS is reduced more on the X chromosome compared with autosomes than expected under neutrality, which suggests higher impacts of natural selection on the X chromosome. Finally, we show an excess of ILS in genes with immune functions and a deficit of ILS in housekeeping genes. The extensive ILS in primates discovered in this study provides insights into the speciation times, ancestral population sizes, and patterns of natural selection that shape primate evolution.
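The link between ILS proportions, speciation times, and ancestral population sizes can be made concrete with the textbook three-taxon coalescent relation. The sketch below is only an illustration under that standard model (diploid population, constant size, no gene flow); the abstract does not describe the study's inference method, and the function names are hypothetical. With an internal branch of t generations and effective size Ne, the probability that a gene tree differs from the species tree is (2/3)·exp(-t / (2·Ne)).

```python
import math


def discordance_probability(t_generations, n_e):
    """P(gene tree differs from species tree) for three taxa.

    t_generations: length of the internal branch in generations.
    n_e: diploid effective population size on that branch.
    """
    if t_generations < 0 or n_e <= 0:
        raise ValueError("t_generations must be >= 0 and n_e > 0")
    return (2.0 / 3.0) * math.exp(-t_generations / (2.0 * n_e))


def internal_branch_from_ils(p_discordant):
    """Invert the relation: branch length in coalescent units (t / (2 Ne))."""
    if not 0.0 < p_discordant <= 2.0 / 3.0:
        raise ValueError("p_discordant must lie in (0, 2/3]")
    return -math.log(1.5 * p_discordant)


# A discordance proportion near the 2/3 ceiling implies a very short branch.
short_branch = internal_branch_from_ils(0.64)
```

Because the relation saturates at 2/3, discordance proportions close to that ceiling correspond to internal branches that are very short relative to the ancestral population size.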
Affiliation(s)
- Iker Rivas-González
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Fang Li
- BGI-Research, BGI-Wuhan, Wuhan 430074, China
- Institute of Animal Sex and Development, Zhejiang Wanli University, Ningbo 315104, China
- BGI-Research, BGI-Shenzhen, Shenzhen 518083, China
- Long Zhou
- Evolutionary & Organismal Biology Research Center, Zhejiang University School of Medicine, Hangzhou 310058, China
- Women's Hospital, School of Medicine, Zhejiang University, Shangcheng District, Hangzhou 310006, China
- Julien Y Dutheil
- Max Planck Institute for Evolutionary Biology, Plön, Germany
- Institute of Evolution Sciences of Montpellier (ISEM), CNRS, University of Montpellier, IRD, EPHE, 34095 Montpellier, France
- Kasper Munch
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650107, China
- Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Mikkel H Schierup
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Guojie Zhang
- Evolutionary & Organismal Biology Research Center, Zhejiang University School of Medicine, Hangzhou 310058, China
- Women's Hospital, School of Medicine, Zhejiang University, Shangcheng District, Hangzhou 310006, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou 311121, China
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
33
Kuderna LFK, Gao H, Janiak MC, Kuhlwilm M, Orkin JD, Bataillon T, Manu S, Valenzuela A, Bergman J, Rousselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, Schraiber JG, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, Valsecchi J, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin AD, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Rogers J, Farh KKH, Bonet TM. A global catalog of whole-genome diversity from 233 primate species. Science 2023; 380:906-913. [PMID: 37262161 PMCID: PMC12120848 DOI: 10.1126/science.abn7829] [Citation(s) in RCA: 73] [Impact Index Per Article: 36.5] [Received: 12/19/2021] [Accepted: 02/06/2023] [Indexed: 06/03/2023]
Abstract
The rich diversity of morphology and behavior displayed across primate species provides an informative context in which to study the impact of genomic diversity on fundamental biological processes. Analysis of that diversity provides insight into long-standing questions in evolutionary and conservation biology and is urgent given severe threats these species are facing. Here, we present high-coverage whole-genome data from 233 primate species representing 86% of genera and all 16 families. This dataset was used, together with fossil calibration, to create a nuclear DNA phylogeny and to reassess evolutionary divergence times among primate clades. We found within-species genetic diversity across families and geographic regions to be associated with climate and sociality, but not with extinction risk. Furthermore, mutation rates differ across species, potentially influenced by effective population sizes. Lastly, we identified extensive recurrence of missense mutations previously thought to be human specific. This study will open a wide range of research avenues for future primate genomic research.
Affiliation(s)
- Lukas F. K. Kuderna
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, CA 94404, USA
- Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, CA 94404, USA
- Mareike C. Janiak
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
- Martin Kuhlwilm
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Department of Evolutionary Anthropology, University of Vienna, Djerassiplatz 1, 1030 Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Austria
- Joseph D. Orkin
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Département d’anthropologie, Université de Montréal, 3150 Jean-Brillant, Montréal, QC H3T 1N8, Canada
- Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
- Alejandro Valenzuela
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Juraj Bergman
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Section for Ecoinformatics and Biodiversity, Department of Biology, Aarhus University, Aarhus, Denmark
- Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development, Estrada da Bexiga 2584, CEP 69553-225, Tefé, Amazonas, Brazil
- Evolutionary Biology and Ecology (EBE), Département de Biologie des Organismes, Université libre de Bruxelles (ULB), Av. Franklin D. Roosevelt 50, CP 160/12, B-1050 Brussels, Belgium
- Lidia Agueda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
- Julie Blanc
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
- Marta Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
- Dorien de Vries
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
- Ian Goodhead
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
- R. Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
- Julie E. Horvath
- North Carolina Museum of Natural Sciences, Raleigh, NC 27601, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, NC 27707, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- David Juan
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Joshua G. Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, CA 94404, USA
- Fabrício Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas 69080-900, Brazil
- Hazel Byrne
- Department of Anthropology, University of Utah, Salt Lake City, UT 84102, USA
- Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas 69080-900, Brazil
- João Valsecchi
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development, Tefé, Amazonas, Brazil
- Rede de Pesquisa para Estudos sobre Diversidade, Conservação e Uso da Fauna na Amazônia – RedeFauna, Manaus, Amazonas, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica – ComFauna, Iquitos, Loreto, Peru
- Malu Messias
- Universidade Federal de Rondônia, Porto Velho, Rondônia, Brazil
- Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
- Rogerio Rossi
- Instituto de Biociências, Universidade Federal do Mato Grosso, Cuiabá, MT, Brazil
- Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas 69080-900, Brazil
- Department of Biology, Trinity University, San Antonio, TX 78212, USA
- Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Clément J. Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Clifford J. Jolly
- Department of Anthropology, New York University, New York, NY 10003, USA
- Jane Phillips-Conroy
- Department of Neuroscience, Washington University School of Medicine in St. Louis, St. Louis, MO 63110, USA
- Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop TX 78602, USA
- Christian Abee
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop TX 78602, USA
- Joe H. Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop TX 78602, USA
- Sree Kanthaswamy
- School of Mathematical and Natural Sciences, Arizona State University, Phoenix, AZ 85004, USA
- Fekadu Shiferaw
- Guinea Worm Eradication Program, The Carter Center Ethiopia, Addis Ababa, Ethiopia
- Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Long Zhou
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Guojie Zhang
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Liangzhu Laboratory, Zhejiang University Medical Center, 1369 West Wenyi Road, Hangzhou 311121, China
- Women’s Hospital, School of Medicine, Zhejiang University, 1 Xueshi Road, Shangcheng District, Hangzhou 310006, China
- Julius D. Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Head Office, P.O. Box 661, Arusha, Tanzania
- Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, 17493 Greifswald–Insel Riems, Germany
- Minh D. Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University, Hanoi, Vietnam
- Esther Lizano
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
- Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart, Stuttgart, Germany
- Arcadi Navarro
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra. Pg. Luís Companys 23, 08010 Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Av. Doctor Aiguader, N88, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, C. Wellington 30, 08005 Barcelona, Spain
- Tilo Nadler
- Cuc Phuong Commune, Nho Quan District, Ninh Binh Province, Vietnam
- Chiea Chuen Khor
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore
- Jessica Lee
- Mandai Nature, 80 Mandai Lake Road, Singapore
- Patrick Tan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore
- Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore
- Andrew C. Kitchener
- Department of Natural Sciences, National Museums Scotland, Chambers Street, Edinburgh EH1 1JF, UK, and School of Geosciences, Drummond Street, Edinburgh EH8 9XP, UK
- Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, 37077 Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen, 37077 Göttingen, Germany
- Leibniz ScienceCampus Primate Cognition, 37077 Göttingen, Germany
- Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
- Amanda D. Melin
- Department of Anthropology and Archaeology, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada
- Department of Medical Genetics, University of Calgary, 3330 Hospital Drive NW, HMRB 202, Calgary, AB T2N 4N1, Canada
- Alberta Children’s Hospital Research Institute, University of Calgary, 3330 Hospital Drive NW, HMRB 202, Calgary, AB T2N 4N1, Canada
- Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Robin M. D. Beck
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
- Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
- Christian Roos
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Kellnerweg 4, 37077 Göttingen, Germany
- Jean P. Boubli
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
- Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, CA 94404, USA
- Tomas Marques Bonet
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra. Pg. Luís Companys 23, 08010 Barcelona, Spain
34
Shao Y, Zhou L, Li F, Zhao L, Zhang BL, Shao F, Chen JW, Chen CY, Bi X, Zhuang XL, Zhu HL, Hu J, Sun Z, Li X, Wang D, Rivas-González I, Wang S, Wang YM, Chen W, Li G, Lu HM, Liu Y, Kuderna LFK, Farh KKH, Fan PF, Yu L, Li M, Liu ZJ, Tiley GP, Yoder AD, Roos C, Hayakawa T, Marques-Bonet T, Rogers J, Stenson PD, Cooper DN, Schierup MH, Yao YG, Zhang YP, Wang W, Qi XG, Zhang G, Wu DD. Phylogenomic analyses provide insights into primate evolution. Science 2023; 380:913-924. [PMID: 37262173 DOI: 10.1126/science.abn6919] [Citation(s) in RCA: 63] [Impact Index Per Article: 31.5] [Received: 12/16/2021] [Accepted: 01/26/2023] [Indexed: 06/03/2023]
Abstract
Comparative analysis of primate genomes within a phylogenetic context is essential for understanding the evolution of human genetic architecture and primate diversity. We present such a study of 50 primate species spanning 38 genera and 14 families, including 27 genomes first reported here, with many from previously less well represented groups, the New World monkeys and the Strepsirrhini. Our analyses reveal heterogeneous rates of genomic rearrangement and gene evolution across primate lineages. Thousands of genes under positive selection in different lineages play roles in the nervous, skeletal, and digestive systems and may have contributed to primate innovations and adaptations. Our study reveals that many key genomic innovations occurred in the Simiiformes ancestral node and may have had an impact on the adaptive radiation of the Simiiformes and human evolution.
Affiliation(s)
- Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Long Zhou
- Center of Evolutionary & Organismal Biology, and Women's Hospital at Zhejiang University School of Medicine, Hangzhou 310058, China
- Fang Li
- Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- Institute of Animal Sex and Development, Zhejiang Wanli University, Ningbo 315100, China
- Lan Zhao
- Shaanxi Key Laboratory for Animal Conservation, College of Life Sciences, Northwest University, Xi'an 710069, China
- Bao-Lin Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Feng Shao
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing 400715, China
- Chun-Yan Chen
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an 710072, China
- Xupeng Bi
- Center of Evolutionary & Organismal Biology, and Women's Hospital at Zhejiang University School of Medicine, Hangzhou 310058, China
- Xiao-Lin Zhuang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Kunming College of Life Science, University of the Chinese Academy of Sciences, Kunming 650204, China
- Jiang Hu
- Grandomics Biosciences, Beijing 102206, China
- Zongyi Sun
- Grandomics Biosciences, Beijing 102206, China
- Xin Li
- Grandomics Biosciences, Beijing 102206, China
- Depeng Wang
- Grandomics Biosciences, Beijing 102206, China
- Sheng Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Yun-Mei Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Wu Chen
- Guangzhou Zoo & Guangzhou Wildlife Research Center, Guangzhou 510070, China
- Gang Li
- College of Life Sciences, Shaanxi Normal University, Xi'an 710119, China
- Hui-Meng Lu
- School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Yang Liu
- College of Life Sciences, Shaanxi Normal University, Xi'an 710119, China
- Lukas F K Kuderna
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, 08003 Barcelona, Spain
- Illumina Artificial Intelligence Laboratory, Illumina Inc, San Diego, CA 92122, USA
- Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina Inc, San Diego, CA 92122, USA
- Peng-Fei Fan
- School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong 510275, China
- Li Yu
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, School of Life Sciences, Yunnan University, Kunming 650091, China
- Ming Li
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Zhi-Jin Liu
- College of Life Sciences, Capital Normal University, Beijing 100048, China
- George P Tiley
- Department of Biology, Duke University, Durham, NC 27708, USA
- Anne D Yoder
- Department of Biology, Duke University, Durham, NC 27708, USA
- Christian Roos
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, 37077 Göttingen, Germany
- Takashi Hayakawa
- Faculty of Environmental Earth Science, Hokkaido University, Sapporo, Hokkaido 060-0810, Japan
- Japan Monkey Centre, Inuyama, Aichi 484-0081, Japan
- Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, 08003 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08028 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Edifici ICTA-ICP, c/ Columnes s/n, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Jeffrey Rogers
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Peter D Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff CF14 4XN, UK
| | - David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff CF14 4XN, UK
| | | | - Yong-Gang Yao
- Kunming College of Life Science, University of the Chinese Academy of Sciences, Kunming 650204, China
- Key Laboratory of Animal Models and Human Disease Mechanisms of Chinese Academy of Sciences & Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650201, China
- National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic & Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
| | - Ya-Ping Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650201, China
- National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic & Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
| | - Wen Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an 710072, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650201, China
| | - Xiao-Guang Qi
- Shaanxi Key Laboratory for Animal Conservation, College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Guojie Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Center of Evolutionary & Organismal Biology, and Women's Hospital at Zhejiang University School of Medicine, Hangzhou 310058, China
- Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou 311121, China
| | - Dong-Dong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650201, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650201, China
- National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic & Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650107, China
- KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650204, China
|
35
|
Gao H, Hamp T, Ede J, Schraiber JG, McRae J, Singer-Berk M, Yang Y, Dietrich ASD, Fiziev PP, Kuderna LFK, Sundaram L, Wu Y, Adhikari A, Field Y, Chen C, Batzoglou S, Aguet F, Lemire G, Reimers R, Balick D, Janiak MC, Kuhlwilm M, Orkin JD, Manu S, Valenzuela A, Bergman J, Rousselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, do Amaral JV, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Bataillon T, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin A, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Lek M, Sunyaev S, O'Donnell-Luria A, Rehm HL, Xu J, Rogers J, Marques-Bonet T, Farh KKH. The landscape of tolerated genetic variation in humans and primates. Science 2023; 380:eabn8153. [PMID: 37262156 DOI: 10.1126/science.abn8197] [Citation(s) in RCA: 78] [Impact Index Per Article: 39.0] [Received: 12/31/2021] [Accepted: 03/22/2023] [Indexed: 06/03/2023]
Abstract
Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.
Affiliation(s)
- Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Tobias Hamp
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Jeffrey Ede
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Joshua G Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Jeremy McRae
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Moriel Singer-Berk
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
| | - Yanshen Yang
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | | | - Petko P Fiziev
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Lukas F K Kuderna
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Laksshman Sundaram
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Yibing Wu
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Aashish Adhikari
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Yair Field
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Chen Chen
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Serafim Batzoglou
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Francois Aguet
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Gabrielle Lemire
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Rebecca Reimers
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
| | - Daniel Balick
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Mareike C Janiak
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Martin Kuhlwilm
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Department of Evolutionary Anthropology, University of Vienna, Djerassiplatz 1, 1030 Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, 1030 Vienna, Austria
| | - Joseph D Orkin
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Département d'anthropologie, Université de Montréal, 3150 Jean-Brillant, Montréal, QC H3T 1N8, Canada
| | - Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Alejandro Valenzuela
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Juraj Bergman
- Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
- Section for Ecoinformatics & Biodiversity, Department of Biology, Aarhus University, 8000 Aarhus, Denmark
| | | | - Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development, Estrada da Bexiga 2584, Tefé, Amazonas, CEP 69553-225, Brazil
- Evolutionary Biology and Ecology (EBE), Département de Biologie des Organismes, Université libre de Bruxelles (ULB), Av. Franklin D. Roosevelt 50, CP 160/12, B-1050 Brussels, Belgium
| | - Lidia Agueda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
| | - Julie Blanc
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
| | - Marta Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
| | - Dorien de Vries
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Ian Goodhead
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - R Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
| | | | - Julie E Horvath
- North Carolina Museum of Natural Sciences, Raleigh, NC 27601, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, NC 27707, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | | | - David Juan
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | | | | | - Fabrício Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas, 69080-900, Brazil
| | - Hazel Byrne
- Department of Anthropology, University of Utah, Salt Lake City, UT 84102, USA
| | - Iracilda Sampaio
- Universidade Federal do Para, Guamá, Belém - PA, 66075-110, Brazil
| | - Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas, 69080-900, Brazil
| | - João Valsecchi do Amaral
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development, Tefé, Amazonas, 69553-225, Brazil
- Rede de Pesquisa para Estudos sobre Diversidade, Conservação e Uso da Fauna na Amazônia - RedeFauna, Manaus, Amazonas, 69080-900, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica - ComFauna, Iquitos, Loreto, 16001, Peru
| | - Mariluce Messias
- Universidade Federal de Rondonia, Porto Velho, Rondônia, 78900-000, Brazil
- PPGREN - Programa de Pós-Graduação "Conservação e Uso dos Recursos Naturais" and BIONORTE - Programa de Pós-Graduação em Biodiversidade e Biotecnologia da Rede BIONORTE, Universidade Federal de Rondonia, Porto Velho, Rondônia, 78900-000, Brazil
| | - Maria N F da Silva
- Instituto Nacional de Pesquisas da Amazonia, Petrópolis, Manaus - AM, 69067-375, Brazil
| | - Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Rogerio Rossi
- Universidade Federal do Mato Grosso, Boa Esperança, Cuiabá - MT, 78060-900, Brazil
| | - Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas, 69080-900, Brazil
- Department of Biology, Trinity University, San Antonio, TX 78212, USA
| | - Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, 401, Madagascar
| | - Clément J Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, 401, Madagascar
| | - Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, 401, Madagascar
| | | | | | - Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Christian Abee
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Joe H Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Eduardo Fernandez-Duque
- Yale University, New Haven, CT 06520, USA
- Universidad Nacional de Formosa, Argentina
- Fundacion ECO, Formosa, Argentina
| | | | - Fekadu Shiferaw
- Guinea Worm Eradication Program, The Carter Center Ethiopia, PoB 16316, Addis Ababa 1000, Ethiopia
| | - Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
| | - Long Zhou
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
| | - Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
| | - Guojie Zhang
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Liangzhu Laboratory, Zhejiang University Medical Center, 1369 West Wenyi Road, Hangzhou 311121, China
- Women's Hospital, School of Medicine, Zhejiang University, 1 Xueshi Road, Shangcheng District, Hangzhou 310006, China
| | - Julius D Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Head Office, P.O. Box 661, Arusha, Tanzania
| | - Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, 17493 Greifswald - Insel Riems, Germany
| | - Minh D Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University, Hanoi 100000, Vietnam
| | - Esther Lizano
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010 Barcelona, Spain
| | - Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart, 70191 Stuttgart, Germany
| | - Arcadi Navarro
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Edifici ICTA-ICP, c/ Columnes s/n, 08193 Cerdanyola del Vallès, Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Av. Doctor Aiguader, N88, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, C. Wellington 30, 08005 Barcelona, Spain
| | - Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
| | - Tilo Nadler
- Cuc Phuong Commune, Nho Quan District, Ninh Binh Province 430000, Vietnam
| | - Chiea Chuen Khor
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
| | - Jessica Lee
- Mandai Nature, 80 Mandai Lake Road, Singapore 729826, Republic of Singapore
| | - Patrick Tan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore 168582, Republic of Singapore
| | - Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore 168582, Republic of Singapore
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore 168582, Republic of Singapore
| | - Andrew C Kitchener
- Department of Natural Sciences, National Museums Scotland, Chambers Street, Edinburgh EH1 1JF, UK
- School of Geosciences, University of Edinburgh, Drummond Street, Edinburgh EH8 9XP, UK
| | - Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, 37077 Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen, 37077 Göttingen, Germany
- Leibniz Science Campus Primate Cognition, 37077 Göttingen, Germany
| | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra, Pg. Luís Companys 23, 08010 Barcelona, Spain
| | - Amanda Melin
- Department of Anthropology & Archaeology, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada
- Department of Medical Genetics, 3330 Hospital Drive NW, HMRB 202, Calgary, AB T2N 4N1, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada
| | - Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh EH8 9XP, UK
| | | | - Robin M D Beck
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Christian Roos
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Kellnerweg 4, 37077 Göttingen, Germany
| | - Jean P Boubli
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Monkol Lek
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA
| | - Shamil Sunyaev
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Anne O'Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Heidi L Rehm
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Jinbo Xu
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Edifici ICTA-ICP, c/ Columnes s/n, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
|
36
|
Hibbins MS, Breithaupt LC, Hahn MW. Phylogenomic comparative methods: Accurate evolutionary inferences in the presence of gene tree discordance. Proc Natl Acad Sci U S A 2023; 120:e2220389120. [PMID: 37216509 PMCID: PMC10235958 DOI: 10.1073/pnas.2220389120] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/30/2022] [Accepted: 04/24/2023] [Indexed: 05/24/2023] Open
Abstract
Phylogenetic comparative methods have long been a mainstay of evolutionary biology, allowing for the study of trait evolution across species while accounting for their common ancestry. These analyses typically assume a single, bifurcating phylogenetic tree describing the shared history among species. However, modern phylogenomic analyses have shown that genomes are often composed of mosaic histories that can disagree both with the species tree and with each other (so-called discordant gene trees). These gene trees describe shared histories that are not captured by the species tree, and therefore that are unaccounted for in classic comparative approaches. The application of standard comparative methods to species histories containing discordance leads to incorrect inferences about the timing, direction, and rate of evolution. Here, we develop two approaches for incorporating gene tree histories into comparative methods: one that constructs an updated phylogenetic variance-covariance matrix from gene trees, and another that applies Felsenstein's pruning algorithm over a set of gene trees to calculate trait histories and likelihoods. Using simulation, we demonstrate that our approaches generate much more accurate estimates of tree-wide rates of trait evolution than standard methods. We apply our methods to two clades of the wild tomato genus Solanum with varying rates of discordance, demonstrating the contribution of gene tree discordance to variation in a set of floral traits. Our approaches have the potential to be applied to a broad range of classic inference problems in phylogenetics, including ancestral state reconstruction and the inference of lineage-specific rate shifts.
Affiliation(s)
- Mark S Hibbins
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S 3B2, Canada
- Department of Biology, Indiana University, Bloomington, IN 47405
| | - Lara C Breithaupt
- Department of Biology, Indiana University, Bloomington, IN 47405
- Department of Computer Science, Duke University, Durham, NC 27710
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN 47405
- Department of Computer Science, Indiana University, Bloomington, IN 47405
|
37
|
Gao H, Hamp T, Ede J, Schraiber JG, McRae J, Singer-Berk M, Yang Y, Dietrich A, Fiziev P, Kuderna L, Sundaram L, Wu Y, Adhikari A, Field Y, Chen C, Batzoglou S, Aguet F, Lemire G, Reimers R, Balick D, Janiak MC, Kuhlwilm M, Orkin JD, Manu S, Valenzuela A, Bergman J, Rousselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath J, Hvilsom C, Juan D, Frandsen P, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, do Amaral JV, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Bataillon T, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin A, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Lek M, Sunyaev S, O’Donnell A, Rehm H, Xu J, Rogers J, Marques-Bonet T, Farh KKH. The landscape of tolerated genetic variation in humans and primates. bioRxiv 2023:2023.05.01.538953. [PMID: 37205491 PMCID: PMC10187174 DOI: 10.1101/2023.05.01.538953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole genome sequencing data for 809 individuals from 233 primate species, and identified 4.3 million common protein-altering variants with orthologs in human. We show that these variants can be inferred to have non-deleterious effects in human based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.
One Sentence Summary: Deep learning classifier trained on 4.3 million common primate missense variants predicts variant pathogenicity in humans.
Affiliation(s)
- Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Tobias Hamp
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Jeffrey Ede
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Joshua G. Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Jeremy McRae
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Moriel Singer-Berk
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
| | - Yanshen Yang
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Anastasia Dietrich
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Petko Fiziev
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Lukas Kuderna
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Laksshman Sundaram
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Yibing Wu
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Aashish Adhikari
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Yair Field
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Chen Chen
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Serafim Batzoglou
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Francois Aguet
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Gabrielle Lemire
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Rebecca Reimers
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Daniel Balick
- Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Mareike C. Janiak
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Martin Kuhlwilm
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Department of Evolutionary Anthropology, University of Vienna; Djerassiplatz 1, 1030, Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna; 1030, Vienna, Austria
| | - Joseph D. Orkin
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Département d’anthropologie, Université de Montréal; 3150 Jean-Brillant, Montréal, QC, H3T 1N8, Canada
| | - Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR); Ghaziabad, 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology; Hyderabad, 500007, India
| | - Alejandro Valenzuela
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Juraj Bergman
- Bioinformatics Research Centre, Aarhus University; Aarhus, 8000, Denmark
- Section for Ecoinformatics & Biodiversity, Department of Biology, Aarhus University; Aarhus, 8000, Denmark
| | | | - Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development; Estrada da Bexiga 2584, Tefé, Amazonas, CEP 69553-225, Brazil
- Faculty of Sciences, Department of Organismal Biology, Unit of Evolutionary Biology and Ecology, Université Libre de Bruxelles (ULB); Avenue Franklin D. Roosevelt 50, 1050, Brussels, Belgium
| | - Lidia Agueda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
| | - Julie Blanc
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
| | - Marta Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
| | - Dorien de Vries
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Ian Goodhead
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - R. Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine; Houston, Texas, 77030, USA
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine; Houston, Texas, 77030, USA
| | - Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University; SE-75236, Uppsala, Sweden
| | | | - Julie Horvath
- North Carolina Museum of Natural Sciences; Raleigh, North Carolina, 27601, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University; Durham, North Carolina, 27707, USA
- Department of Biological Sciences, North Carolina State University; Raleigh, North Carolina, 27695, USA
- Department of Evolutionary Anthropology, Duke University; Durham, North Carolina, 27708, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | | | - David Juan
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | | | | | - Fabricio Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL); Manaus, Amazonas, 69080-900, Brazil
| | - Hazel Byrne
- Department of Anthropology, University of Utah; Salt Lake City, Utah, 84102, USA
| | - Iracilda Sampaio
- Universidade Federal do Para; Guamá, Belém - PA, 66075-110, Brazil
| | - Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL); Manaus, Amazonas, 69080-900, Brazil
| | - João Valsecchi do Amaral
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development; Tefé, Amazonas, 69553-225, Brazil
- Rede de Pesquisa para Estudos sobre Diversidade, Conservação e Uso da Fauna na Amazônia – RedeFauna; Manaus, Amazonas, 69080-900, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica – ComFauna; Iquitos, Loreto, 16001, Peru
| | - Mariluce Messias
- Universidade Federal de Rondonia; Porto Velho, Rondônia, 78900-000, Brazil
- PPGREN - Programa de Pós-Graduação “Conservação e Uso dos Recursos Naturais” and BIONORTE - Programa de Pós-Graduação em Biodiversidade e Biotecnologia da Rede BIONORTE, Universidade Federal de Rondonia; Porto Velho, Rondônia, 78900-000, Brazil
| | - Maria N. F. da Silva
- Instituto Nacional de Pesquisas da Amazonia; Petrópolis, Manaus - AM, 69067-375, Brazil
| | - Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology; Hyderabad, 500007, India
| | - Rogerio Rossi
- Universidade Federal do Mato Grosso; Boa Esperança, Cuiabá - MT, 78060-900, Brazil
| | - Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL); Manaus, Amazonas, 69080-900, Brazil
- Department of Biology, Trinity University; San Antonio, Texas, 78212, USA
| | - Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga; Mahajanga, 401, Madagascar
| | - Clément J. Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga; Mahajanga, 401, Madagascar
| | - Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga; Mahajanga, 401, Madagascar
| | | | | | - Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center; Houston, Texas, 77030, USA
| | | | - Joe H. Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center; Houston, Texas, 77030, USA
| | - Eduardo Fernandez-Duque
- Yale University; New Haven, Connecticut, 06520, USA
- Universidad Nacional de Formosa, Argentina Fundacion ECO, Formosa, Argentina
| | | | | | - Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences; Kunming, Yunnan, 650223, China
| | - Long Zhou
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, 310058, China
| | - Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences; Kunming, Yunnan, 650223, China
| | - Guojie Zhang
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen; Copenhagen, DK-2100, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, 650223, China
- Liangzhu Laboratory, Zhejiang University Medical Center; 1369 West Wenyi Road, Hangzhou, 311121, China
- Women’s Hospital, School of Medicine, Zhejiang University; 1 Xueshi Road, Shangcheng District, Hangzhou, 310006, China
| | - Julius D. Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Head Office; P.O.Box 661, Arusha, Tanzania
| | - Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health; 17493 Greifswald - Isle of Riems, Germany
| | - Minh D. Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University; Hanoi, 100000, Vietnam
| | - Esther Lizano
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain; Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart; 70191 Stuttgart, Germany
| | - Arcadi Navarro
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra, Pg. Luís Companys 23, Barcelona, 08010, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology; Av. Doctor Aiguader, N88, Barcelona, 08003, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation; C. Wellington 30, Barcelona, 08005, Spain
| | - Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University; Aarhus, 8000, Denmark
| | - Tilo Nadler
- Cuc Phuong Commune; Nho Quan District, Ninh Binh Province, 430000, Vietnam
| | - Chiea Chuen Khor
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
| | - Jessica Lee
- Mandai Nature; 80 Mandai Lake Road, Singapore 729826, Republic of Singapore
| | - Patrick Tan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM); Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School; Singapore 168582, Republic of Singapore
| | - Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM); Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School; Singapore 168582, Republic of Singapore
- SingHealth Duke-NUS Genomic Medicine Centre; Singapore 168582, Republic of Singapore
| | - Andrew C. Kitchener
- Department of Natural Sciences, National Museums Scotland; Chambers Street, Edinburgh, EH1 1JF, UK
- School of Geosciences, University of Edinburgh; Drummond Street, Edinburgh, EH8 9XP, UK
| | - Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research; 37077 Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen; 37077 Göttingen, Germany
| | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
- Universitat Pompeu Fabra, Pg. Luís Companys 23, Barcelona, 08010, Spain
| | - Amanda Melin
- Leibniz Science Campus Primate Cognition; 37077 Göttingen, Germany
- Department of Anthropology & Archaeology and Department of Medical Genetics
| | - Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University; SE-75236, Uppsala, Sweden
- Alberta Children’s Hospital Research Institute; University of Calgary; 2500 University Dr NW T2N 1N4, Calgary, Alberta, Canada
| | | | - Robin M. D. Beck
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR); Ghaziabad, 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology; Hyderabad, 500007, India
| | - Christian Roos
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh; Edinburgh, EH8 9XP, UK
| | - Jean P. Boubli
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Monkol Lek
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research; Kellnerweg 4, 37077 Göttingen, Germany
| | - Shamil Sunyaev
- Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
- Department of Genetics, Yale School of Medicine; New Haven, Connecticut, 06520, USA
| | - Anne O’Donnell
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
| | - Heidi Rehm
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Jinbo Xu
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
- Toyota Technological Institute at Chicago; Chicago, Illinois, 60637, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine; Houston, Texas, 77030, USA
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain; Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra, Pg. Luís Companys 23, Barcelona, 08010, Spain
| | - Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| |
|
38
|
Foley NM, Mason VC, Harris AJ, Bredemeyer KR, Damas J, Lewin HA, Eizirik E, Gatesy J, Karlsson EK, Lindblad-Toh K, Springer MS, Murphy WJ, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. A genomic timescale for placental mammal evolution. Science 2023; 380:eabl8189. [PMID: 37104581 DOI: 10.1126/science.abl8189] [Citation(s) in RCA: 62] [Impact Index Per Article: 31.0] [Indexed: 04/29/2023]
Abstract
The precise pattern and timing of speciation events that gave rise to all living placental mammals remain controversial. We provide a comprehensive phylogenetic analysis of genetic variation across an alignment of 241 placental mammal genome assemblies, addressing prior concerns regarding limited genomic sampling across species. We compared neutral genome-wide phylogenomic signals using concatenation and coalescent-based approaches, interrogated phylogenetic variation across chromosomes, and analyzed extensive catalogs of structural variants. Interordinal relationships exhibit relatively low rates of phylogenomic conflict across diverse datasets and analytical methods. Conversely, X-chromosome versus autosome conflicts characterize multiple independent clades that radiated during the Cenozoic. Genomic time trees reveal an accumulation of cladogenic events before and immediately after the Cretaceous-Paleogene (K-Pg) boundary, implying important roles for Cretaceous continental vicariance and the K-Pg extinction in the placental radiation.
Affiliation(s)
- Nicole M Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
| | - Victor C Mason
- Institute of Cell Biology, University of Bern, Bern, Switzerland
| | - Andrew J Harris
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| | - Kevin R Bredemeyer
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| | - Joana Damas
- The Genome Center, University of California, Davis, CA, USA
| | - Harris A Lewin
- The Genome Center, University of California, Davis, CA, USA
- Department of Evolution and Ecology, University of California, Davis, CA, USA
| | - Eduardo Eizirik
- School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
| | - John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, USA
| | - Elinor K Karlsson
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Molecular Medicine, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Mark S Springer
- Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, USA
| | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| |
|
39
|
Mao Y, Harvey WT, Porubsky D, Munson KM, Hoekzema K, Lewis AP, Audano PA, Rozanski A, Yang X, Zhang S, Gordon DS, Wei X, Logsdon GA, Haukness M, Dishuck PC, Jeong H, Del Rosario R, Bauer VL, Fattor WT, Wilkerson GK, Lu Q, Paten B, Feng G, Sawyer SL, Warren WC, Carbone L, Eichler EE. Structurally divergent and recurrently mutated regions of primate genomes. bioRxiv 2023:2023.03.07.531415. [PMID: 36945442 PMCID: PMC10028934 DOI: 10.1101/2023.03.07.531415] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 03/10/2023]
Abstract
To better understand the pattern of primate genome structural variation, we sequenced and assembled using multiple long-read sequencing technologies the genomes of eight nonhuman primate species, including New World monkeys (owl monkey and marmoset), Old World monkey (macaque), Asian apes (orangutan and gibbon), and African ape lineages (gorilla, bonobo, and chimpanzee). Compared to the human genome, we identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. Across 50 million years of primate evolution, we estimate that 819.47 Mbp or ~27% of the genome has been affected by SVs based on analysis of these primate lineages. We identify 1,607 structurally divergent regions (SDRs) wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (CARDs, ABCD7, OLAH) and new lineage-specific genes are generated (e.g., CKAP2, NEK5) and have become targets of rapid chromosomal diversification and positive selection (e.g., RGPDs). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species for the first time.
Affiliation(s)
- Yafei Mao
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Allison Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - David S Gordon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Xiaoxi Wei
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Ricardo Del Rosario
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Vanessa L Bauer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Will T Fattor
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Gregory K Wilkerson
- Department of Veterinary Sciences, Michale E. Keeling Center for Comparative Medicine and Research, The University of Texas MD Anderson Cancer Center, Bastrop, TX, USA
- Department of Clinical Sciences, North Carolina State University, Raleigh, NC, USA
| | - Qing Lu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Guoping Feng
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Sara L Sawyer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Wesley C Warren
- Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Department of Surgery, School of Medicine, University of Missouri, Columbia, MO, USA
- Institute of Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Lucia Carbone
- Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA
- Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA
- Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, USA
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| |
|
40
|
Wang Y, Wang Y, Cheng X, Ding Y, Wang C, Merilä J, Guo B. Prevalent Introgression Underlies Convergent Evolution in the Diversification of Pungitius Sticklebacks. Mol Biol Evol 2023; 40:7026025. [PMID: 36738166 PMCID: PMC9949714 DOI: 10.1093/molbev/msad026] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/26/2022] [Revised: 12/16/2022] [Accepted: 01/31/2023] [Indexed: 02/05/2023] Open
Abstract
New mutations and standing genetic variations contribute significantly to repeated phenotypic evolution in sticklebacks. However, less is known about the role of introgression in this process. We analyzed taxonomically and geographically comprehensive genomic data from Pungitius sticklebacks to decipher the extent of introgression and its consequences for the diversification of this genus. Our results demonstrate that introgression is more prevalent than suggested by earlier studies. Although gene flow was generally bidirectional, it was often asymmetric and left unequal genomic signatures in hybridizing species, which might, at least partly, be due to biased hybridization and/or population size differences. In several cases, introgression of variants from one species to another was accompanied by transitions of pelvic and/or lateral plate structures (important diagnostic traits in Pungitius systematics) and frequently left signatures of adaptation in the core gene regulatory networks of armor trait development. This finding suggests that introgression has been an important source of genetic variation and enabled phenotypic convergence among Pungitius sticklebacks. The results highlight the importance of introgression of genetic variation as a source of adaptive variation underlying key ecological and taxonomic traits. Taken together, our study indicates that introgression-driven convergence likely explains the long-standing challenges in resolving the taxonomy and systematics of this small but phenotypically highly diverse group of fish.
Affiliation(s)
- Yu Wang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Yingnan Wang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Xiaoqi Cheng
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Yongli Ding
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Chongnv Wang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Juha Merilä
- Ecological Genetics Research Unit, Research Programme in Organismal and Evolutionary Biology, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland; Area of Ecology and Biodiversity, School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China
| | | |
|
41
|
Beck RMD, de Vries D, Janiak MC, Goodhead IB, Boubli JP. Total evidence phylogeny of platyrrhine primates and a comparison of undated and tip-dating approaches. J Hum Evol 2023; 174:103293. [PMID: 36493598 DOI: 10.1016/j.jhevol.2022.103293] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/21/2021] [Revised: 10/21/2022] [Accepted: 10/21/2022] [Indexed: 12/12/2022]
Abstract
There have been multiple published phylogenetic analyses of platyrrhine primates (New World monkeys) using both morphological and molecular data, but relatively few that have integrated both types of data into a total evidence approach. Here, we present phylogenetic analyses of recent and fossil platyrrhines, based on a total evidence data set of 418 morphological characters and 10.2 kilobases of DNA sequence data from 17 nuclear genes taken from previous studies, using undated and tip-dating approaches in a Bayesian framework. We compare the results of these analyses with molecular scaffold analyses using maximum parsimony and Bayesian approaches, and we use a formal information theoretic approach to identify unstable taxa. After a posteriori pruning of unstable taxa, the undated and tip-dating topologies appear congruent with recent molecular analyses and support largely similar relationships, with strong support for Stirtonia as a stem alouattine, Neosaimiri as a stem saimirine, Cebupithecia as a stem pitheciine, and Lagonimico as a stem callitrichid. Both analyses find three Greater Antillean subfossil platyrrhines (Xenothrix, Antillothrix, and Paralouatta) to form a clade that is related to Callicebus, congruent with a single dispersal event by the ancestor of this clade to the Greater Antilles. They also suggest that the fossil Proteropithecia may not be closely related to pitheciines, and that all known platyrrhines older than the Middle Miocene are stem taxa. Notably, the undated analysis found the Early Miocene Panamacebus (currently recognized as the oldest known cebid) to be unstable, and the tip-dating analysis placed it outside crown Platyrrhini. Our tip-dating analysis supports a late Oligocene or earliest Miocene (20.8-27.0 Ma) age for crown Platyrrhini, congruent with recent molecular clock analyses.
Affiliation(s)
- Robin M D Beck
- Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK.
| | - Dorien de Vries
- Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
| | - Mareike C Janiak
- Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
| | - Ian B Goodhead
- Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
| | - Jean P Boubli
- Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
| |
|
42
|
Tomasco IH, Giorello FM, Boullosa N, Feijoo M, Lanzone C, Lessa EP. The contribution of incomplete lineage sorting and introgression to the evolutionary history of the fast-evolving genus Ctenomys (Rodentia, Ctenomyidae). Mol Phylogenet Evol 2022; 176:107593. [PMID: 35905819 DOI: 10.1016/j.ympev.2022.107593] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/24/2022] [Revised: 06/28/2022] [Accepted: 07/21/2022] [Indexed: 10/31/2022]
Abstract
Incomplete lineage sorting (ILS) and introgression have been increasingly recognized as important processes involved in biological differentiation. Both result in incongruences between gene trees and species trees, consequently causing difficulties in phylogenetic reconstruction. This is particularly the case for rapid radiations, as short internodal distances and incomplete reproductive isolation increase the likelihood of both ILS and introgression. Estimation of the relative frequency of these processes requires assessments across many genomic regions. We use transcriptomics to test for introgression and estimate the frequency of incomplete lineage sorting in a set of three closely related and geographically adjacent South American tuco-tuco species (Ctenomys), a genus comprising 64 species resulting from a recent, rapid radiation. After cleaning and filtering, 5764 orthologous genes strongly support paraphyly of C. pearsoni relative to C. brasiliensis (putatively represented by the population of Villa Serrana). In line with earlier phylogenetic work, the C. pearsoni - C. brasiliensis pair is closely related to C. torquatus, whereas C. rionegrensis is more distantly related to these three nominal species. The classical Patterson's D-statistic shows significant signals of introgression from C. torquatus into C. brasiliensis. However, a 5-taxon test shows no significant results. Incomplete lineage sorting was estimated to have involved about 9% of the loci, suggesting it represents an important process in the incipient diversification of tuco-tucos.
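For readers unfamiliar with Patterson's D-statistic mentioned in the abstract above, the following is a minimal sketch of how it is commonly computed from site patterns in a four-taxon alignment (P1, P2, P3, outgroup). It is an illustration only, not the authors' pipeline: the function names and the toy sequences are hypothetical, and significance testing (typically a block jackknife) is omitted.

```python
def count_abba_baba(p1: str, p2: str, p3: str, outgroup: str) -> tuple[int, int]:
    """Count ABBA and BABA sites in four aligned sequences of equal length.

    The outgroup allele is treated as ancestral ("A"); the other allele at a
    biallelic site is treated as derived ("B").
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # skip invariant and multi-allelic sites
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    return abba, baba


def patterson_d(abba: int, baba: int) -> float:
    """D = (ABBA - BABA) / (ABBA + BABA)."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / total


# Toy alignment (hypothetical data): two ABBA sites, one BABA site.
abba, baba = count_abba_baba("AGAA", "GAGA", "GGGA", "AAAA")
d_value = patterson_d(abba, baba)
```

Under incomplete lineage sorting alone, ABBA and BABA sites are expected in roughly equal numbers, so D is near zero; a positive D in this orientation indicates excess allele sharing between P2 and P3.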
Affiliation(s)
- Ivanna H Tomasco
  - Departamento de Ecología y Evolución, Facultad de Ciencias, Universidad de la República, Iguá 4225, Montevideo 11400, Uruguay
- Facundo M Giorello
  - PDU Espacio de Biología Vegetal del Noreste, Centro Universitario de Tacuarembó (CUT), Universidad de la República, Ruta 5 km 386,200, 45000, Tacuarembó, Uruguay
- Nicolás Boullosa
  - Departamento de Ecología y Evolución, Facultad de Ciencias, Universidad de la República, Iguá 4225, Montevideo 11400, Uruguay
- Matías Feijoo
  - Departamento de Sistemas Agrarios y Paisajes Culturales, Centro Universitario Regional Este (CURE), Universidad de la República, Ruta 8 Km 281, Treinta y Tres, Uruguay
- Cecilia Lanzone
  - Laboratorio de Genética Evolutiva, IBS (CONICET-UNaM), FCEQyN, Félix de Azara 1553, Posadas 3300, Misiones, Argentina
- Enrique P Lessa
  - Departamento de Ecología y Evolución, Facultad de Ciencias, Universidad de la República, Iguá 4225, Montevideo 11400, Uruguay
|
43
|
Smith ML, Vanderpool D, Hahn MW. Using all gene families vastly expands data available for phylogenomic inference. Mol Biol Evol 2022; 39:6596367. [PMID: 35642314 PMCID: PMC9178227 DOI: 10.1093/molbev/msac112] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022] Open
Abstract
Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs using clustering approaches and retain families with a single sequence per species. This limits the amount of data available by excluding larger families. Recent advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several methods for species tree inference are robust to the inclusion of paralogs and could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference by examining relationships among 26 primate species in detail and by analyzing five additional data sets. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data. We explore several species tree inference methods, finding that identical trees are returned across nearly all subsets of the data and methods for primates. The relationships among Platyrrhini remain contentious; however, the species tree inference method matters more than the subset of data used. Using data from larger gene families drastically increases the number of genes available and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression in primates. For the other data sets, topological inferences are consistent whether single-copy families or orthologs extracted using decomposition approaches are analyzed. Using larger gene families is a promising approach to include more data in phylogenomics without sacrificing accuracy, at least when high-quality genomes are available.
Affiliation(s)
- Megan L Smith
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
- Dan Vanderpool
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
- Matthew W Hahn
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
|
44
|
Pozzi L, Penna A. Rocks and clocks revised: New promises and challenges in dating the primate tree of life. Evol Anthropol 2022; 31:138-153. [PMID: 35102633 DOI: 10.1002/evan.21940] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2020] [Revised: 10/04/2021] [Accepted: 01/12/2022] [Indexed: 01/14/2023]
Abstract
In recent years, multiple technological and methodological advances have increased our ability to estimate phylogenies, leading to more accurate dating of the primate tree of life. Here we provide an overview of the limitations and potentials of some of these advancements and discuss how dated phylogenies provide the crucial temporal scale required to understand primate evolution. First, we review new methods, such as the total-evidence dating approach, that promise a better integration between the fossil record and molecular data. We then explore how the ever-increasing availability of genomic-level data for more primate species can impact our ability to accurately estimate timetrees. Finally, we discuss more recent applications of mutation rates to date divergence times. We highlight example studies that have applied these approaches to estimate divergence dates within primates. Our goal is to provide a critical overview of these new developments and explore the promises and challenges of their application in evolutionary anthropology.
Affiliation(s)
- Luca Pozzi
  - Department of Anthropology, The University of Texas at San Antonio, San Antonio, Texas, USA
- Anna Penna
  - Department of Anthropology, The University of Texas at San Antonio, San Antonio, Texas, USA
|
45
|
Sigeman H, Sinclair B, Hansson B. findZX: an automated pipeline for detecting and visualising sex chromosomes using whole-genome sequencing data. BMC Genomics 2022; 23:328. [PMID: 35477344 PMCID: PMC9044604 DOI: 10.1186/s12864-022-08432-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/10/2021] [Accepted: 03/01/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Sex chromosomes have evolved numerous times, as revealed by recent genomic studies. However, large gaps in our knowledge of sex chromosome diversity across the tree of life remain. Filling these gaps, through the study of novel species, is crucial for improved understanding of why and how sex chromosomes evolve. Characterization of sex chromosomes in already well-studied organisms is also important to avoid misinterpretations of population genomic patterns caused by undetected sex chromosome variation. RESULTS: Here we present findZX, an automated Snakemake-based computational pipeline for detecting and visualizing sex chromosomes through differences in genome coverage and heterozygosity between any number of males and females. A main feature of the pipeline is the option to perform a genome coordinate liftover to a reference genome of another species. This allows users to inspect sex-linked regions over larger contiguous chromosome regions, while also providing important between-species synteny information. To demonstrate its effectiveness, we applied findZX to publicly available genomic data from species belonging to widely different taxonomic groups (mammals, birds, reptiles, and fish), with sex chromosome systems of different ages, sizes, and levels of differentiation. We also demonstrate that the liftover method is robust over large phylogenetic distances (> 80 million years of evolution). CONCLUSIONS: With findZX we provide an easy-to-use and highly effective tool for identification of sex chromosomes. The pipeline is compatible with both Linux and MacOS systems, and scalable to suit different computational platforms.
Affiliation(s)
- Hanna Sigeman
  - Department of Biology, Lund University, Ecology Building, 223 62, Lund, Sweden
- Bella Sinclair
  - Department of Biology, Lund University, Ecology Building, 223 62, Lund, Sweden
- Bengt Hansson
  - Department of Biology, Lund University, Ecology Building, 223 62, Lund, Sweden
|
46
|
Tigano A, Khan R, Omer AD, Weisz D, Dudchenko O, Multani AS, Pathak S, Behringer RR, Aiden EL, Fisher H, MacManes MD. Chromosome size affects sequence divergence between species through the interplay of recombination and selection. Evolution 2022; 76:782-798. [PMID: 35271737 PMCID: PMC9314927 DOI: 10.1111/evo.14467] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/10/2021] [Accepted: 12/12/2021] [Indexed: 01/21/2023]
Abstract
The structure of the genome shapes the distribution of genetic diversity and sequence divergence. To investigate how the relationship between chromosome size and recombination rate affects sequence divergence between species, we combined empirical analyses and evolutionary simulations. We estimated pairwise sequence divergence among 15 species from three different mammalian clades (Peromyscus rodents, Mus mice, and great apes) from chromosome-level genome assemblies. We found a strong significant negative correlation between chromosome size and sequence divergence in all species comparisons within the Peromyscus and great apes clades but not the Mus clade, suggesting that the dramatic chromosomal rearrangements among Mus species may have masked the ancestral genomic landscape of divergence in many comparisons. Our evolutionary simulations showed that the main factor determining differences in divergence among chromosomes of different sizes is the interplay of recombination rate and selection, with greater variation in larger populations than in smaller ones. In ancestral populations, shorter chromosomes harbor greater nucleotide diversity. As ancestral populations diverge, diversity present at the onset of the split contributes to greater sequence divergence in shorter chromosomes among daughter species. The combination of empirical data and evolutionary simulations revealed that chromosomal rearrangements, demography, and divergence times may also affect the relationship between chromosome size and divergence, thus deepening our understanding of the role of genome structure in the evolution of species divergence.
Affiliation(s)
- Anna Tigano
  - Molecular, Cellular, and Biomedical Sciences Department, University of New Hampshire, Durham, NH 03824, USA
  - Hubbard Center for Genome Studies, University of New Hampshire, Durham, NH 03824, USA
  - Current address: Department of Biology, University of British Columbia – Okanagan Campus, Kelowna, BC V1V 1V7, Canada
- Ruqayya Khan
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Arina D. Omer
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- David Weisz
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Olga Dudchenko
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX 77030, USA
- Asha S. Multani
  - Department of Genetics, M.D. Anderson Cancer Center, University of Texas, Houston, TX 77030, USA
- Sen Pathak
  - Department of Genetics, M.D. Anderson Cancer Center, University of Texas, Houston, TX 77030, USA
- Richard R. Behringer
  - Department of Genetics, M.D. Anderson Cancer Center, University of Texas, Houston, TX 77030, USA
- Erez L. Aiden
  - The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX 77030, USA
  - Center for Theoretical and Biological Physics, Rice University, Houston, TX 77030, USA
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai 201210, China
  - School of Agriculture and Environment, University of Western Australia, Perth, WA 6009, Australia
- Heidi Fisher
  - Department of Biology, University of Maryland, College Park, MD 20742, USA
- Matthew D. MacManes
  - Molecular, Cellular, and Biomedical Sciences Department, University of New Hampshire, Durham, NH 03824, USA
  - Hubbard Center for Genome Studies, University of New Hampshire, Durham, NH 03824, USA
|
47
|
Schull JK, Turakhia Y, Hemker JA, Dally WJ, Bejerano G. Champagne: Automated Whole-Genome Phylogenomic Character Matrix Method Using Large Genomic Indels for Homoplasy-Free Inference. Genome Biol Evol 2022; 14:evac013. [PMID: 35171243 PMCID: PMC8920512 DOI: 10.1093/gbe/evac013] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 01/10/2022] [Indexed: 11/14/2022] Open
Abstract
We present Champagne, a whole-genome method for generating character matrices for phylogenomic analysis using large genomic indel events. By rigorously picking orthologous genes and locating large insertion and deletion events, Champagne delivers a character matrix that considerably reduces homoplasy compared with morphological and nucleotide-based matrices, on both established phylogenies and difficult-to-resolve nodes in the mammalian tree. Champagne provides ample evidence in the form of genomic structural variation to support incomplete lineage sorting and possible introgression in Paenungulata and human-chimp-gorilla which were previously inferred primarily through matrices composed of aligned single-nucleotide characters. Champagne also offers further evidence for Myomorpha as sister to Sciuridae and Hystricomorpha in the rodent tree. Champagne harbors distinct theoretical advantages as an automated method that produces nearly homoplasy-free character matrices on the whole-genome scale.
Affiliation(s)
- James K Schull
  - Department of Computer Science, Stanford University, USA
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California San Diego, USA
- James A Hemker
  - Department of Computer Science, Stanford University, USA
- William J Dally
  - Department of Computer Science, Stanford University, USA
  - NVIDIA, Santa Clara, California, USA
  - Department of Electrical Engineering, Stanford University, USA
- Gill Bejerano
  - Department of Computer Science, Stanford University, USA
  - Department of Developmental Biology, Stanford University, USA
  - Department of Biomedical Data Science, Stanford University, USA
  - Department of Pediatrics, Stanford University, USA
|
48
|
Hagemann L, Grow N, Bohr YEMB, Perwitasari-Farajallah D, Duma Y, Gursky SL, Merker S. Small, odd and old: The mysterious Tarsius pumilus is the most basal Sulawesi tarsier. Biol Lett 2022; 18:20210642. [PMID: 35350878 PMCID: PMC8965421 DOI: 10.1098/rsbl.2021.0642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Accepted: 03/04/2022] [Indexed: 11/12/2022] Open
Abstract
In this study, we present the first genetic evidence of the phylogenetic position of Tarsius pumilus, the mountain tarsier of Sulawesi, Indonesia. This mysterious primate is the only Eastern tarsier species that occurs exclusively in cloud forests above 1800 m.a.s.l. It exhibits striking morphological peculiarities, most prominently its extremely reduced body size, which led to the common name 'pygmy tarsier'. However, our results indicate that T. pumilus is not an aberrant form of a lowland tarsier, but in fact the most basal of all Sulawesi tarsiers. Applying a Bayesian multi-locus coalescent approach, we dated the divergence between the T. pumilus lineage and the ancestor of all other extant Sulawesi tarsiers to 9.88 Mya. This is as deep as the split between the two other tarsier genera Carlito (Philippine tarsiers) and Cephalopachus (Western tarsiers), and predates further tarsier diversification on Sulawesi by around 7 Myr. The date coincides with the deepening of the marine environment between eastern and western Sulawesi, which likely led to allopatric speciation between T. pumilus or its predecessor in the west and the ancestor of all other Sulawesi tarsiers in the east. As the split preceded the emergence of permanent mountains in western Sulawesi, it is unlikely that the shift to montane habitat has driven the formation of the T. pumilus lineage.
Affiliation(s)
- Laura Hagemann
  - Department of Zoology, State Museum of Natural History Stuttgart, 70191 Stuttgart, Germany
- Nanda Grow
  - Department of Anthropology, Washington State University, Pullman, WA 99164‐4910, USA
- Yvonne E.-M. B. Bohr
  - Institute of Ecology, Evolution and Diversity, Johann Wolfgang Goethe-Universität Frankfurt, 60438 Frankfurt am Main, Germany
  - Department of Biology, Universität Hamburg, 20146 Hamburg, Germany
- Dyah Perwitasari-Farajallah
  - Primate Research Center, IPB University, Bogor 16151, Indonesia
  - Department of Biology, Faculty of Mathematics and Natural Sciences, IPB University, Bogor 16151, Indonesia
- Yulius Duma
  - Faculty of Animal Husbandry and Fisheries, Universitas Tadulako Palu, 94148, Palu, Central Sulawesi, Indonesia
- Sharon L. Gursky
  - Department of Anthropology, Texas A&M University, College Station, TX 77843‐4352, USA
- Stefan Merker
  - Department of Zoology, State Museum of Natural History Stuttgart, 70191 Stuttgart, Germany
|
49
|
Hibbins MS, Hahn MW. Phylogenomic approaches to detecting and characterizing introgression. Genetics 2022; 220:iyab173. [PMID: 34788444 PMCID: PMC9208645 DOI: 10.1093/genetics/iyab173] [Citation(s) in RCA: 78] [Impact Index Per Article: 26.0] [Received: 07/16/2021] [Accepted: 10/02/2021] [Indexed: 12/26/2022] Open
Abstract
Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression, and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
Affiliation(s)
- Mark S Hibbins
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
50
|
Suvorov A, Kim BY, Wang J, Armstrong EE, Peede D, D'Agostino ERR, Price DK, Waddell P, Lang M, Courtier-Orgogozo V, David JR, Petrov D, Matute DR, Schrider DR, Comeault AA. Widespread introgression across a phylogeny of 155 Drosophila genomes. Curr Biol 2022; 32:111-123.e5. [PMID: 34788634 PMCID: PMC8752469 DOI: 10.1016/j.cub.2021.10.052] [Citation(s) in RCA: 129] [Impact Index Per Article: 43.0] [Received: 07/30/2021] [Revised: 09/29/2021] [Accepted: 10/22/2021] [Indexed: 01/12/2023]
Abstract
Genome-scale sequence data have invigorated the study of hybridization and introgression, particularly in animals. However, outside of a few notable cases, we lack systematic tests for introgression at a larger phylogenetic scale across entire clades. Here, we leverage 155 genome assemblies from 149 species to generate a fossil-calibrated phylogeny and conduct multilocus tests for introgression across 9 monophyletic radiations within the genus Drosophila. Using complementary phylogenomic approaches, we identify widespread introgression across the evolutionary history of Drosophila. Mapping gene-tree discordance onto the phylogeny revealed that both ancient and recent introgression has occurred across most of the 9 clades that we examined. Our results provide the first evidence of introgression occurring across the evolutionary history of Drosophila and highlight the need to continue to study the evolutionary consequences of hybridization and introgression in this genus and across the tree of life.
Affiliation(s)
- Anton Suvorov
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Bernard Y Kim
  - Department of Biology, Stanford University, Stanford, CA, USA
- Jeremy Wang
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- David Peede
  - Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Donald K Price
  - School of Life Sciences, University of Nevada, Las Vegas, NV 89119, USA
- Peter Waddell
  - School of Fundamental Sciences, Massey University, Palmerston North 4442, New Zealand
- Michael Lang
  - CNRS, Institut Jacques Monod, Université de Paris, Paris 75013, France
- Jean R David
  - Laboratoire Evolution, Génomes, Comportement, Ecologie (EGCE) CNRS, IRD, Univ. Paris-sud, Université Paris-Saclay, Gif sur Yvette 91190, France
  - Institut de Systématique, Evolution, Biodiversité, CNRS, MNHN, UPMC, EPHE, Muséum National d'Histoire Naturelle, Sorbonne Universités, Paris 75005, France
- Dmitri Petrov
  - Department of Biology, Stanford University, Stanford, CA, USA
- Daniel R Matute
  - Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Aaron A Comeault
  - Molecular Ecology & Evolution Group, School of Natural Sciences, Bangor University, Bangor, Gwynedd LL57 2DGA, UK
|