1
Gwee CY, Metzler D, Fuchs J, Wolf JBW. Reconciling Gene Tree Discordance and Biogeography in European Crows. Mol Ecol 2025; 34:e17764. [PMID: 40208017 PMCID: PMC12051742 DOI: 10.1111/mec.17764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Revised: 03/20/2025] [Accepted: 03/31/2025] [Indexed: 04/11/2025]
Abstract
Reconstructing the evolutionary history of young lineages diverging with gene flow is challenging due to factors like incomplete lineage sorting, introgression, and selection causing gene tree discordance. The European crow hybrid zone between all-black carrion crows and grey-coated hooded crows exemplifies this challenge. Most of the genome in Western and Central European carrion crow populations is near-identical to hooded crows, but differs substantially from their Iberian congeners. A notable exception is a single major-effect colour-locus under sexual selection aligning with the 'species' tree. To understand the underlying evolutionary processes, we reconstructed the biogeographic history of the species complex. During the Pleistocene, carrion and hooded crows took refuge in the Iberian Peninsula and the Middle East, respectively. Allele-sharing of all-black Western European populations with likewise black Iberian crows at the colour-locus represents the last trace of carrion crow ancestry, resisting gene flow from expanding hooded crow populations that have homogenised most of the genome. A model of colour-locus introgression from an Iberian ancestor into hooded crow populations near the Pyrenees was significantly less supported. We found no positive relationship between introgression and recombination rate, consistent with the absence of genome-wide, polygenic barriers in this young species complex. Overall, this study portrays a scenario where few large-effect loci, subject to divergent sexual selection, resist rampant and asymmetric gene exchange. This study underscores the importance of integrating population demography and biogeography to accurately interpret patterns of gene tree discordance following population divergence.
Affiliation(s)
- Chyi Yin Gwee
- Division of Evolutionary Biology, LMU Munich, Planegg‐Martinsried, Germany
- Microevolution and Biodiversity, Max Planck Institute for Biological Intelligence, Seewiesen, Germany
| | - Dirk Metzler
- Division of Evolutionary Biology, LMU Munich, Planegg‐Martinsried, Germany
| | - Jérôme Fuchs
- Institut de Systématique, Evolution, Biodiversité (ISYEB), CNRS, SU, EPHE, UA, Muséum National d'Histoire Naturelle, Paris, France
| | - Jochen B. W. Wolf
- Division of Evolutionary Biology, LMU Munich, Planegg‐Martinsried, Germany
- Microevolution and Biodiversity, Max Planck Institute for Biological Intelligence, Seewiesen, Germany
2
Adams R, Lozano JR, Duncan M, Green J, Assis R, DeGiorgio M. A Tale of Too Many Trees: A Conundrum for Phylogenetic Regression. Mol Biol Evol 2025; 42:msaf032. [PMID: 39930867 PMCID: PMC11884811 DOI: 10.1093/molbev/msaf032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Revised: 12/20/2024] [Accepted: 01/21/2025] [Indexed: 03/08/2025] Open
Abstract
Just exactly which tree(s) should we assume when testing evolutionary hypotheses? This question has plagued comparative biologists for decades. Though all phylogenetic comparative methods require input trees, we seldom know with certainty whether even a perfectly estimated tree (if this is possible in practice) is appropriate for our studied traits. Yet, we also know that phylogenetic conflict is ubiquitous in modern comparative biology, and we are still learning about its dangers when testing evolutionary hypotheses. Here, we investigate the consequences of tree-trait mismatch for phylogenetic regression in the presence of gene tree-species tree conflict. Our simulation experiments reveal excessively high false positive rates for mismatched models with both small and large trees, simple and complex traits, and known and estimated phylogenies. In some cases, we find evidence of a directionality of error: assuming a species tree for traits that evolved according to a gene tree sometimes fares worse than the opposite. We also explored the impacts of tree choice using an expansive, cross-species gene expression dataset as an arguably "best-case" scenario in which one may have a better chance of matching tree with trait. Offering a potential path forward, we found promise in the application of a robust estimator as a potential, albeit imperfect, solution to some issues raised by tree mismatch. Collectively, our results emphasize the importance of careful study design for comparative methods, highlighting the need to fully appreciate the role of accurate and thoughtful phylogenetic modeling.
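To make the idea of tree-trait mismatch concrete, the following is a minimal sketch (not the authors' code) of how the assumed tree fixes the trait covariance that a phylogenetic regression relies on. It assumes a Brownian-motion model, under which the covariance between two tips equals the branch length they share from the root. The tree encoding and the function name `bm_covariance` are hypothetical choices made for this illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: a node is (label_or_children, branch_length).
# A leaf is ("A", 1.0); an internal node is ([child, child, ...], 1.0).
Node = Tuple[Union[str, List["Node"]], float]


def bm_covariance(root: Node) -> Dict[Tuple[str, str], float]:
    """Brownian-motion covariance between tips: shared root-to-ancestor path length."""
    cov: Dict[Tuple[str, str], float] = {}

    def visit(node: Node, parent_depth: float) -> List[str]:
        label_or_children, length = node
        depth = parent_depth + length
        if isinstance(label_or_children, str):
            # Variance of a tip is its total distance from the root.
            cov[(label_or_children, label_or_children)] = depth
            return [label_or_children]
        groups = [visit(child, depth) for child in label_or_children]
        # Tips in different child subtrees share history only down to this node.
        for g1, g2 in combinations(groups, 2):
            for a in g1:
                for b in g2:
                    cov[(a, b)] = cov[(b, a)] = depth
        return [leaf for group in groups for leaf in group]

    visit(root, 0.0)
    return cov


# Species tree ((A:1,B:1):1,C:2) versus a discordant gene tree ((B:1,C:1):1,A:2).
species_tree: Node = ([([("A", 1.0), ("B", 1.0)], 1.0), ("C", 2.0)], 0.0)
gene_tree: Node = ([([("B", 1.0), ("C", 1.0)], 1.0), ("A", 2.0)], 0.0)

species_cov = bm_covariance(species_tree)
gene_cov = bm_covariance(gene_tree)
```

In this toy case the species tree implies that A and B covary while the gene tree implies that B and C do, so a regression that assumes one tree for a trait that evolved along the other is working with the wrong error structure.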
Affiliation(s)
- Richard Adams
- Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR, USA
- Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
| | - Jenniffer Roa Lozano
- Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR, USA
- Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
| | - Mataya Duncan
- Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR, USA
- Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
| | - Jack Green
- Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR, USA
- Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
| | - Raquel Assis
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA
- Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL, USA
| | - Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA
3
Thomas GWC, Hughes JJ, Kumon T, Berv JS, Nordgren CE, Lampson M, Levine M, Searle JB, Good JM. The Genomic Landscape, Causes, and Consequences of Extensive Phylogenomic Discordance in Murine Rodents. Genome Biol Evol 2025; 17:evaf017. [PMID: 39903560 PMCID: PMC11837218 DOI: 10.1093/gbe/evaf017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Revised: 01/08/2025] [Accepted: 01/23/2025] [Indexed: 02/06/2025] Open
Abstract
A species tree is a central concept in evolutionary biology whereby a single branching phylogeny reflects relationships among species. However, the phylogenies of different genomic regions often differ from the species tree. Although tree discordance is widespread in phylogenomic studies, we still lack a clear understanding of how variation in phylogenetic patterns is shaped by genome biology or the extent to which discordance may compromise comparative studies. We characterized patterns of phylogenomic discordance across the murine rodents, a large and ecologically diverse group that gave rise to the laboratory mouse and rat model systems. Combining recently published linked-read genome assemblies for seven murine species with other available rodent genomes, we first used ultraconserved elements (UCEs) to infer a robust time-calibrated species tree. We then used whole genomes to examine finer-scale patterns of discordance across ∼12 million years of divergence. We found that proximate chromosomal regions tended to have more similar phylogenetic histories. There was no clear relationship between local tree similarity and recombination rates in house mice, but we did observe a correlation between recombination rates and average similarity to the species tree. We also detected a strong influence of linked selection whereby purifying selection at UCEs led to appreciably less discordance. Finally, we show that assuming a single species tree can result in substantial deviation from the results with gene trees when testing for positive selection under different models. Collectively, our results highlight the complex relationship between phylogenetic inference and genome biology and underscore how failure to account for this complexity can mislead comparative genomic studies.
Affiliation(s)
- Gregg W C Thomas
- Division of Biological Sciences, University of Montana, Missoula, MT 59801, USA
- Informatics Group, Harvard University, Cambridge, MA 02138, USA
| | - Jonathan J Hughes
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
- Department of Evolution, Ecology, and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
| | - Tomohiro Kumon
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Jacob S Berv
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
| | - C Erik Nordgren
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Michael Lampson
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Mia Levine
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Jeremy B Searle
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
| | - Jeffrey M Good
- Division of Biological Sciences, University of Montana, Missoula, MT 59801, USA
4
Wilhoit K, Yamanouchi S, Chen BJ, Yamasaki YY, Ishikawa A, Inoue J, Iwasaki W, Kitano J. Convergent Evolution and Predictability of Gene Copy Numbers Associated with Diets in Mammals. Genome Biol Evol 2025; 17:evaf008. [PMID: 39849899 PMCID: PMC11797053 DOI: 10.1093/gbe/evaf008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 01/09/2025] [Accepted: 01/15/2025] [Indexed: 01/25/2025] Open
Abstract
Convergent evolution, the evolution of the same or similar phenotypes in phylogenetically independent lineages, is a widespread phenomenon in nature. If the genetic basis for convergent evolution is predictable to some extent, it may be possible to infer organismic phenotypes and the capability of organisms to utilize new ecological resources based on genome sequence data. While repeated amino acid changes have been studied in association with convergent evolution, relatively little is known about the potential contribution of repeated gene copy number changes. In this study, we explore whether gene copy number changes of particular gene families are linked to diet shifts in mammals and assess whether trophic ecology can be inferred from the copy numbers of a specific set of gene families. Using 86 mammalian genome sequences, we identified 24 gene families with a trend toward higher copy numbers in herbivores, carnivores, and omnivores, even after phylogenetic corrections. We were able to confirm previous findings on genes such as amylase, olfactory receptors, and xenobiotic metabolism genes, and identify novel gene families whose copy numbers correlate with dietary patterns. For example, omnivores exhibited higher copy numbers of genes encoding regulators of translation. We also established a discriminant function based on the copy numbers of 13 gene families that can help predict trophic ecology to some extent. These findings highlight a possible association between convergent evolution and repeated copy number changes in specific gene families, suggesting the potential to develop a method for predicting animal ecology from genome sequence data.
Affiliation(s)
- Kayla Wilhoit
- Ecological Genetics Laboratory, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Biomedical Sciences Program, Texas A&M University, College Station, TX, USA
- University Program in Genetics and Genomics, Duke University, Durham, NC, USA
| | - Shun Yamanouchi
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0032, Japan
| | - Bo-Jyun Chen
- Ecological Genetics Laboratory, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Genetics Course, The Graduate University for Advanced Studies, Mishima, Shizuoka 411-8540, Japan
| | - Yo Y Yamasaki
- Ecological Genetics Laboratory, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Genetics Course, The Graduate University for Advanced Studies, Mishima, Shizuoka 411-8540, Japan
| | - Asano Ishikawa
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-0882, Japan
| | - Jun Inoue
- Atmosphere and Ocean Research Institute, The University of Tokyo, Kashiwa, Chiba 277-0882, Japan
| | - Wataru Iwasaki
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0032, Japan
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-0882, Japan
- Atmosphere and Ocean Research Institute, The University of Tokyo, Kashiwa, Chiba 277-0882, Japan
| | - Jun Kitano
- Ecological Genetics Laboratory, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Genetics Course, The Graduate University for Advanced Studies, Mishima, Shizuoka 411-8540, Japan
5
Witharana EP, Iwasaki T, San MH, Jayawardana NU, Kotoda N, Yamamoto M, Nagano Y. Subfamily evolution analysis using nuclear and chloroplast data from the same reads. Sci Rep 2025; 15:687. [PMID: 39753617 PMCID: PMC11698846 DOI: 10.1038/s41598-024-83292-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Accepted: 12/13/2024] [Indexed: 01/06/2025] Open
Abstract
The chloroplast (cp) genome is a widely used tool for exploring plant evolutionary relationships, yet its effectiveness in fully resolving these relationships remains uncertain. Integrating cp genome data with nuclear DNA information offers a more comprehensive view but often requires separate datasets. In response, we employed the same raw read sequencing data to construct cp genome-based trees and nuclear DNA phylogenetic trees using Read2Tree, a cost-efficient method for extracting conserved nuclear gene sequences from raw read data, focusing on the Aurantioideae subfamily, which includes Citrus and its relatives. The resulting nuclear DNA trees were consistent with existing nuclear evolutionary relationships derived from high-throughput sequencing, but diverged from cp genome-based trees. To elucidate the underlying complex evolutionary processes causing these discordances, we implemented an integrative workflow that utilized multiple alignments of each gene generated by Read2Tree, in conjunction with other phylogenomic methods. Our analysis revealed that incomplete lineage sorting predominantly drives these discordances, while introgression and ancient introgression also contribute to topological discrepancies within certain clades. This study underscores the cost-effectiveness of using the same raw sequencing data for both cp and nuclear DNA analyses in understanding plant evolutionary relationships.
Affiliation(s)
- Eranga Pawani Witharana
- Faculty of Agriculture, University of Peradeniya, Peradeniya, Sri Lanka
- Analytical Research Center for Experimental Sciences, Saga University, Saga, Japan
- Graduate School of Advanced Health Science, Saga University, Saga, Japan
| | | | - Myat Htoo San
- Analytical Research Center for Experimental Sciences, Saga University, Saga, Japan
- The United Graduate School of Agricultural Sciences, Kagoshima University, Kagoshima, Japan
| | - Nadeeka U Jayawardana
- Faculty of Agriculture, University of Peradeniya, Peradeniya, Sri Lanka
- Applied BioSciences, Macquarie University, 205B, Culloden Road, Sydney, NSW, Australia
| | - Nobuhiro Kotoda
- Graduate School of Advanced Health Science, Saga University, Saga, Japan
- The United Graduate School of Agricultural Sciences, Kagoshima University, Kagoshima, Japan
- Faculty of Agriculture, Saga University, Saga, Japan
| | - Masashi Yamamoto
- The United Graduate School of Agricultural Sciences, Kagoshima University, Kagoshima, Japan
- Faculty of Agriculture, Kagoshima University, Kagoshima, Japan
| | - Yukio Nagano
- Analytical Research Center for Experimental Sciences, Saga University, Saga, Japan
- Graduate School of Advanced Health Science, Saga University, Saga, Japan
- The United Graduate School of Agricultural Sciences, Kagoshima University, Kagoshima, Japan
6
Lanfear R, Hahn MW. The Meaning and Measure of Concordance Factors in Phylogenomics. Mol Biol Evol 2024; 41:msae214. [PMID: 39418118 PMCID: PMC11532913 DOI: 10.1093/molbev/msae214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 09/25/2024] [Accepted: 10/04/2024] [Indexed: 10/19/2024] Open
Abstract
As phylogenomic datasets have grown in size, researchers have developed new ways to measure biological variation and to assess statistical support for specific branches. Larger datasets have more sites and loci and therefore less sampling variance. While we can more accurately measure the mean signal in these datasets, lower sampling variance is often reflected in uniformly high measures of branch support (such as the bootstrap and posterior probability), limiting their utility. Larger datasets have also revealed substantial biological variation in the topologies found across individual loci, such that the single species tree inferred by most phylogenetic methods represents a limited summary of the data for many purposes. In contrast to measures of statistical support, the degree of underlying topological variation among loci should be approximately constant regardless of the size of the dataset. "Concordance factors" (CFs) and similar statistics have therefore become increasingly important tools in phylogenetics. In this review, we explain why CFs should be thought of as descriptors of topological variation rather than as measures of statistical support, and argue that they provide important information about the predictive power of the species tree not contained in measures of support. We review a growing suite of statistics for measuring concordance, compare them in a common framework that reveals their interrelationships, and demonstrate how to calculate them using an example from birds. We also discuss how measures of topological variation might change in the future as we move beyond estimating a single "tree of life" toward estimating the myriad evolutionary histories underlying genomic variation.
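As a rough illustration of a concordance factor as a descriptor of topological variation, the sketch below computes, for one clade of a reference tree, the fraction of gene trees that contain that clade. It is a simplified, rooted version written for this listing, not the procedure from the review: it assumes every gene tree is rooted and samples all taxa, and the nested-tuple tree encoding and function names are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the taxon set below every internal node of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        taxa = frozenset().union(*(visit(child) for child in node))
        found.add(taxa)
        return taxa

    visit(tree)
    return found


def gene_concordance_factor(clade: Iterable[str], gene_trees: List[Tree]) -> float:
    """Fraction of gene trees that contain the given clade."""
    target = frozenset(clade)
    hits = sum(1 for tree in gene_trees if target in clades(tree))
    return hits / len(gene_trees)


gene_trees: List[Tree] = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]

gcf_ab = gene_concordance_factor({"A", "B"}, gene_trees)  # 3 of 4 trees contain (A,B)
gcf_cd = gene_concordance_factor({"C", "D"}, gene_trees)  # 2 of 4 trees contain (C,D)
```

Adding more gene trees drawn from the same process would tighten the estimate but would not push these fractions toward 1, which is the sense in which such a statistic describes variation rather than support.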
Affiliation(s)
- Robert Lanfear
- Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australia
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN, USA
- Department of Computer Science, Indiana University, Bloomington, IN, USA
7
Schraiber JG, Edge MD, Pennell M. Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations. PLoS Biol 2024; 22:e3002847. [PMID: 39383205 PMCID: PMC11493298 DOI: 10.1371/journal.pbio.3002847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 10/21/2024] [Accepted: 09/17/2024] [Indexed: 10/11/2024] Open
Abstract
In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these 2 fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we lay out a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., genome-wide association studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur analytically and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique, including both the genetic relatedness matrix (GRM) and its leading eigenvectors (corresponding to the principal components of the genotype matrix) in a regression model, can mitigate spurious correlations in phylogenetic analyses.
As a case study, we re-examine an analysis testing for coevolution of expression levels between genes across a fungal phylogeny and show that including eigenvectors of the covariance matrix as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.
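The eigenvector covariates mentioned in this abstract can be illustrated with a small sketch. The code below extracts the leading eigenvector of a toy three-species covariance matrix using power iteration, which is a stand-in chosen here to keep the example dependency-free and is not the authors' implementation. The matrix values and the function name `leading_eigenvector` are assumptions made for illustration; in practice the resulting vector would be appended as a covariate column in the regression.

```python
import math
from typing import List


def leading_eigenvector(matrix: List[List[float]], iterations: int = 200) -> List[float]:
    """Approximate the unit-length leading eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    vector = [1.0] * n  # all-ones start; not orthogonal to the leading eigenvector
                        # when the matrix has non-negative entries
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    return vector


# Toy covariance for three species in which A and B share half of their history
# and C is independent of both (values are illustrative only).
covariance = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

pc1 = leading_eigenvector(covariance)
```

Here the leading eigenvector loads equally on A and B and essentially not at all on C, so it captures the deepest split in the toy tree, which is the kind of structure such a covariate is meant to absorb.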
Affiliation(s)
- Joshua G. Schraiber
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, United States of America
| | - Michael D. Edge
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, United States of America
| | - Matt Pennell
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
8
Rurik I, Melichárková A, Gbúrová Štubová E, Kučera J, Kochjarová J, Paun O, Vďačný P, Slovák M. Homoplastic versus xenoplastic evolution: exploring the emergence of key intrinsic and extrinsic traits in the montane genus Soldanella (Primulaceae). Plant J 2024; 118:753-765. [PMID: 38217489 DOI: 10.1111/tpj.16630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 12/02/2023] [Accepted: 12/27/2023] [Indexed: 01/15/2024]
Abstract
Specific ecological conditions in the high mountain environment exert a selective pressure that often leads to convergent trait evolution. Reticulations induced by incomplete lineage sorting and introgression can lead to discordant trait patterns among gene and species trees (hemiplasy/xenoplasy), providing a false illusion that the traits under study are homoplastic. Using phylogenetic species networks, we explored the effect of gene exchange on trait evolution in Soldanella, a genus profoundly influenced by historical introgression. At least three features evolved independently multiple times: the single-flowered dwarf phenotype, dysploid cytotype, and ecological generalism. The present analyses also indicated that the recurring occurrence of stoloniferous growth might have been prompted by an introgression event between an ancestral lineage and a still extant species, although its emergence via convergent evolution cannot be completely ruled out. Phylogenetic regression suggested that the independent evolution of larger genomes in snowbells is most likely a result of the interplay between hybridization events of dysploid and euploid taxa and hostile environments at the range margins of the genus. The emergence of key intrinsic and extrinsic traits in snowbells has been significantly impacted not only by convergent evolution but also by historical and recent introgression events.
Affiliation(s)
- Ivan Rurik
- Department of Zoology, Comenius University Bratislava, Ilkovičova 6, 842 15, Bratislava, Slovak Republic
| | - Andrea Melichárková
- Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
| | - Eliška Gbúrová Štubová
- Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
- Slovak National Museum, Natural History Museum, Vajanského nábrežie 2, 810 06, Bratislava, Slovak Republic
| | - Jaromír Kučera
- Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
| | - Judita Kochjarová
- Department of Phytology, Faculty of Forestry, Technical University Zvolen, Masarykova 24, 960 53, Zvolen, Slovak Republic
| | - Ovidiu Paun
- Department of Botany and Biodiversity Research, University of Vienna, Rennweg 14, 1030, Vienna, Austria
| | - Peter Vďačný
- Department of Zoology, Comenius University Bratislava, Ilkovičova 6, 842 15, Bratislava, Slovak Republic
| | - Marek Slovák
- Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
- Department of Botany, Charles University, Benátská 2, 128 01, Prague, Czech Republic
9
Jin M, Wang H, Liu G, Lu J, Yuan Z, Li T, Liu E, Lu Z, Du L, Wei C. Whole-genome resequencing of Chinese indigenous sheep provides insight into the genetic basis underlying climate adaptation. Genet Sel Evol 2024; 56:26. [PMID: 38565986 PMCID: PMC10988870 DOI: 10.1186/s12711-024-00880-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 01/31/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND: Chinese indigenous sheep are valuable resources with unique features and characteristics. They are distributed across regions with different climates in mainland China; however, few reports have analyzed the environmental adaptability of sheep based on their genome. We examined the variants and signatures of selection involved in adaptation to extreme humidity, altitude, and temperature conditions in 173 sheep genomes from 41 phenotypically and geographically representative Chinese indigenous sheep breeds to characterize the genetic basis underlying environmental adaptation in these populations.
RESULTS: Based on the analysis of population structure, we inferred that Chinese indigenous sheep are divided into four groups: Kazakh (KAZ), Mongolian (MON), Tibetan (TIB), and Yunnan (YUN). We also detected a set of candidate genes that are relevant to adaptation to extreme environmental conditions, such as drought-prone regions (TBXT, TG, and HOXA1), high-altitude regions (DYSF, EPAS1, JAZF1, PDGFD, and NF1) and warm-temperature regions (TSHR, ABCD4, and TEX11). Among all these candidate genes, eight (ABCD4, CNTN4, DOCK10, LOC105608545, LOC121816479, SEM3A, SVIL, and TSHR) overlap between extreme environmental conditions. The TSHR gene shows a strong signature for positive selection in the warm-temperature group and harbors a single nucleotide polymorphism (SNP) missense mutation located between positions 90,600,001 and 90,650,001 on chromosome 7, which leads to a change in the protein structure of TSHR and influences its stability.
CONCLUSIONS: Analysis of the signatures of selection uncovered genes that are likely related to environmental adaptation and a SNP missense mutation in the TSHR gene that affects the protein structure and stability. It also provides information on the evolution of the phylogeographic structure of Chinese indigenous sheep populations. These results provide important genetic resources for future breeding studies and new perspectives on how animals can adapt to climate change.
Affiliation(s)
- Meilin Jin
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Huihua Wang
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Gang Liu
- National Animal Husbandry Service, National Center of Preservation and Utilization of Animal Genetic Resources, Beijing, China
| | - Jian Lu
- National Animal Husbandry Service, National Center of Preservation and Utilization of Animal Genetic Resources, Beijing, China
| | - Zehu Yuan
- College of Animal Science and Technology, Yangzhou University, Yangzhou, China
| | - Taotao Li
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Engming Liu
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Zengkui Lu
- Lanzhou Institute of Husbandry and Pharmaceutical Sciences, Chinese Academy of Agricultural Sciences, Lan-Zhou, China
| | - Lixin Du
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Caihong Wei
- Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
10
Schraiber JG, Edge MD, Pennell M. Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations. bioRxiv 2024:2024.02.10.579721. [PMID: 38496530 PMCID: PMC10942266 DOI: 10.1101/2024.02.10.579721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we derive a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., Genome-Wide Association Studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur using analytical theory and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique, including both the genetic relatedness matrix (GRM) and its leading eigenvectors (corresponding to the principal components of the genotype matrix) in a regression model, can mitigate spurious correlations in phylogenetic analyses.
As a case study of this, we re-examine an analysis testing for co-evolution of expression levels between genes across a fungal phylogeny, and show that including covariance matrix eigenvectors as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.
|
11
|
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. Genome Biol Evol 2023; 15:evad211. [PMID: 38000902 PMCID: PMC10709115 DOI: 10.1093/gbe/evad211] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/08/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023] Open
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
|
12
|
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.09.527893. [PMID: 37645857 PMCID: PMC10461906 DOI: 10.1101/2023.02.09.527893] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/31/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well-described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred model for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Canada
  - Department of Genetics, Washington University School of Medicine, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, USA
  - Department of Biological Sciences, University of Southern California, USA
|