1
Changes in life history and population size can explain the relative neutral diversity levels on X and autosomes in extant human populations. Proc Natl Acad Sci U S A 2020; 117:20063-20069. [PMID: 32747577 DOI: 10.1073/pnas.1915664117] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/17/2023]
Abstract
In human populations, the relative levels of neutral diversity on the X and autosomes differ markedly from each other and from the naïve theoretical expectation of 3/4. Here we propose an explanation for these differences based on new theory about the effects of sex-specific life history and given pedigree-based estimates of the dependence of human mutation rates on sex and age. We demonstrate that life history effects, particularly longer generation times in males than in females, are expected to have had multiple effects on human X-to-autosome (X:A) diversity ratios, as a result of male-biased mutation rates, the equilibrium X:A ratio of effective population sizes, and the differential responses to changes in population size. We also show that the standard approach of using divergence between species to correct for male mutation bias results in biased estimates of X:A effective population size ratios. We obtain alternative estimates using pedigree-based estimates of the male mutation bias, which reveal that X:A ratios of effective population sizes are considerably greater than previously appreciated. Finally, we find that the joint effects of historical changes in life history and population size can explain the observed X:A diversity ratios in extant human populations. Our results suggest that ancestral human populations were highly polygynous, that non-African populations experienced a substantial reduction in polygyny and/or increase in the male-to-female ratio of generation times around the Out-of-Africa bottleneck, and that current diversity levels were affected by fairly recent changes in sex-specific life history.
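As a point of reference for the naïve 3/4 expectation mentioned in this abstract, the classical textbook formulas for effective population size with unequal numbers of breeding males and females can be sketched as below. This is only an illustration under simplifying assumptions (discrete generations, equal generation times and mutation rates in both sexes, no extra variance in reproductive success); it is not the life-history theory developed in the paper. The function name `xa_ne_ratio` and its parameters are hypothetical.

```python
def xa_ne_ratio(n_males: float, n_females: float) -> float:
    """Ratio of X-linked to autosomal effective population size.

    Uses the classical formulas (an assumption of this sketch, not the
    paper's model):
        N_eA = 4 * Nm * Nf / (Nm + Nf)
        N_eX = 9 * Nm * Nf / (4 * Nm + 2 * Nf)
    where Nm and Nf are the numbers of breeding males and females.
    """
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes need a positive number of breeders")
    ne_auto = 4 * n_males * n_females / (n_males + n_females)
    ne_x = 9 * n_males * n_females / (4 * n_males + 2 * n_females)
    return ne_x / ne_auto


if __name__ == "__main__":
    # Equal numbers of breeding males and females give the naive 3/4.
    print(xa_ne_ratio(100, 100))
    # Fewer breeding males than females (e.g. polygyny) raises the ratio.
    print(xa_ne_ratio(10, 100))
    # Fewer breeding females than males lowers it.
    print(xa_ne_ratio(100, 10))
```

Under these formulas the ratio equals 3/4 only when the numbers of breeding males and females are equal; it approaches 9/8 as breeding males become rare and 9/16 as breeding females become rare. This is one reason departures from 3/4 are often read as a signal of sex-biased processes such as polygyny.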
2
Zhang Y, Zhou Y, Liu X, Yu H, Li D, Zhang Y. Genetic diversity of the Sichuan snub-nosed monkey (Rhinopithecus roxellana) in Shennongjia National Park, China using RAD-seq analyses. Genetica 2019; 147:327-335. [DOI: 10.1007/s10709-019-00073-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/14/2019] [Accepted: 07/12/2019] [Indexed: 12/30/2022]
3
Zeng K, Jackson BC, Barton HJ. Methods for Estimating Demography and Detecting Between-Locus Differences in the Effective Population Size and Mutation Rate. Mol Biol Evol 2019; 36:423-433. [PMID: 30428070 PMCID: PMC6409433 DOI: 10.1093/molbev/msy212] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/16/2022]
Abstract
It is known that the effective population size (Ne) and the mutation rate (u) vary across the genome. Here, we show that ignoring this heterogeneity may lead to biased estimates of past demography. To solve the problem, we develop new methods for jointly inferring past changes in population size and detecting variation in Ne and u between loci. These methods rely on either polymorphism data alone or both polymorphism and divergence data. In addition to inferring demography, we can use the methods to study a variety of questions: 1) comparing sex chromosomes with autosomes (for finding evidence for male-driven evolution, an unequal sex ratio, or sex-biased demographic changes) and 2) analyzing multilocus data from within autosomes or sex chromosomes (for studying determinants of variability in Ne and u). Simulations suggest that the methods can provide accurate parameter estimates and have substantial statistical power for detecting difference in Ne and u. As an example, we use the methods to analyze a polymorphism data set from Drosophila simulans. We find clear evidence for rapid population expansion. The results also indicate that the autosomes have a higher mutation rate than the X chromosome and that the sex ratio is probably female-biased. The new methods have been implemented in a user-friendly package.
Affiliation(s)
- Kai Zeng
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
- Benjamin C Jackson
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Henry J Barton
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
4
Chen ZH, Zhang M, Lv FH, Ren X, Li WR, Liu MJ, Nam K, Bruford MW, Li MH. Contrasting Patterns of Genomic Diversity Reveal Accelerated Genetic Drift but Reduced Directional Selection on X-Chromosome in Wild and Domestic Sheep Species. Genome Biol Evol 2018; 10:1282-1297. [PMID: 29790980 PMCID: PMC5963296 DOI: 10.1093/gbe/evy085] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Accepted: 04/19/2018] [Indexed: 01/08/2023]
Abstract
Analyses of genomic diversity along the X chromosome and of its correlation with autosomal diversity can facilitate understanding of evolutionary forces in shaping sex-linked genomic architecture. Strong selective sweeps and accelerated genetic drift on the X-chromosome have been inferred in primates and other model species, but no such insight has yet been gained in domestic animals compared with their wild relatives. Here, we analyzed X-chromosome variability in a large ovine data set, including a BeadChip array for 943 ewes from the world’s sheep populations and 110 whole genomes of wild and domestic sheep. Analyzing whole-genome sequences, we observed a substantially reduced X-to-autosome diversity ratio (∼0.6) compared with the value expected under a neutral model (0.75). In particular, one large X-linked segment (43.05–79.25 Mb) was found to show extremely low diversity, most likely due to a high density of coding genes, featuring highly conserved regions. In general, we observed higher nucleotide diversity on the autosomes, but a flat diversity gradient in X-linked segments, as a function of increasing distance from the nearest genes, leading to a decreased X: autosome (X/A) diversity ratio and contrasting to the positive correlation detected in primates and other model animals. Our evidence suggests that accelerated genetic drift but reduced directional selection on X chromosome, as well as sex-biased demographic events, explain low X-chromosome diversity in sheep species. The distinct patterns of X-linked and X/A diversity we observed between Middle Eastern and non-Middle Eastern sheep populations can be explained by multiple migrations, selection, and admixture during the domestic sheep’s recent postdomestication demographic expansion, coupled with natural selection for adaptation to new environments. 
In addition, we identify important novel genes involved in abnormal behavioral phenotypes, metabolism, and immunity, under selection on the sheep X-chromosome.
Affiliation(s)
- Ze-Hui Chen
  - CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing, China
  - College of Life Sciences, University of the Academy of Sciences, Beijing 100049, China
- Min Zhang
  - CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing, China
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Feng-Hua Lv
  - CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing, China
- Xue Ren
  - CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing, China
- Wen-Rong Li
  - Animal Biotechnological Research Center, Xinjiang Academy of Animal Science, Urumqi, China
- Ming-Jun Liu
  - Animal Biotechnological Research Center, Xinjiang Academy of Animal Science, Urumqi, China
- Kiwoong Nam
  - Diversité, Génomes et Interactions Microorganismes Insectes, Institut National de la Recherche Agronomique, University of Montpellier, Montpellier, France
- Michael W Bruford
  - Organisms and Environment Division, School of Biosciences and Sustainable Places Research Institute, Cardiff University, Wales, United Kingdom
- Meng-Hua Li
  - CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing, China
5
Abstract
Levels and patterns of genetic diversity can provide insights into a population’s history. In species with sex chromosomes, differences between genomic regions with unique inheritance patterns can be used to distinguish between different sets of possible demographic and selective events. This review introduces the differences in population history for sex chromosomes and autosomes, provides the expectations for genetic diversity across the genome under different evolutionary scenarios, and gives an introductory description for how deviations in these expectations are calculated and can be interpreted. Predominantly, diversity on the sex chromosomes has been used to explore and address three research areas: 1) Mating patterns and sex-biased variance in reproductive success, 2) signatures of selection, and 3) evidence for modes of speciation and introgression. After introducing the theory, this review catalogs recent studies of genetic diversity on the sex chromosomes across species within the major research areas that sex chromosomes are typically applied to, arguing that there are broad similarities not only between male-heterogametic (XX/XY) and female-heterogametic (ZZ/ZW) sex determination systems but also any mating system with reduced recombination in a sex-determining region. Further, general patterns of reduced diversity in nonrecombining regions are shared across plants and animals. There are unique patterns across populations with vastly different patterns of mating and speciation, but these do not tend to cluster by taxa or sex determination system.
Affiliation(s)
- Melissa A Wilson Sayres
  - School of Life Sciences, Center for Evolution and Medicine, The Biodesign Institute, Arizona State University
6
Shaw RE, Banks SC, Peakall R. The impact of mating systems and dispersal on fine-scale genetic structure at maternally, paternally and biparentally inherited markers. Mol Ecol 2017; 27:66-82. [PMID: 29154412 DOI: 10.1111/mec.14433] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/26/2017] [Revised: 11/06/2017] [Accepted: 11/08/2017] [Indexed: 10/18/2022]
Abstract
For decades, studies have focused on how dispersal and mating systems influence genetic structure across populations or social groups. However, we still lack a thorough understanding of how these processes and their interaction shape spatial genetic patterns over a finer scale (tens-hundreds of metres). Using uniparentally inherited markers may help answer these questions, yet their potential has not been fully explored. Here, we use individual-level simulations to investigate the effects of dispersal and mating system on fine-scale genetic structure at autosomal, mitochondrial and Y chromosome markers. Using genetic spatial autocorrelation analysis, we found that dispersal was the major driver of fine-scale genetic structure across maternally, paternally and biparentally inherited markers. However, when dispersal was restricted (mean distance = 100 m), variation in mating behaviour created strong differences in the comparative level of structure detected at maternally and paternally inherited markers. Promiscuity reduced spatial genetic structure at Y chromosome loci (relative to monogamy), whereas structure increased under polygyny. In contrast, mitochondrial and autosomal markers were robust to differences in the specific mating system, although genetic structure increased across all markers when reproductive success was skewed towards fewer individuals. Comparing males and females at Y chromosome vs. mitochondrial markers, respectively, revealed that some mating systems can generate similar patterns to those expected under sex-biased dispersal. This demonstrates the need for caution when inferring ecological and behavioural processes from genetic results. Comparing patterns between the sexes, across a range of marker types, may help us tease apart the processes shaping fine-scale genetic structure.
Affiliation(s)
- Robyn E Shaw
  - Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT, Australia
  - The Fenner School of Environment and Society, The Australian National University, Canberra, ACT, Australia
- Sam C Banks
  - The Fenner School of Environment and Society, The Australian National University, Canberra, ACT, Australia
- Rod Peakall
  - Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT, Australia
7
Jackson BC, Campos JL, Haddrill PR, Charlesworth B, Zeng K. Variation in the Intensity of Selection on Codon Bias over Time Causes Contrasting Patterns of Base Composition Evolution in Drosophila. Genome Biol Evol 2017; 9:102-123. [PMID: 28082609 PMCID: PMC5381600 DOI: 10.1093/gbe/evw291] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Accepted: 12/07/2016] [Indexed: 12/11/2022]
Abstract
Four-fold degenerate coding sites form a major component of the genome, and are often used to make inferences about selection and demography, so that understanding their evolution is important. Despite previous efforts, many questions regarding the causes of base composition changes at these sites in Drosophila remain unanswered. To shed further light on this issue, we obtained a new whole-genome polymorphism data set from D. simulans. We analyzed samples from the putatively ancestral range of D. simulans, as well as an existing polymorphism data set from an African population of D. melanogaster. By using D. yakuba as an outgroup, we found clear evidence for selection on 4-fold sites along both lineages over a substantial period, with the intensity of selection increasing with GC content. Based on an explicit model of base composition evolution, we suggest that the observed AT-biased substitution pattern in both lineages is probably due to an ancestral reduction in selection intensity, and is unlikely to be the result of an increase in mutational bias towards AT alone. By using two polymorphism-based methods for estimating selection coefficients over different timescales, we show that the selection intensity on codon usage has been rather stable in D. simulans in the recent past, but the long-term estimates in D. melanogaster are much higher than the short-term ones, indicating a continuing decline in selection intensity, to such an extent that the short-term estimates suggest that selection is only active in the most GC-rich parts of the genome. Finally, we provide evidence for complex evolutionary patterns in the putatively neutral short introns, which cannot be explained by the standard GC-biased gene conversion model. These results reveal a dynamic picture of base composition evolution.
Affiliation(s)
- Benjamin C Jackson
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
- José L Campos
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Penelope R Haddrill
  - Centre for Forensic Science, Department of Pure and Applied Chemistry, University of Strathclyde, Glasgow, United Kingdom
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Kai Zeng
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
8
Evans BJ, Tosi AJ, Zeng K, Dushoff J, Corvelo A, Melnick DJ. Speciation over the edge: gene flow among non-human primate species across a formidable biogeographic barrier. Royal Society Open Science 2017; 4:170351. [PMID: 29134059 PMCID: PMC5666242 DOI: 10.1098/rsos.170351] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 04/13/2017] [Accepted: 09/18/2017] [Indexed: 05/30/2023]
Abstract
Many genera of terrestrial vertebrates diversified exclusively on one or the other side of Wallace's Line, which lies between Borneo and Sulawesi islands in Southeast Asia, and demarcates one of the sharpest biogeographic transition zones in the world. Macaque monkeys are unusual among vertebrate genera in that they are distributed on both sides of Wallace's Line, raising the question of whether dispersal across this barrier was an evolutionary one-off or a more protracted exchange and, if the latter, what the genomic consequences were. To explore the nature of speciation over the edge of this biogeographic divide, we used genomic data to test for evidence of gene flow between macaque species across Wallace's Line after macaques colonized Sulawesi. We recovered evidence of post-colonization gene flow, most prominently on the X chromosome. These results are consistent with the proposal that gene flow is a pervasive component of speciation, even when barriers to gene flow seem almost insurmountable.
Affiliation(s)
- Ben J. Evans
  - Biology Department, Life Sciences Building Room 328, McMaster University, 1280 Main Street West, Hamilton, ON, Canada L8S4K1
  - Department of Ecology, Evolution, and Environmental Biology, Columbia University, 10th floor Schermerhorn Extension, 119th Street and Amsterdam Avenue, New York, NY 10027, USA
- Anthony J. Tosi
  - Anthropology Department, Kent State University, 238 Lowry Hall, Kent, OH 44242, USA
- Kai Zeng
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
- Jonathan Dushoff
  - Biology Department, Life Sciences Building Room 328, McMaster University, 1280 Main Street West, Hamilton, ON, Canada L8S4K1
- André Corvelo
  - New York Genome Center, 101 Avenue of the Americas, New York, NY 10013, USA
- Don J. Melnick
  - Department of Ecology, Evolution, and Environmental Biology, Columbia University, 10th floor Schermerhorn Extension, 119th Street and Amsterdam Avenue, New York, NY 10027, USA
9
Osada N. Genetic diversity in humans and non-human primates and its evolutionary consequences. Genes Genet Syst 2016; 90:133-45. [PMID: 26510568 DOI: 10.1266/ggs.90.133] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/23/2022]
Abstract
Genetic diversity is a key parameter in population genetics and is important for understanding the process of evolution and for the development of appropriate conservation strategies. Recent advances in sequencing technology have enabled the measurement of genetic diversity of various organisms at the nucleotide level and on a genome-wide scale, yielding more precise estimates than were previously achievable. In this review, I have compiled and summarized the estimates of genetic diversity in humans and non-human primates based on recent genome-wide studies. Although studies on population genetics demonstrated fluctuations in population sizes over time, general patterns have emerged. As shown previously, genetic diversity in humans is one of the lowest among primates; however, certain other primate species exhibit genetic diversity that is comparable to or even lower than that in humans. There exists greater than 10-fold variation in genetic diversity among primate species, and I found weak correlation with species fecundity but not with body or propagule size. I further discuss the potential evolutionary consequences of population size decline on the evolution of primate species. The level of genetic diversity negatively correlates with the ratio of non-synonymous to synonymous polymorphisms in a population, suggesting that proportionally greater numbers of slightly deleterious mutations segregate in small rather than large populations. Although population size decline is likely to promote the fixation of slightly deleterious mutations, there are molecular mechanisms, such as compensatory mutations at various molecular levels, which may prevent fitness decline at the population level. The effects of slightly deleterious mutations from theoretical and empirical studies and their relevance to conservation biology are also discussed in this review.
Affiliation(s)
- Naoki Osada
  - Department of Population Genetics, National Institute of Genetics
10
Ghenu AH, Bolker BM, Melnick DJ, Evans BJ. Multicopy gene family evolution on primate Y chromosomes. BMC Genomics 2016; 17:157. [PMID: 26925773 PMCID: PMC4772468 DOI: 10.1186/s12864-015-2187-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 07/21/2015] [Accepted: 11/02/2015] [Indexed: 12/12/2022]
Abstract
Background: The primate Y chromosome is distinguished by a lack of inter-chromosomal recombination along most of its length, extensive gene loss, and a prevalence of repetitive elements. A group of genes on the male-specific portion of the Y chromosome known as the “ampliconic genes” are present in multiple copies that are sometimes part of palindromes, and that undergo a form of intra-chromosomal recombination called gene conversion, wherein the nucleotides of one copy are homogenized by those of another. With the aim of further understanding gene family evolution of these genes, we collected nucleotide sequence and gene copy number information for several species of papionin monkey. We then tested for evidence of gene conversion, and developed a novel statistical framework to evaluate alternative models of gene family evolution using our data combined with other information from a human, a chimpanzee, and a rhesus macaque.
Results: Our results (i) recovered evidence for several novel examples of gene conversion in papionin monkeys and indicate that (ii) ampliconic gene families evolve faster than autosomal gene families and than single-copy genes on the Y chromosome and that (iii) Y-linked singleton and autosomal gene families evolved faster in humans and chimps than they do in the other Old World Monkey lineages we studied.
Conclusions: Rapid evolution of ampliconic genes cannot be attributed solely to residence on the Y chromosome, nor to variation between primate lineages in the rate of gene family evolution. Instead other factors, such as natural selection and gene conversion, appear to play a role in driving temporal and genomic evolutionary heterogeneity in primate gene families.
Affiliation(s)
- Ana-Hermina Ghenu
  - Biology Department, McMaster University, 1280 Main Street West, Hamilton, L8S 4K1, Canada
- Benjamin M Bolker
  - Biology Department, McMaster University, 1280 Main Street West, Hamilton, L8S 4K1, Canada
  - Department of Mathematics & Statistics, McMaster University, 1280 Main Street West, Hamilton, L8S 4K1, Canada
- Don J Melnick
  - Department of Ecology, Evolution, and Environmental Biology, Columbia University, 10th Floor Schermerhorn Extension, New York, 10027, USA
- Ben J Evans
  - Biology Department, McMaster University, 1280 Main Street West, Hamilton, L8S 4K1, Canada
11
Abstract
High-throughput techniques based on restriction site-associated DNA sequencing (RADseq) are enabling the low-cost discovery and genotyping of thousands of genetic markers for any species, including non-model organisms, which is revolutionizing ecological, evolutionary and conservation genetics. Technical differences among these methods lead to important considerations for all steps of genomics studies, from the specific scientific questions that can be addressed, and the costs of library preparation and sequencing, to the types of bias and error inherent in the resulting data. In this Review, we provide a comprehensive discussion of RADseq methods to aid researchers in choosing among the many different approaches and avoiding erroneous scientific conclusions from RADseq data, a problem that has plagued other genetic marker types in the past.
12
Evans BJ, Kwon T. Molecular Polymorphism and Divergence of Duplicated Genes in Tetraploid African Clawed Frogs (Xenopus). Cytogenet Genome Res 2015; 145:243-52. [PMID: 26066830 DOI: 10.1159/000431108] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
Abstract
Genome duplication creates redundancy in proteins and their interaction networks, and subsequent smaller-scale gene duplication can further amplify genetic redundancy. Mutations then lead to the loss, maintenance or functional divergence of duplicated genes. Genome duplication occurred many times in African clawed frogs (genus Xenopus), and almost all extant species in this group evolved from a polyploid ancestor. To better understand the nature of selective constraints in a polyploid genome, we examined molecular polymorphism and divergence of duplicates and single-copy genes in 2 tetraploid African clawed frog species, Xenopus laevis and X. victorianus. We found that molecular polymorphism in the coding regions of putative duplicated genes was higher than in singletons, but not significantly so. Our findings also suggest that transcriptome evolution in polyploids is influenced by variation in the genome-wide mutation rate, and do not reject the hypothesis that gene dosage balance is also important.
Affiliation(s)
- Ben J Evans
  - Department of Biology, McMaster University, Hamilton, Ont., Canada
13
Harvey MG, Judy CD, Seeholzer GF, Maley JM, Graves GR, Brumfield RT. Similarity thresholds used in DNA sequence assembly from short reads can reduce the comparability of population histories across species. PeerJ 2015; 3:e895. [PMID: 25922792 PMCID: PMC4411482 DOI: 10.7717/peerj.895] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 02/27/2015] [Accepted: 03/27/2015] [Indexed: 01/19/2023]
Abstract
Comparing inferences among datasets generated using short read sequencing may provide insight into the concerted impacts of divergence, gene flow and selection across organisms, but comparisons are complicated by biases introduced during dataset assembly. Sequence similarity thresholds allow the de novo assembly of short reads into clusters of alleles representing different loci, but the resulting datasets are sensitive to both the similarity threshold used and to the variation naturally present in the organism under study. Thresholds that require high sequence similarity among reads for assembly (stringent thresholds) as well as highly variable species may result in datasets in which divergent alleles are lost or divided into separate loci ('over-splitting'), whereas liberal thresholds increase the risk of paralogous loci being combined into a single locus ('under-splitting'). Comparisons among datasets or species are therefore potentially biased if different similarity thresholds are applied or if the species differ in levels of within-lineage genetic variation. We examine the impact of a range of similarity thresholds on assembly of empirical short read datasets from populations of four different non-model bird lineages (species or species pairs) with different levels of genetic divergence. We find that, in all species, stringent similarity thresholds result in fewer alleles per locus than more liberal thresholds, which appears to be the result of high levels of over-splitting. The frequency of putative under-splitting, conversely, is low at all thresholds. Inferred genetic distances between individuals, gene tree depths, and estimates of the ancestral mutation-scaled effective population size (θ) differ depending upon the similarity threshold applied. Relative differences in inferences across species differ even when the same threshold is applied, but may be dramatically different when datasets assembled under different thresholds are compared. 
These differences not only complicate comparisons across species, but also preclude the application of standard mutation rates for parameter calibration. We suggest some best practices for assembling short read data to maximize comparability, such as using more liberal thresholds and examining the impact of different thresholds on each dataset.
Affiliation(s)
- Michael G. Harvey
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Caroline Duffie Judy
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
  - Department of Vertebrate Zoology, MRC-116, National Museum of Natural History, Smithsonian Institution, Washington, D.C., USA
- Glenn F. Seeholzer
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- James M. Maley
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
  - Moore Laboratory of Zoology, Occidental College, Los Angeles, CA, USA
- Gary R. Graves
  - Department of Vertebrate Zoology, MRC-116, National Museum of Natural History, Smithsonian Institution, Washington, D.C., USA
  - Center for Macroecology, Evolution and Climate, Natural History Museum of Denmark, University of Copenhagen, Copenhagen Ø, Denmark
- Robb T. Brumfield
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
14
Abstract
Some species exhibit very high levels of DNA sequence variability; there is also evidence for the existence of heritable epigenetic variants that experience state changes at a much higher rate than sequence variants. In both cases, the resulting high diversity levels within a population (hyperdiversity) mean that standard population genetics methods are not trustworthy. We analyze a population genetics model that incorporates purifying selection, reversible mutations, and genetic drift, assuming a stationary population size. We derive analytical results for both population parameters and sample statistics and discuss their implications for studies of natural genetic and epigenetic variation. In particular, we find that (1) many more intermediate-frequency variants are expected than under standard models, even with moderately strong purifying selection, and (2) rates of evolution under purifying selection may be close to, or even exceed, neutral rates. These findings are related to empirical studies of sequence and epigenetic variation.