1
Harris M, Kim BY, Garud N. Enrichment of hard sweeps on the X chromosome compared to autosomes in six Drosophila species. Genetics 2024; 226:iyae019. [PMID: 38366786 PMCID: PMC10990427 DOI: 10.1093/genetics/iyae019] [Received: 12/06/2023] [Revised: 01/17/2024] [Accepted: 01/18/2024] [Indexed: 02/18/2024] Open
Abstract
The X chromosome, being hemizygous in males, is exposed one-third of the time increasing the visibility of new mutations to natural selection, potentially leading to different evolutionary dynamics than autosomes. Recently, we found an enrichment of hard selective sweeps over soft selective sweeps on the X chromosome relative to the autosomes in a North American population of Drosophila melanogaster. To understand whether this enrichment is a universal feature of evolution on the X chromosome, we analyze diversity patterns across 6 commonly studied Drosophila species. We find an increased proportion of regions with steep reductions in diversity and elevated homozygosity on the X chromosome compared to autosomes. To assess if these signatures are consistent with positive selection, we simulate a wide variety of evolutionary scenarios spanning variations in demography, mutation rate, recombination rate, background selection, hard sweeps, and soft sweeps and find that the diversity patterns observed on the X are most consistent with hard sweeps. Our findings highlight the importance of sex chromosomes in driving evolutionary processes and suggest that hard sweeps have played a significant role in shaping diversity patterns on the X chromosome across multiple Drosophila species.
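The two signals named in this abstract, reduced diversity and elevated homozygosity, can be illustrated with a minimal Python sketch. It assumes phased haplotypes encoded as equal-length strings of `0`/`1` for a single genomic window; the function names and the toy data are hypothetical and are not taken from the authors' analysis pipeline.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean pairwise differences per site across all haplotype pairs."""
    n = len(haplotypes)
    if n < 2:
        return 0.0
    length = len(haplotypes[0])
    total_diffs = 0
    for a, b in combinations(haplotypes, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


def haplotype_homozygosity(haplotypes):
    """Sum of squared haplotype frequencies in the window."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


# Toy windows (made-up data): one dominated by a single haplotype,
# one where every haplotype is distinct.
swept_like = ["0000"] * 9 + ["0100"]
diverse = ["0000", "0101", "1010", "1111"]

print(nucleotide_diversity(swept_like), haplotype_homozygosity(swept_like))
print(nucleotide_diversity(diverse), haplotype_homozygosity(diverse))
```

A window dominated by one haplotype shows low diversity and high homozygosity, which is the qualitative pattern the abstract associates with sweeps. Distinguishing hard sweeps from soft sweeps or from demographic effects requires the simulation-based comparisons the abstract describes, which this sketch does not attempt.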
Affiliation(s)
- Mariana Harris
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Bernard Y Kim
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
- Nandita Garud
  - Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA 90095, USA
  - Department of Human Genetics, University of California Los Angeles, Los Angeles, CA 90095, USA
2
Song H, Chu J, Li W, Li X, Fang L, Han J, Zhao S, Ma Y. A Novel Approach Utilizing Domain Adversarial Neural Networks for the Detection and Classification of Selective Sweeps. Adv Sci (Weinh) 2024; 11:e2304842. [PMID: 38308186 PMCID: PMC11005742 DOI: 10.1002/advs.202304842] [Received: 07/17/2023] [Revised: 01/10/2024] [Indexed: 02/04/2024]
Abstract
The identification and classification of selective sweeps are of great significance for improving the understanding of biological evolution and exploring opportunities for precision medicine and genetic improvement. Here, a domain adaptation sweep detection and classification (DASDC) method is presented to balance the alignment of two domains and the classification performance through a domain-adversarial neural network and its adversarial learning modules. DASDC effectively addresses the issue of mismatch between training data and real genomic data in deep learning models, leading to a significant improvement in its generalization capability, prediction robustness, and accuracy. The DASDC method demonstrates improved identification performance compared to existing methods and excels in classification performance, particularly in scenarios where there is a mismatch between application data and training data. The successful implementation of DASDC in real data of three distinct species highlights its potential as a useful tool for identifying crucial functional genes and investigating adaptive evolutionary mechanisms, particularly with the increasing availability of genomic data.
Affiliation(s)
- Hui Song
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Jinyu Chu
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Wangjiao Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Xinyun Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Lingzhao Fang
  - Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus 8000, Denmark
- Jianlin Han
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - CAAS‐ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, China
  - Livestock Genetics Program, International Livestock Research Institute (ILRI), Nairobi 00100, Kenya
- Shuhong Zhao
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou 510642, China
- Yunlong Ma
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou 510642, China
3
Lyulina AS, Liu Z, Good BH. Linkage equilibrium between rare mutations. bioRxiv 2024:2024.03.28.587282. [PMID: 38617331 PMCID: PMC11014483 DOI: 10.1101/2024.03.28.587282] [Indexed: 04/16/2024]
Abstract
Recombination breaks down genetic linkage by reshuffling existing variants onto new genetic backgrounds. These dynamics are traditionally quantified by examining the correlations between alleles, and how they decay as a function of the recombination rate. However, the magnitudes of these correlations are strongly influenced by other evolutionary forces like natural selection and genetic drift, making it difficult to tease out the effects of recombination. Here we introduce a theoretical framework for analyzing an alternative family of statistics that measure the homoplasy produced by recombination. We derive analytical expressions that predict how these statistics depend on the rates of recombination and recurrent mutation, the strength of negative selection and genetic drift, and the present-day frequencies of the mutant alleles. We find that the degree of homoplasy can strongly depend on this frequency scale, which reflects the underlying timescales over which these mutations occurred. We show how these scaling properties can be used to isolate the effects of recombination, and discuss their implications for the rates of horizontal gene transfer in bacteria.
Affiliation(s)
- Anastasia S Lyulina
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Zhiru Liu
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Benjamin H Good
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub - San Francisco, San Francisco, CA 94158, USA
4
Thom G, Moreira LR, Batista R, Gehara M, Aleixo A, Smith BT. Genomic Architecture Predicts Tree Topology, Population Structuring, and Demographic History in Amazonian Birds. Genome Biol Evol 2024; 16:evae002. [PMID: 38236173 PMCID: PMC10823491 DOI: 10.1093/gbe/evae002] [Received: 05/09/2023] [Revised: 10/26/2023] [Accepted: 12/12/2023] [Indexed: 01/19/2024] Open
Abstract
Geographic barriers are frequently invoked to explain genetic structuring across the landscape. However, inferences on the spatial and temporal origins of population variation have been largely limited to evolutionary neutral models, ignoring the potential role of natural selection and intrinsic genomic processes known as genomic architecture in producing heterogeneity in differentiation across the genome. To test how variation in genomic characteristics (e.g. recombination rate) impacts our ability to reconstruct general patterns of differentiation between species that cooccur across geographic barriers, we sequenced the whole genomes of multiple bird populations that are distributed across rivers in southeastern Amazonia. We found that phylogenetic relationships within species and demographic parameters varied across the genome in predictable ways. Genetic diversity was positively associated with recombination rate and negatively associated with species tree support. Gene flow was less pervasive in genomic regions of low recombination, making these windows more likely to retain patterns of population structuring that matched the species tree. We further found that approximately a third of the genome showed evidence of selective sweeps and linked selection, skewing genome-wide estimates of effective population sizes and gene flow between populations toward lower values. In sum, we showed that the effects of intrinsic genomic characteristics and selection can be disentangled from neutral processes to elucidate spatial patterns of population differentiation.
Affiliation(s)
- Gregory Thom
  - Department of Ornithology, American Museum of Natural History, New York, NY, USA
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Lucas Rocha Moreira
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
  - Department of Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Romina Batista
  - Programa de Coleções Biológicas, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil
  - School of Science, Engineering and Environment, University of Salford, Manchester, UK
- Marcelo Gehara
  - Department of Earth and Environmental Sciences, Rutgers University, Newark, NJ, USA
- Alexandre Aleixo
  - Finnish Museum of Natural History, University of Helsinki, Helsinki, Finland
  - Department of Environmental Genomics, Instituto Tecnológico Vale, Belém, Brazil
- Brian Tilston Smith
  - Department of Ornithology, American Museum of Natural History, New York, NY, USA
5
Harris M, Kim B, Garud N. Enrichment of hard sweeps on the X chromosome compared to autosomes in six Drosophila species. bioRxiv 2023:2023.06.21.545888. [PMID: 38106201 PMCID: PMC10723260 DOI: 10.1101/2023.06.21.545888] [Indexed: 12/19/2023]
Abstract
The X chromosome, being hemizygous in males, is exposed one third of the time increasing the visibility of new mutations to natural selection, potentially leading to different evolutionary dynamics than autosomes. Recently, we found an enrichment of hard selective sweeps over soft selective sweeps on the X chromosome relative to the autosomes in a North American population of Drosophila melanogaster. To understand whether this enrichment is a universal feature of evolution on the X chromosome, we analyze diversity patterns across six commonly studied Drosophila species. We find an increased proportion of regions with steep reductions in diversity and elevated homozygosity on the X chromosome compared to autosomes. To assess if these signatures are consistent with positive selection, we simulate a wide variety of evolutionary scenarios spanning variations in demography, mutation rate, recombination rate, background selection, hard sweeps, and soft sweeps, and find that the diversity patterns observed on the X are most consistent with hard sweeps. Our findings highlight the importance of sex chromosomes in driving evolutionary processes and suggest that hard sweeps have played a significant role in shaping diversity patterns on the X chromosome across multiple Drosophila species.
Affiliation(s)
- Mariana Harris
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Bernard Kim
  - Department of Biology, Stanford University, Stanford, California, United States of America
- Nandita Garud
  - Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Human Genetics, University of California, Los Angeles, California, United States of America
6
Schrider DR. Allelic gene conversion softens selective sweeps. bioRxiv 2023:2023.12.05.570141. [PMID: 38106127 PMCID: PMC10723294 DOI: 10.1101/2023.12.05.570141] [Indexed: 12/19/2023]
Abstract
The prominence of positive selection, in which beneficial mutations are favored by natural selection and rapidly increase in frequency, is a subject of intense debate. Positive selection can result in selective sweeps, in which the haplotype(s) bearing the adaptive allele "sweep" through the population, thereby removing much of the genetic diversity from the region surrounding the target of selection. Two models of selective sweeps have been proposed: classical sweeps, or "hard sweeps", in which a single copy of the adaptive allele sweeps to fixation, and "soft sweeps", in which multiple distinct copies of the adaptive allele leave descendants after the sweep. Soft sweeps can be the outcome of recurrent mutation to the adaptive allele, or the presence of standing genetic variation consisting of multiple copies of the adaptive allele prior to the onset of selection. Importantly, soft sweeps will be common when populations can rapidly adapt to novel selective pressures, either because of a high mutation rate or because adaptive alleles are already present. The prevalence of soft sweeps is especially controversial, and it has been noted that selection on standing variation or recurrent mutations may not always produce soft sweeps. Here, we show that the inverse is true: selection on single-origin de novo mutations may often result in an outcome that is indistinguishable from a soft sweep. This is made possible by allelic gene conversion, which "softens" hard sweeps by copying the adaptive allele onto multiple genetic backgrounds, a process we refer to as a "pseudo-soft" sweep. We carried out a simulation study examining the impact of gene conversion on sweeps from a single de novo variant in models of human, Drosophila, and Arabidopsis populations. The fraction of simulations in which gene conversion had produced multiple haplotypes with the adaptive allele upon fixation was appreciable. 
Indeed, under realistic demographic histories and gene conversion rates, even if selection always acts on a single-origin mutation, sweeps involving multiple haplotypes are more likely than hard sweeps in large populations, especially when selection is not extremely strong. Thus, even when the mutation rate is low or there is no standing variation, hard sweeps are expected to be the exception rather than the rule in large populations. These results also imply that the presence of signatures of soft sweeps does not necessarily mean that adaptation has been especially rapid or is not mutation limited.
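The hard versus soft distinction drawn in this abstract turns on how many distinct haplotypes carry the adaptive allele. As a rough illustration, the sketch below counts distinct flanking backgrounds around a focal site in a toy sample. It assumes haplotypes are equal-length `0`/`1` strings with the adaptive allele coded as `1`; the function names and labels are hypothetical, and the code is not the simulation procedure used in the study.

```python
def backgrounds_carrying_allele(haplotypes, site, allele="1"):
    """Return the distinct flanking backgrounds that carry `allele` at `site`."""
    backgrounds = set()
    for hap in haplotypes:
        if hap[site] == allele:
            # Drop the focal site so only the surrounding background is compared.
            backgrounds.add(hap[:site] + hap[site + 1:])
    return backgrounds


def label_sweep(haplotypes, site, allele="1"):
    """Toy label based only on the number of backgrounds bearing the allele."""
    n_backgrounds = len(backgrounds_carrying_allele(haplotypes, site, allele))
    if n_backgrounds == 0:
        return "allele absent"
    if n_backgrounds == 1:
        return "single haplotype (hard-like)"
    return "multiple haplotypes (soft-like)"


# Made-up samples with the adaptive allele at index 1.
print(label_sweep(["010", "010", "010"], site=1))
print(label_sweep(["010", "011", "110"], site=1))
```

A count greater than one cannot by itself say whether the backgrounds arose from recurrent mutation, standing variation, or gene conversion copying a single-origin allele onto new backgrounds, which is the ambiguity the abstract highlights. The sketch also ignores mutations that arise on the swept haplotype after fixation.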
Affiliation(s)
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
7
Gutiérrez-Guerrero YT, Phifer-Rixey M, Nachman MW. Across two continents: the genomic basis of environmental adaptation in house mice (Mus musculus domesticus) from the Americas. bioRxiv 2023:2023.10.30.564674. [PMID: 37961195 PMCID: PMC10634997 DOI: 10.1101/2023.10.30.564674] [Indexed: 11/15/2023]
Abstract
Parallel clines across environmental gradients can be strong evidence of adaptation. House mice (Mus musculus domesticus) were introduced to the Americas by European colonizers and are now widely distributed from Tierra del Fuego to Alaska. Multiple aspects of climate, such as temperature, vary predictably across latitude in the Americas. Past studies of North American populations across latitudinal gradients provided evidence of environmental adaptation in traits related to body size, metabolism, and behavior and identified candidate genes using selection scans. Here, we investigate genomic signals of environmental adaptation on a second continent, South America, and ask whether there is evidence of parallel adaptation across multiple latitudinal transects in the Americas. We first identified loci across the genome showing signatures of selection related to climatic variation in mice sampled across a latitudinal transect in South America, accounting for neutral population structure. Consistent with previous results, most candidate SNPs were in regulatory regions. Genes containing the most extreme outliers relate to traits such as body weight or size, metabolism, immunity, fat, and development or function of the eye as well as traits associated with the cardiovascular and renal systems. We then combined these results with published results from two transects in North America. While most candidate genes were unique to individual transects, we found significant overlap among candidate genes identified independently in the three transects, providing strong evidence of parallel adaptation and identifying genes that likely underlie recent environmental adaptation in house mice across North and South America.
Affiliation(s)
- Yocelyn T. Gutiérrez-Guerrero
  - Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, United States of America
- Megan Phifer-Rixey
  - Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, United States of America
  - Department of Biology, Drexel University, Philadelphia, PA, United States of America
- Michael W. Nachman
  - Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, United States of America
8
Pivirotto AM, Platt A, Patel R, Kumar S, Hey J. Analyses of allele age and fitness impact reveal human beneficial alleles to be older than neutral controls. bioRxiv 2023:2023.10.09.561569. [PMID: 37873438 PMCID: PMC10592680 DOI: 10.1101/2023.10.09.561569] [Indexed: 10/25/2023]
Abstract
A classic population genetic prediction is that alleles experiencing directional selection should swiftly traverse allele frequency space, leaving detectable reductions in genetic variation in linked regions. However, despite this expectation, identifying clear footprints of beneficial allele passage has proven to be surprisingly challenging. We addressed the basic premise underlying this expectation by estimating the ages of large numbers of beneficial and deleterious alleles in a human population genomic data set. Deleterious alleles were found to be young, on average, given their allele frequency. However, beneficial alleles were older on average than non-coding, non-regulatory alleles of the same frequency. This finding is not consistent with directional selection and instead indicates some type of balancing selection. Among derived beneficial alleles, those fixed in the population show higher local recombination rates than those still segregating, consistent with a model in which new beneficial alleles experience an initial period of balancing selection due to linkage disequilibrium with deleterious recessive alleles. Alleles that ultimately fix following a period of balancing selection will leave a modest 'soft' sweep impact on the local variation, consistent with the overall paucity of species-wide 'hard' sweeps in human genomes.
Affiliation(s)
- Alexander Platt
  - Temple University, Department of Biology, Philadelphia PA 19122, USA
  - University of Pennsylvania, Department of Genetics, Philadelphia PA 19104, USA
- Ravi Patel
  - Temple University, Department of Biology, Philadelphia PA 19122, USA
  - Institute for Genomics and Evolutionary Medicine, Temple University, PA 19122, USA
- Sudhir Kumar
  - Temple University, Department of Biology, Philadelphia PA 19122, USA
  - Institute for Genomics and Evolutionary Medicine, Temple University, PA 19122, USA
- Jody Hey
  - Temple University, Department of Biology, Philadelphia PA 19122, USA
9
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. Evolution 2023; 77:2113-2127. [PMID: 37395482 PMCID: PMC10547124 DOI: 10.1093/evolut/qpad120] [Received: 03/20/2023] [Revised: 06/15/2023] [Accepted: 06/30/2023] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modeled by a realistic mutation rate and as part of a realistic distribution of fitness effects, as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modeled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false-positive rates are in excess of true-positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Parul Johri
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Jeffrey D Jensen
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States
10
Whitehouse LS, Schrider DR. Timesweeper: accurately identifying selective sweeps using population genomic time series. Genetics 2023; 224:iyad084. [PMID: 37157914 PMCID: PMC10324941 DOI: 10.1093/genetics/iyad084] [Received: 07/25/2022] [Revised: 07/25/2022] [Accepted: 04/25/2023] [Indexed: 05/10/2023] Open
Abstract
Despite decades of research, identifying selective sweeps, the genomic footprints of positive selection, remains a core problem in population genetics. Of the myriad methods that have been developed to tackle this task, few are designed to leverage the potential of genomic time-series data. This is because in most population genetic studies of natural populations, only a single period of time can be sampled. Recent advancements in sequencing technology, including improvements in extracting and sequencing ancient DNA, have made repeated samplings of a population possible, allowing for more direct analysis of recent evolutionary dynamics. Serial sampling of organisms with shorter generation times has also become more feasible due to improvements in the cost and throughput of sequencing. With these advances in mind, here we present Timesweeper, a fast and accurate convolutional neural network-based tool for identifying selective sweeps in data consisting of multiple genomic samplings of a population over time. Timesweeper analyzes population genomic time-series data by first simulating training data under a demographic model appropriate for the data of interest, training a one-dimensional convolutional neural network on said simulations, and inferring which polymorphisms in this serialized data set were the direct target of a completed or ongoing selective sweep. We show that Timesweeper is accurate under multiple simulated demographic and sampling scenarios, identifies selected variants with high resolution, and estimates selection coefficients more accurately than existing methods. 
In sum, we show that more accurate inferences about natural selection are possible when genomic time-series data are available; such data will continue to proliferate in coming years due to both the sequencing of ancient samples and repeated samplings of extant populations with faster generation times, as well as experimentally evolved populations where time-series data are often generated. Methodological advances such as Timesweeper thus have the potential to help resolve the controversy over the role of positive selection in the genome. We provide Timesweeper as a Python package for use by the community.
Affiliation(s)
- Logan S Whitehouse
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27514, USA
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27514, USA
11
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. bioRxiv 2023:2023.06.15.545166. [PMID: 37398347 PMCID: PMC10312679 DOI: 10.1101/2023.06.15.545166] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modelled by a realistic mutation rate and as part of a realistic distribution of fitness effects (DFE), as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modelled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false positive rates are in excess of true positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Teaser Text
Outlier-based genomic scans have proven a popular approach for identifying loci that have potentially experienced recent positive selection. However, it has previously been shown that an evolutionarily appropriate baseline model that incorporates non-equilibrium population histories, purifying and background selection, and variation in mutation and recombination rates is necessary to reduce often extreme false positive rates when performing genomic scans. Here we evaluate the power to detect recurrent selective sweeps using common SFS-based and haplotype-based methods under these increasingly realistic models. We find that while these appropriate evolutionary baselines are essential to reduce false positive rates, the power to accurately detect recurrent selective sweeps is generally low across much of the biologically relevant parameter space.
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Parul Johri
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
  - Present address: Department of Biology, Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
12
Terbot JW, Johri P, Liphardt SW, Soni V, Pfeifer SP, Cooper BS, Good JM, Jensen JD. Developing an appropriate evolutionary baseline model for the study of SARS-CoV-2 patient samples. PLoS Pathog 2023; 19:e1011265. [PMID: 37018331 PMCID: PMC10075409 DOI: 10.1371/journal.ppat.1011265] [Indexed: 04/06/2023] Open
Abstract
Over the past 3 years, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has spread through human populations in several waves, resulting in a global health crisis. In response, genomic surveillance efforts have proliferated in the hopes of tracking and anticipating the evolution of this virus, resulting in millions of patient isolates now being available in public databases. Yet, while there is a tremendous focus on identifying newly emerging adaptive viral variants, this quantification is far from trivial. Specifically, multiple co-occurring and interacting evolutionary processes are constantly in operation and must be jointly considered and modeled in order to perform accurate inference. We here outline critical individual components of such an evolutionary baseline model (mutation rates, recombination rates, the distribution of fitness effects, infection dynamics, and compartmentalization) and describe the current state of knowledge pertaining to the related parameters of each in SARS-CoV-2. We close with a series of recommendations for future clinical sampling, model construction, and statistical analysis.
Affiliation(s)
- John W Terbot
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Parul Johri
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Schuyler W Liphardt
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Vivak Soni
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Susanne P Pfeifer
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Brandon S Cooper
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Jeffrey M Good
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Jeffrey D Jensen
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
13
Jensen JD. Population genetic concerns related to the interpretation of empirical outliers and the neglect of common evolutionary processes. Heredity (Edinb) 2023; 130:109-110. [PMID: 36829044 PMCID: PMC9981695 DOI: 10.1038/s41437-022-00575-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 02/26/2023] Open
Affiliation(s)
- Jeffrey D Jensen
- School of Life Science, Arizona State University, Tempe, AZ, USA.
14
Harris M, Garud NR. Enrichment of Hard Sweeps on the X Chromosome in Drosophila melanogaster. Mol Biol Evol 2022; 40:6955808. [PMID: 36546413 PMCID: PMC9825254 DOI: 10.1093/molbev/msac268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 11/11/2022] [Accepted: 12/05/2022] [Indexed: 12/24/2022] Open
Abstract
The characteristic properties of the X chromosome, such as male hemizygosity and its unique inheritance pattern, expose it to natural selection in a way that can be different from the autosomes. Here, we investigate the differences in the tempo and mode of adaptation on the X chromosome and autosomes in a population of Drosophila melanogaster. Specifically, we test the hypothesis that due to hemizygosity and a lower effective population size on the X, the relative proportion of hard sweeps, which are expected when adaptation is gradual, compared with soft sweeps, which are expected when adaptation is rapid, is greater on the X than on the autosomes. We quantify the incidence of hard versus soft sweeps in North American D. melanogaster population genomic data with haplotype homozygosity statistics and find an enrichment of the proportion of hard versus soft sweeps on the X chromosome compared with the autosomes, confirming predictions we make from simulations. Understanding these differences may enable a deeper understanding of how important phenotypes arise as well as the impact of fundamental evolutionary parameters on adaptation, such as dominance, sex-specific selection, and sex-biased demography.
Affiliation(s)
- Mariana Harris
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA
15
Souilmi Y, Tobler R, Johar A, Williams M, Grey ST, Schmidt J, Teixeira JC, Rohrlach A, Tuke J, Johnson O, Gower G, Turney C, Cox M, Cooper A, Huber CD. Admixture has obscured signals of historical hard sweeps in humans. Nat Ecol Evol 2022; 6:2003-2015. [PMID: 36316412 PMCID: PMC9715430 DOI: 10.1038/s41559-022-01914-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/21/2021] [Accepted: 09/16/2022] [Indexed: 11/06/2022]
Abstract
The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In particular, admixture between diverged populations can distort or hide prior selection events in modern genomes, though this process is not explicitly accounted for in most selection studies despite its apparent ubiquity in humans and other species. Through analyses of ancient and modern human genomes, we show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes. Our results imply that this canonical mode of selection has probably been underappreciated in the evolutionary history of humans and suggest that our current understanding of the tempo and mode of selection in natural populations may be inaccurate.
Affiliation(s)
- Yassine Souilmi
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia.
- Raymond Tobler
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Evolution of Cultural Diversity Initiative, Australian National University, Canberra, Australian Capital Territory, Australia
- Angad Johar
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA
- Matthew Williams
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Shane T Grey
- Transplantation Immunology Group, Immunology Division, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
- St Vincent's Clinical School, Faculty of Medicine, UNSW, Darlinghurst, New South Wales, Australia
- Joshua Schmidt
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- João C Teixeira
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Adam Rohrlach
- ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany
- Jonathan Tuke
- ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
- School of Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Olivia Johnson
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Graham Gower
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Chris Turney
- Chronos 14Carbon-Cycle Facility and Earth and Sustainability Science Research Centre, University of New South Wales, Sydney, New South Wales, Australia
- Murray Cox
- Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Alan Cooper
- South Australian Museum, Adelaide, South Australia, Australia
- BlueSky Genetics, Ashton, South Australia, Australia
- Christian D Huber
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Biology, Penn State University, University Park, PA, USA
16
Mouterde M, Daali Y, Rollason V, Čížková M, Mulugeta A, Al Balushi KA, Fakis G, Constantinidis TC, Al-Thihli K, Černá M, Makonnen E, Boukouvala S, Al-Yahyaee S, Yimer G, Černý V, Desmeules J, Poloni ES. Joint Analysis of Phenotypic and Genomic Diversity Sheds Light on the Evolution of Xenobiotic Metabolism in Humans. Genome Biol Evol 2022; 14:6852765. [PMID: 36445690 PMCID: PMC9750130 DOI: 10.1093/gbe/evac167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 11/03/2022] [Accepted: 11/22/2022] [Indexed: 11/30/2022] Open
Abstract
Variation in genes involved in the absorption, distribution, metabolism, and excretion of drugs (ADME) can influence individual response to a therapeutic treatment. The study of ADME genetic diversity in human populations has led to evolutionary hypotheses of adaptation to distinct chemical environments. Population differentiation in measured drug metabolism phenotypes is, however, scarcely documented, often indirectly estimated via genotype-predicted phenotypes. We administered seven probe compounds devised to target six cytochrome P450 enzymes and the P-glycoprotein (P-gp) activity to assess phenotypic variation in four populations along a latitudinal transect spanning over Africa, the Middle East, and Europe (349 healthy Ethiopian, Omani, Greek, and Czech volunteers). We demonstrate significant population differentiation for all phenotypes except the one measuring CYP2D6 activity. Genome-wide association studies (GWAS) evidenced that the variability of phenotypes measuring CYP2B6, CYP2C9, CYP2C19, and CYP2D6 activity was associated with genetic variants linked to the corresponding encoding genes, and additional genes for the latter three. Instead, GWAS did not indicate any association between genetic diversity and the phenotypes measuring CYP1A2, CYP3A4, and P-gp activity. Genome scans of selection highlighted multiple candidate regions, a few of which included ADME genes, but none overlapped with the GWAS candidates. Our results suggest that different mechanisms have been shaping the evolution of these phenotypes, including phenotypic plasticity, and possibly some form of balancing selection. We discuss how these contrasting results highlight the diverse evolutionary trajectories of ADME genes and proteins, consistent with the wide spectrum of both endogenous and exogenous molecules that are their substrates.
Affiliation(s)
- Youssef Daali
- Division of Clinical Pharmacology and Toxicology, Geneva University Hospitals and University of Geneva, Geneva, Switzerland
- Victoria Rollason
- Division of Clinical Pharmacology and Toxicology, Geneva University Hospitals and University of Geneva, Geneva, Switzerland
- Martina Čížková
- Institute of Archaeology of the Academy of Sciences of the Czech Republic, Prague, Czech Republic
- Anwar Mulugeta
- Department of Pharmacology and Clinical Pharmacy, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Khalid A Al Balushi
- College of Pharmacy, National University of Science and Technology, Muscat, Sultanate of Oman
- Giannoulis Fakis
- Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, Greece
- Khalid Al-Thihli
- Department of Genetics, Sultan Qaboos University Hospital, Muscat, Sultanate of Oman
- Marie Černá
- Department of Medical Genetics, Third Faculty of Medicine, Charles University, Prague, Czech Republic
- Eyasu Makonnen
- Department of Pharmacology and Clinical Pharmacy, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Center for Innovative Drug Development and Therapeutic Trials for Africa, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Sotiria Boukouvala
- Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, Greece
- Said Al-Yahyaee
- Department of Genetics, College of Medicine and Health Sciences, Sultan Qaboos University, Muscat, Sultanate of Oman
- Getnet Yimer
- Center for Global Genomics & Health Equity, Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Viktor Černý
- Department of Anthropology and Human Genetics, Faculty of Science, Charles University, Prague, Czech Republic
- Jules Desmeules
- Division of Clinical Pharmacology and Toxicology, Geneva University Hospitals and University of Geneva, Geneva, Switzerland
17
Lotterhos KE, Fitzpatrick MC, Blackmon H. Simulation Tests of Methods in Evolution, Ecology, and Systematics: Pitfalls, Progress, and Principles. Annu Rev Ecol Evol Syst 2022; 53:113-136. [PMID: 38107485 PMCID: PMC10723108 DOI: 10.1146/annurev-ecolsys-102320-093722] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/19/2023]
Abstract
Complex statistical methods are continuously developed across the fields of ecology, evolution, and systematics (EES). These fields, however, lack standardized principles for evaluating methods, which has led to high variability in the rigor with which methods are tested, a lack of clarity regarding their limitations, and the potential for misapplication. In this review, we illustrate the common pitfalls of method evaluations in EES, the advantages of testing methods with simulated data, and best practices for method evaluations. We highlight the difference between method evaluation and validation and review how simulations, when appropriately designed, can refine the domain in which a method can be reliably applied. We also discuss the strengths and limitations of different evaluation metrics. The potential for misapplication of methods would be greatly reduced if funding agencies, reviewers, and journals required principled method evaluation.
Affiliation(s)
- Katie E Lotterhos
- Department of Marine and Environmental Sciences, Northeastern University, Nahant, Massachusetts, USA
- Matthew C Fitzpatrick
- Appalachian Lab, University of Maryland Center for Environmental Science, Frostburg, Maryland, USA
- Heath Blackmon
- Department of Biology, Texas A&M University, College Station, Texas, USA
18
Abstract
We discuss the genetic, demographic, and selective forces that are likely to be at play in restricting observed levels of DNA sequence variation in natural populations to a much smaller range of values than would be expected from the distribution of census population sizes alone (Lewontin's Paradox). While several processes that have previously been strongly emphasized must be involved, including the effects of direct selection and genetic hitchhiking, it seems unlikely that they are sufficient to explain this observation without contributions from other factors. We highlight a potentially important role for the less-appreciated contribution of population size change; specifically, the likelihood that many species and populations may be quite far from reaching the relatively high equilibrium diversity values that would be expected given their current census sizes.
Affiliation(s)
- Brian Charlesworth
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
19
Johri P, Eyre-Walker A, Gutenkunst RN, Lohmueller KE, Jensen JD. On the prospect of achieving accurate joint estimation of selection with population history. Genome Biol Evol 2022; 14:6604401. [PMID: 35675379 PMCID: PMC9254643 DOI: 10.1093/gbe/evac088] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Accepted: 06/02/2022] [Indexed: 11/15/2022] Open
Abstract
As both natural selection and population history can affect genome-wide patterns of variation, disentangling the contributions of each has remained as a major challenge in population genetics. We here discuss historical and recent progress towards this goal—highlighting theoretical and computational challenges that remain to be addressed, as well as inherent difficulties in dealing with model complexity and model violations—and offer thoughts on potentially fruitful next steps.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
20
Kemp SA, Charles OJ, Derache A, Smidt W, Martin DP, Iwuji C, Adamson J, Govender K, de Oliveira T, Dabis F, Pillay D, Goldstein RA, Gupta RK. HIV-1 Evolutionary Dynamics under Nonsuppressive Antiretroviral Therapy. mBio 2022; 13:e0026922. [PMID: 35446121 PMCID: PMC9239331 DOI: 10.1128/mbio.00269-22] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/09/2022] [Accepted: 03/28/2022] [Indexed: 12/19/2022] Open
Abstract
Prolonged virologic failure on 2nd-line protease inhibitor (PI)-based antiretroviral therapy (ART) without emergence of major protease mutations is well recognized and provides an opportunity to study within-host evolution in long-term viremic individuals. Using next-generation sequencing and in silico haplotype reconstruction, we analyzed whole-genome sequences from longitudinal plasma samples of eight chronically infected HIV-1-positive individuals failing 2nd-line regimens from the French National Agency for AIDS and Viral Hepatitis Research (ANRS) 12249 Treatment as Prevention (TasP) trial. On nonsuppressive ART, there were large fluctuations in synonymous and nonsynonymous variant frequencies despite stable viremia. Reconstructed haplotypes provided evidence for selective sweeps during periods of partial adherence, and viral haplotype competition, during periods of low drug exposure. Drug resistance mutations in reverse transcriptase (RT) were used as markers of viral haplotypes in the reservoir, and their distribution over time indicated recombination. We independently observed linkage disequilibrium decay, indicative of recombination. These data highlight dramatic changes in virus population structure that occur during stable viremia under nonsuppressive ART.
IMPORTANCE
HIV-1 infections are most commonly initiated with a single founder virus and are characterized by extensive inter- and intraparticipant genetic diversity. However, existing literature on HIV-1 intrahost population dynamics is largely limited to untreated infections, predominantly in subtype B-infected individuals. The manuscript characterizes viral population dynamics in long-term viremic treatment-experienced individuals, which has not been previously characterized. These data are particularly relevant for understanding HIV dynamics but can also be applied to other RNA viruses. With this unique data set we propose that the virus is highly unstable, and we have found compelling evidence of HIV-1 within-host viral diversification, recombination, and haplotype competition during nonsuppressive ART.
Affiliation(s)
- Steven A. Kemp
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge, United Kingdom
- Oscar J. Charles
- Division of Infection & Immunity, University College London, London, United Kingdom
- Anne Derache
- Africa Health Research Institute, Durban, South Africa
- Werner Smidt
- Africa Health Research Institute, Durban, South Africa
- Darren P. Martin
- Department of Integrative Biomedical Sciences, University of Cape Town, Cape Town, South Africa
- Collins Iwuji
- Africa Health Research Institute, Durban, South Africa
- Research Department of Infection and Population Health, University College London, United Kingdom
- John Adamson
- Africa Health Research Institute, Durban, South Africa
- Tulio de Oliveira
- Africa Health Research Institute, Durban, South Africa
- KRISP - KwaZulu-Natal Research and Innovation Sequencing Platform, UKZN, Durban, South Africa
- Francois Dabis
- INSERM U1219-Centre Inserm Bordeaux Population Health, Université de Bordeaux, France
- Université de Bordeaux, ISPED, Centre INSERM U1219-Bordeaux Population Health, France
- Deenan Pillay
- Division of Infection & Immunity, University College London, London, United Kingdom
- Richard A. Goldstein
- Division of Infection & Immunity, University College London, London, United Kingdom
- Ravindra K. Gupta
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge, United Kingdom
- Africa Health Research Institute, Durban, South Africa
21
Rougemont Q, Perrier C, Besnard AL, Lebel I, Abdallah Y, Feunteun E, Réveillac E, Lasne E, Acou A, Nachón DJ, Cobo F, Evanno G, Baglinière JL, Launey S. Population genetics reveals divergent lineages and ongoing hybridization in a declining migratory fish species complex. Heredity (Edinb) 2022. [PMID: 35665777 DOI: 10.1038/s41437-022-00547-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2021] [Revised: 05/13/2022] [Accepted: 05/16/2022] [Indexed: 11/08/2022] Open
Abstract
Deciphering the effects of historical and recent demographic processes responsible for the spatial patterns of genetic diversity and structure is a key objective in evolutionary and conservation biology. Using population genetic analyses, we investigated the demographic history, the contemporary genetic diversity and structure, and the occurrence of hybridization and introgression of two species of anadromous fish with contrasting life history strategies and which have undergone recent demographic declines, the allis shad (Alosa alosa) and the twaite shad (Alosa fallax). We genotyped 706 individuals from 20 rivers and 5 sites at sea in Southern Europe at thirteen microsatellite markers. Genetic structure between populations was lower for the nearly semelparous species A. alosa, which disperses greater distances compared to the iteroparous species, A. fallax. Individuals caught at sea were assigned at the river level for A. fallax and at the region level for A. alosa. Using an approximate Bayesian computation framework, we inferred that the most likely long-term historical divergence scenario between both species and lineages involved historical separation followed by secondary contact accompanied by strong population size decline. Accordingly, we found evidence for contemporary hybridization and bidirectional introgression due to gene flow between both species and lineages. Moreover, our results support the existence of at least one distinct species in the Mediterranean Sea: A. agone in Golfe du Lion area, and another divergent lineage in Corsica. Overall, our results shed light on the interplay between historical and recent demographic processes and life history strategies in shaping population genetic diversity and structure of closely related species. The recent demographic decline of these species' populations and their hybridization should be carefully considered while implementing conservation programs.
22
Johri P, Aquadro CF, Beaumont M, Charlesworth B, Excoffier L, Eyre-Walker A, Keightley PD, Lynch M, McVean G, Payseur BA, Pfeifer SP, Stephan W, Jensen JD. Recommendations for improving statistical inference in population genomics. PLoS Biol 2022; 20:e3001669. [PMID: 35639797 PMCID: PMC9154105 DOI: 10.1371/journal.pbio.3001669] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Indexed: 01/29/2023] Open
Abstract
The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Charles F. Aquadro
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Mark Beaumont
- School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Berne, Berne, Switzerland
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Peter D. Keightley
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Michael Lynch
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Gil McVean
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Bret A. Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Susanne P. Pfeifer
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
23
Tergemina E, Elfarargi AF, Flis P, Fulgione A, Göktay M, Neto C, Scholle M, Flood PJ, Xerri SA, Zicola J, Döring N, Dinis H, Krämer U, Salt DE, Hancock AM. A two-step adaptive walk rewires nutrient transport in a challenging edaphic environment. Sci Adv 2022; 8:eabm9385. [PMID: 35584228 PMCID: PMC9116884 DOI: 10.1126/sciadv.abm9385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Accepted: 04/01/2022] [Indexed: 06/15/2023]
Abstract
Most well-characterized cases of adaptation involve single genetic loci. Theory suggests that multilocus adaptive walks should be common, but these are challenging to identify in natural populations. Here, we combine trait mapping with population genetic modeling to show that a two-step process rewired nutrient homeostasis in a population of Arabidopsis as it colonized the base of an active stratovolcano characterized by extremely low soil manganese (Mn). First, a variant that disrupted the primary iron (Fe) uptake transporter gene (IRT1) swept quickly to fixation in a hard selective sweep, increasing Mn but limiting Fe in the leaves. Second, multiple independent tandem duplications occurred at NRAMP1 and together rose to near fixation in the island population, compensating the loss of IRT1 by improving Fe homeostasis. This study provides a clear case of a multilocus adaptive walk and reveals how genetic variants reshaped a phenotype and spread over space and time.
Affiliation(s)
- Emmanuel Tergemina
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Ahmed F. Elfarargi
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Paulina Flis
- Future Food Beacon of Excellence and the School of Biosciences, University of Nottingham, Sutton Bonington Campus, Nr Loughborough, LE12 5RD Nottingham, UK
| | - Andrea Fulgione
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Mehmet Göktay
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Célia Neto
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Marleen Scholle
- Faculty of Biology and Biotechnology, Ruhr University Bochum, 44801 Bochum, Germany
| | - Pádraic J. Flood
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Sophie-Asako Xerri
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Johan Zicola
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Nina Döring
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| | - Herculano Dinis
- Parque Natural do Fogo, Direção Nacional do Ambiente, 115 Chã d’Areia, Praia, Santiago, Cabo Verde, Africa
- Associação Projecto Vitó, 8234, Xaguate, Cidade de São Filipe, Fogo, Cabo Verde, Africa
| | - Ute Krämer
- Faculty of Biology and Biotechnology, Ruhr University Bochum, 44801 Bochum, Germany
| | - David E. Salt
- Future Food Beacon of Excellence and the School of Biosciences, University of Nottingham, Sutton Bonington Campus, Nr Loughborough, LE12 5RD Nottingham, UK
| | - Angela M. Hancock
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany
| |
|
24
|
Muralidhar P, Veller C. Dominance shifts increase the likelihood of soft selective sweeps. Evolution 2022; 76:966-984. [PMID: 35213740 PMCID: PMC9928167 DOI: 10.1111/evo.14459] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/04/2021] [Accepted: 02/04/2022] [Indexed: 01/21/2023]
Abstract
Genetic models of adaptation to a new environment have typically assumed that the alleles involved maintain a constant fitness dominance across the old and new environments. However, theories of dominance suggest that this should often not be the case. Instead, the alleles involved should frequently shift from recessive deleterious in the old environment to dominant beneficial in the new environment. Here, we study the consequences of these expected dominance shifts for the genetics of adaptation to a new environment. We find that dominance shifts increase the likelihood that adaptation occurs from standing variation, and that multiple alleles from the standing variation are involved (a soft selective sweep). Furthermore, we find that expected dominance shifts increase the haplotypic diversity of selective sweeps, rendering soft sweeps more detectable in small genomic samples. In cases where an environmental change threatens the viability of the population, we show that expected dominance shifts of newly beneficial alleles increase the likelihood of evolutionary rescue and the number of alleles involved. Finally, we apply our results to a well-studied case of adaptation to a new environment: the evolution of pesticide resistance at the Ace locus in Drosophila melanogaster. We show that, under reasonable demographic assumptions, the expected dominance shift of resistant alleles causes soft sweeps to be the most frequent outcome in this case, with the primary source of these soft sweeps being the standing variation at the onset of pesticide use, rather than recurrent mutation thereafter.
Affiliation(s)
- Pavitra Muralidhar
- Center for Population Biology, University of California, Davis, CA 95616; Department of Evolution and Ecology, University of California, Davis, CA 95616 (corresponding author)
| | - Carl Veller
- Center for Population Biology, University of California, Davis, CA 95616; Department of Evolution and Ecology, University of California, Davis, CA 95616
| |
|
25
|
Láruson ÁJ, Fitzpatrick MC, Keller SR, Haller BC, Lotterhos KE. Seeing the Forest for the trees: Assessing genetic offset predictions from Gradient Forest. Evol Appl 2022; 15:403-416. [PMID: 35386401 PMCID: PMC8965365 DOI: 10.1111/eva.13354] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 09/22/2021] [Revised: 01/22/2022] [Accepted: 01/30/2022] [Indexed: 12/02/2022] Open
Abstract
Gradient Forest (GF) is a machine learning algorithm designed to analyze spatial patterns of biodiversity as a function of environmental gradients. An offset measure between the GF‐predicted environmental association of adapted alleles and a new environment (GF Offset) is increasingly being used to predict the loss of environmentally adapted alleles under rapid environmental change, but remains mostly untested for this purpose. Here, we explore the robustness of GF Offset to assumption violations, and its relationship to measures of fitness, using SLiM simulations with explicit genome architecture and a spatial metapopulation. We evaluate measures of GF Offset in: (1) a neutral model with no environmental adaptation; (2) a monogenic “population genetic” model with a single environmentally adapted locus; and (3) a polygenic “quantitative genetic” model with two adaptive traits, each adapting to a different environment. We found GF Offset to be broadly correlated with fitness offsets under both single locus and polygenic architectures. However, neutral demography, genomic architecture, and the nature of the adaptive environment can all confound relationships between GF Offset and fitness. GF Offset is a promising tool, but it is important to understand its limitations and underlying assumptions, especially when used in the context of predicting maladaptation.
Affiliation(s)
- Áki Jarl Láruson
- Department of Natural Resources, Cornell University, Ithaca, NY 14853, USA
| | - Matthew C. Fitzpatrick
- Appalachian Laboratory, University of Maryland Center for Environmental Science, Frostburg, Maryland 21532, USA
| | - Stephen R. Keller
- Department of Plant Biology, University of Vermont, Burlington, Vermont 05405, USA
| | - Benjamin C. Haller
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
| | - Katie E. Lotterhos
- Department of Marine and Environmental Sciences, Northeastern University Marine Science Center, Nahant, MA 01908, USA
| |
|
26
|
Morales-Arce AY, Johri P, Jensen JD. Inferring the distribution of fitness effects in patient-sampled and experimental virus populations: two case studies. Heredity (Edinb) 2022; 128:79-87. [PMID: 34987185 PMCID: PMC8728706 DOI: 10.1038/s41437-021-00493-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2021] [Revised: 12/12/2021] [Accepted: 12/13/2021] [Indexed: 11/19/2022] Open
Abstract
We here propose an analysis pipeline for inferring the distribution of fitness effects (DFE) from either patient-sampled or experimentally-evolved viral populations, that explicitly accounts for non-Wright-Fisher and non-equilibrium population dynamics inherent to pathogens. We examine the performance of this approach via extensive power and performance analyses, and highlight two illustrative applications - one from an experimentally-passaged RNA virus, and the other from a clinically-sampled DNA virus. Finally, we discuss how such DFE inference may shed light on major research questions in virus evolution, ranging from a quantification of the population genetic processes governing genome size, to the role of Hill-Robertson interference in dictating adaptive outcomes, to the potential design of novel therapeutic approaches to eradicate within-patient viral populations via induced mutational meltdown.
Affiliation(s)
- Ana Y. Morales-Arce
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Parul Johri
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Jeffrey D. Jensen
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| |
|
27
|
Abstract
The ability to accurately identify and quantify genetic signatures associated with soft selective sweeps based on patterns of nucleotide variation has remained controversial. We here provide counter viewpoints to recent publications in PLOS Genetics that have argued not only for the statistical identifiability of soft selective sweeps, but also for their pervasive evolutionary role in both Drosophila and HIV populations. We present evidence that these claims owe to a lack of consideration of competing evolutionary models, unjustified interpretations of empirical outliers, as well as to new definitions of the processes themselves. Our results highlight the dangers of fitting evolutionary models based on hypothesized and episodic processes without properly first considering common processes and, more generally, of the tendency in certain research areas to view pervasive positive selection as a foregone conclusion.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
| | | | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
| |
|
28
|
Persoons A, Maupetit A, Louet C, Andrieux A, Lipzen A, Barry KW, Na H, Adam C, Grigoriev IV, Segura V, Duplessis S, Frey P, Halkett F, De Mita S. Genomic signatures of a major adaptive event in the pathogenic fungus Melampsora larici-populina. Genome Biol Evol 2021; 14:6468622. [PMID: 34919678 PMCID: PMC8755504 DOI: 10.1093/gbe/evab279] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 12/06/2021] [Indexed: 11/14/2022] Open
Abstract
The recent availability of genome-wide sequencing techniques has allowed systematic screening for molecular signatures of adaptation, including in nonmodel organisms. Host–pathogen interactions constitute good models due to the strong selective pressures that they entail. We focused on an adaptive event which affected the poplar rust fungus Melampsora larici-populina when it overcame a resistance gene borne by its host, cultivated poplar. Based on 76 virulent and avirulent isolates framing narrowly the estimated date of the adaptive event, we examined the molecular signatures of selection. Using an array of genome scan methods based on different features of nucleotide diversity, we detected a single locus exhibiting a consistent pattern suggestive of a selective sweep in virulent individuals (excess of differentiation between virulent and avirulent samples, linkage disequilibrium, genotype–phenotype statistical association, and long-range haplotypes). Our study pinpoints a single gene and further a single amino acid replacement which may have allowed the adaptive event. Although our samples are nearly contemporary to the selective sweep, it does not seem to have affected genome diversity further than the immediate vicinity of the causal locus, which can be explained by a soft selective sweep (where selection acts on standing variation) and by the impact of recombination in mitigating the impact of selection. Therefore, it seems that properties of the life cycle of M. larici-populina, which entails both high genetic diversity and outbreeding, have facilitated its adaptation.
Affiliation(s)
| | - Agathe Maupetit
- Université de Lorraine, INRAE, IAM, Nancy, France; Physiology and Biotechnology of Algae Laboratory, IFREMER, Nantes, France
| | | | | | - Anna Lipzen
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Kerrie W Barry
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Hyunsoo Na
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Catherine Adam
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Igor V Grigoriev
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA; Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, USA
| | - Vincent Segura
- BioForA, INRAE, ONF, Orléans, France; UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398 Montpellier, France
| | | | - Pascal Frey
- Université de Lorraine, INRAE, IAM, Nancy, France
| | | | - Stéphane De Mita
- Université de Lorraine, INRAE, IAM, Nancy, France; PHIM, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
| |
|
29
|
Laval G, Patin E, Boutillier P, Quintana-Murci L. Sporadic occurrence of recent selective sweeps from standing variation in humans as revealed by an approximate Bayesian computation approach. Genetics 2021; 219:6377789. [PMID: 34849862 DOI: 10.1093/genetics/iyab161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2021] [Accepted: 09/01/2021] [Indexed: 12/14/2022] Open
Abstract
During their dispersals over the last 100,000 years, modern humans have been exposed to a large variety of environments, resulting in genetic adaptation. While genome-wide scans for the footprints of positive Darwinian selection have increased knowledge of genes and functions potentially involved in human local adaptation, they have globally produced evidence of a limited contribution of selective sweeps in humans. Conversely, studies based on machine learning algorithms suggest that recent sweeps from standing variation are widespread in humans, an observation that has been recently questioned. Here, we sought to formally quantify the number of recent selective sweeps in humans, by leveraging approximate Bayesian computation and whole-genome sequence data. Our computer simulations revealed suitable ABC estimations, regardless of the frequency of the selected alleles at the onset of selection and the completion of sweeps. Under a model of recent selection from standing variation, we inferred that an average of 68 (from 56 to 79) and 140 (from 94 to 198) sweeps occurred over the last 100,000 years of human history, in African and Eurasian populations, respectively. The former estimation is compatible with human adaptation rates estimated since divergence with chimps, and reveals numbers of sweeps per generation per site in the range of values estimated in Drosophila. Our results confirm the rarity of selective sweeps in humans and show a low contribution of sweeps from standing variation to recent human adaptation.
Affiliation(s)
- Guillaume Laval
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
| | - Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
| | - Pierre Boutillier
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
| | - Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France; Human Genomics and Evolution, Collège de France, 75005 Paris, France
| |
|
30
|
Santos SHD, Peery RM, Miller JM, Dao A, Lyu FH, Li X, Li MH, Coltman DW. Ancient hybridization patterns between bighorn and thinhorn sheep. Mol Ecol 2021; 30:6273-6288. [PMID: 34845798 DOI: 10.1111/mec.16136] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/29/2020] [Revised: 07/27/2021] [Accepted: 08/18/2021] [Indexed: 12/12/2022]
Abstract
Whole-genome sequencing has advanced the study of species evolution, including the detection of genealogical discordant events such as ancient hybridization and incomplete lineage sorting (ILS). The evolutionary history of bighorn (Ovis canadensis) and thinhorn (Ovis dalli) sheep presents an ideal system to investigate evolutionary discordance due to their recent and rapid radiation and putative secondary contact between bighorn and thinhorn sheep subspecies, specifically the dark pelage Stone sheep (O. dalli stonei) and predominately white Dall sheep (O. dalli dalli), during the last ice age. Here, we used multiple genomes of bighorn and thinhorn sheep, together with snow (O. nivicola) and the domestic sheep (O. aries) as outgroups, to assess their phylogenomic history, potential introgression patterns and their adaptive consequences. Among the Pachyceriforms (snow, bighorn and thinhorn sheep) a consistent monophyletic species tree was retrieved; however, many genealogical discordance patterns were observed. Alternative phylogenies frequently placed Stone and bighorn as sister clades. This relationship occurred more often and was less divergent than that between Dall and bighorn. We also observed many blocks containing introgression signal between Stone and bighorn genomes in which coat colour genes were present. Introgression signals observed between Dall and bighorn were more random and less frequent, and therefore probably due to ILS or intermediary secondary contact. These results strongly suggest that Stone sheep originated from a complex series of events, characterized by multiple, ancient periods of secondary contact with bighorn sheep.
Affiliation(s)
- Sarah H D Santos
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Rhiannon M Peery
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Joshua M Miller
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Anh Dao
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Feng-Hua Lyu
- College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Xin Li
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Chinese Academy of Sciences (CAS), Beijing, China; University of Chinese Academy of Sciences (UCAS), Beijing, China
| | - Meng-Hua Li
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Chinese Academy of Sciences (CAS), Beijing, China
| | - David W Coltman
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| |
|
31
|
Abstract
Patterns of variation and evolution at a given site in a genome can be strongly influenced by the effects of selection at genetically linked sites. In particular, the recombination rates of genomic regions correlate with their amount of within-population genetic variability, the degree to which the frequency distributions of DNA sequence variants differ from their neutral expectations, and the levels of adaptation of their functional components. We review the major population genetic processes that are thought to lead to these patterns, focusing on their effects on patterns of variability: selective sweeps, background selection, associative overdominance, and Hill–Robertson interference among deleterious mutations. We emphasize the difficulties in distinguishing among the footprints of these processes and disentangling them from the effects of purely demographic factors such as population size changes. We also discuss how interactions between selective and demographic processes can significantly affect patterns of variability within genomes.
Affiliation(s)
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
| | - Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona 85281, USA
| |
|
32
|
Stephan W. The classical hitchhiking model with continuous mutational pressure and purifying selection. Ecol Evol 2021; 11:15896-15904. [PMID: 34824798 PMCID: PMC8601925 DOI: 10.1002/ece3.8259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2021] [Revised: 08/24/2021] [Accepted: 10/08/2021] [Indexed: 11/14/2022] Open
Abstract
Detecting selective sweeps driven by strong positive selection and localizing the targets of selection in the genome play a major role in modern population genetics and genomics. Most of these analyses are based on the classical model of genetic hitchhiking proposed by Maynard Smith and Haigh (1974, Genetical Research, 23, 23). Here, we consider extensions of the classical two-locus model. Introducing mutation at the strongly selected site, we analyze the conditions under which soft sweeps may arise. We identify a new parameter (the ratio of the beneficial mutation rate to the selection coefficient) that characterizes the occurrence of multiple-origin soft sweeps. Furthermore, we quantify the hitchhiking effect when the polymorphism at the linked locus is not neutral but maintained in a mutation-selection balance. In this case, we find a smaller relative reduction of heterozygosity at the linked site than for a neutral polymorphism. In our analysis, we use a semi-deterministic approach; i.e., we analyze the frequency process of the beneficial allele in an infinitely large population when its frequency is above a certain threshold; however, for very small frequencies in the initial phase after the onset of selection we rely on diffusion theory.
Affiliation(s)
- Wolfgang Stephan
- Leibniz‐Institute for Evolution and Biodiversity Science, Natural History Museum, Berlin, Germany
| |
|
33
|
Gompert Z, Springer A, Brady M, Chaturvedi S, Lucas LK. Genomic time-series data show that gene flow maintains high genetic diversity despite substantial genetic drift in a butterfly species. Mol Ecol 2021; 30:4991-5008. [PMID: 34379852 DOI: 10.1111/mec.16111] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/23/2021] [Revised: 07/09/2021] [Accepted: 07/19/2021] [Indexed: 11/29/2022]
Abstract
Effective population size affects the efficacy of selection, rate of evolution by drift, and neutral diversity levels. When species are subdivided into multiple populations connected by gene flow, evolutionary processes can depend on global or local effective population sizes. Theory predicts that high levels of diversity might be maintained by gene flow, even very low levels of gene flow, consistent with species long-term effective population size, but tests of this idea are mostly lacking. Here, we show that Lycaeides butterfly populations maintain low contemporary (variance) effective population sizes (e.g., ~200 individuals) and thus evolve rapidly by genetic drift. In contrast, populations harbored high levels of genetic diversity consistent with an effective population size several orders of magnitude larger. We hypothesized that the differences in the magnitude and variability of contemporary versus long-term effective population sizes were caused by gene flow of sufficient magnitude to maintain diversity but only subtly affect evolution on generational time scales. Consistent with this hypothesis, we detected low but non-trivial gene flow among populations. Furthermore, using short-term population-genomic time-series data, we documented patterns consistent with predictions from this hypothesis, including a weak but detectable excess of evolutionary change in the direction of the mean (migrant gene pool) allele frequencies across populations, and consistency in the direction of allele frequency change over time. The documented decoupling of diversity levels and short-term change by drift in Lycaeides has implications for our understanding of contemporary evolution and the maintenance of genetic variation in the wild.
Affiliation(s)
- Zachariah Gompert
- Department of Biology, Utah State University, Logan, UT, 84322, USA; Ecology Center, Utah State University, Logan, UT, 84322, USA
| | - Amy Springer
- Department of Biology, Utah State University, Logan, UT, 84322, USA
| | - Megan Brady
- Department of Biology, Utah State University, Logan, UT, 84322, USA
| | - Samridhi Chaturvedi
- Department of Biology, Utah State University, Logan, UT, 84322, USA; Department of Organismic & Evolutionary Biology, Harvard University, Cambridge, MA, 02138, USA
| | - Lauren K Lucas
- Department of Biology, Utah State University, Logan, UT, 84322, USA
| |
|
34
|
Johri P, Charlesworth B, Howell EK, Lynch M, Jensen JD. Revisiting the Notion of Deleterious Sweeps. Genetics 2021; 219:6298596. [PMID: 34125884 PMCID: PMC9101445 DOI: 10.1093/genetics/iyab094] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/29/2021] [Accepted: 06/08/2021] [Indexed: 11/14/2022] Open
Abstract
It has previously been shown that, conditional on its fixation, the time to fixation of a semi-dominant deleterious autosomal mutation in a randomly mating population is the same as that of an advantageous mutation. This result implies that deleterious mutations could generate selective sweep-like effects. Although their fixation probabilities greatly differ, the much larger input of deleterious relative to beneficial mutations suggests that this phenomenon could be important. We here examine how the fixation of mildly deleterious mutations affects levels and patterns of polymorphism at linked sites - both in the presence and absence of interference amongst deleterious mutations - and how this class of sites may contribute to divergence between populations and species. We find that, while deleterious fixations are unlikely to represent a significant proportion of outliers in polymorphism-based genomic scans within populations, minor shifts in the frequencies of deleterious mutations can influence the proportions of private variants and the value of FST after a recent population split. As sites subject to deleterious mutations are necessarily found in functional genomic regions, interpretations in terms of recurrent positive selection may require reconsideration.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
| | - Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
| | - Emma K Howell
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
| | - Michael Lynch
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States; Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, United States
| | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, United States
| |
|
35
|
Bourgeois YXC, Warren BH. An overview of current population genomics methods for the analysis of whole-genome resequencing data in eukaryotes. Mol Ecol 2021; 30:6036-6071. [PMID: 34009688 DOI: 10.1111/mec.15989] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/30/2020] [Revised: 04/26/2021] [Accepted: 05/11/2021] [Indexed: 01/01/2023]
Abstract
Characterizing the population history of a species and identifying loci underlying local adaptation is crucial in functional ecology, evolutionary biology, conservation and agronomy. The constant improvement of high-throughput sequencing techniques has facilitated the production of whole genome data in a wide range of species. Population genomics now provides tools to better integrate selection into a historical framework, and take into account selection when reconstructing demographic history. However, this improvement has come with a profusion of analytical tools that can confuse and discourage users. Such confusion limits the amount of information effectively retrieved from complex genomic data sets, and impairs the diffusion of the most recent analytical tools into fields such as conservation biology. It may also lead to redundancy among methods. To address these issues, we propose an overview of more than 100 state-of-the-art methods that can deal with whole genome data. We summarize the strategies they use to infer demographic history and selection, and discuss some of their limitations. A website listing these methods is available at www.methodspopgen.com.
Affiliation(s)
| | - Ben H Warren
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, EPHE, UA, CP 51, Paris, France
| |
|
36
|
Clemente F, Unterländer M, Dolgova O, Amorim CEG, Coroado-Santos F, Neuenschwander S, Ganiatsou E, Cruz Dávalos DI, Anchieri L, Michaud F, Winkelbach L, Blöcher J, Arizmendi Cárdenas YO, Sousa da Mota B, Kalliga E, Souleles A, Kontopoulos I, Karamitrou-Mentessidi G, Philaniotou O, Sampson A, Theodorou D, Tsipopoulou M, Akamatis I, Halstead P, Kotsakis K, Urem-Kotsou D, Panagiotopoulos D, Ziota C, Triantaphyllou S, Delaneau O, Jensen JD, Moreno-Mayar JV, Burger J, Sousa VC, Lao O, Malaspinas AS, Papageorgopoulou C. The genomic history of the Aegean palatial civilizations. Cell 2021; 184:2565-2586.e21. [PMID: 33930288 PMCID: PMC8127963 DOI: 10.1016/j.cell.2021.03.039] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 04/17/2020] [Revised: 09/17/2020] [Accepted: 03/18/2021] [Indexed: 12/30/2022]
Abstract
The Cycladic, the Minoan, and the Helladic (Mycenaean) cultures define the Bronze Age (BA) of Greece. Urbanism, complex social structures, craft and agricultural specialization, and the earliest forms of writing characterize this iconic period. We sequenced six Early to Middle BA whole genomes, along with 11 mitochondrial genomes, sampled from the three BA cultures of the Aegean Sea. The Early BA (EBA) genomes are homogeneous and derive most of their ancestry from Neolithic Aegeans, contrary to earlier hypotheses that the Neolithic-EBA cultural transition was due to massive population turnover. EBA Aegeans were shaped by relatively small-scale migration from East of the Aegean, as evidenced by the Caucasus-related ancestry also detected in Anatolians. In contrast, Middle BA (MBA) individuals of northern Greece differ from EBA populations in showing ∼50% Pontic-Caspian Steppe-related ancestry, dated at ca. 2,600-2,000 BCE. Such gene flow events during the MBA contributed toward shaping present-day Greek genomes.
Affiliation(s)
- Florian Clemente
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Martina Unterländer
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece; Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Olga Dolgova
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028 Barcelona, Spain
| | - Carlos Eduardo G Amorim
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Francisco Coroado-Santos
- CE3C, Centre for Ecology, Evolution and Environmental Changes, Faculty of Sciences of the University of Lisbon, 1749-016 Lisbon, Portugal
| | - Samuel Neuenschwander
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Vital-IT, Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Elissavet Ganiatsou
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Diana I Cruz Dávalos
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Lucas Anchieri
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Frédéric Michaud
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Laura Winkelbach
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Jens Blöcher
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Yami Ommar Arizmendi Cárdenas
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Bárbara Sousa da Mota
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Eleni Kalliga
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Angelos Souleles
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Ioannis Kontopoulos
- Center for GeoGenetics, GLOBE Institute, University of Copenhagen, 1350 Copenhagen, Denmark
| | | | - Olga Philaniotou
- Ephor Emerita of Antiquities, Hellenic Ministry of Culture and Sports, 10682 Athens, Greece
| | - Adamantios Sampson
- Department of Mediterranean Studies, University of the Aegean, 85132 Rhodes, Greece
| | - Dimitra Theodorou
- Ephorate of Antiquities of Kozani, Hellenic Ministry of Culture and Sports, 50004 Kozani, Greece
| | - Metaxia Tsipopoulou
- Ephor Emerita of Antiquities, Hellenic Ministry of Culture and Sports, 10682 Athens, Greece
| | - Ioannis Akamatis
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Paul Halstead
- Department of Archaeology, University of Sheffield, Minalloy House, 10-16 Regent St., Sheffield S1 3NJ, UK
| | - Kostas Kotsakis
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Dushka Urem-Kotsou
- Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Diamantis Panagiotopoulos
- Institute of Classical Archaeology, University of Heidelberg, Marstallhof 4, 69117 Heidelberg, Germany
| | - Christina Ziota
- Ephorate of Antiquities of Florina, Hellenic Ministry of Culture and Sports, 53100 Florina, Greece
| | - Sevasti Triantaphyllou
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
| | - J Víctor Moreno-Mayar
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Center for GeoGenetics, GLOBE Institute, University of Copenhagen, 1350 Copenhagen, Denmark; National Institute of Genomic Medicine (INMEGEN), 14610 Mexico City, Mexico
| | - Joachim Burger
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Vitor C Sousa
- CE3C, Centre for Ecology, Evolution and Environmental Changes, Faculty of Sciences of the University of Lisbon, 1749-016 Lisbon, Portugal
| | - Oscar Lao
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Anna-Sapfo Malaspinas
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
| | - Christina Papageorgopoulou
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece.
| |
Collapse
|
37
|
Abstract
Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make their inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data.
Affiliation(s)
- Alexandre M Harris
- Department of Biology, Pennsylvania State University, University Park, PA; Molecular, Cellular, and Integrative Biosciences, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
38
Elhaik E, Graur D. On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn't. Genes (Basel) 2021; 12:genes12040527. [PMID: 33916341 PMCID: PMC8066263 DOI: 10.3390/genes12040527] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2020] [Revised: 03/22/2021] [Accepted: 03/29/2021] [Indexed: 12/12/2022] Open
Abstract
In the last 15 years or so, soft selective sweep mechanisms have been catapulted from a curiosity of little evolutionary importance to a ubiquitous mechanism claimed to explain most adaptive evolution and, in some cases, most evolution. This transformation was aided by a series of articles by Daniel Schrider and Andrew Kern. Within this series, a paper entitled “Soft sweeps are the dominant mode of adaptation in the human genome” (Schrider and Kern, Mol. Biol. Evolut. 2017, 34(8), 1863–1877) attracted a great deal of attention, in particular in conjunction with another paper (Kern and Hahn, Mol. Biol. Evolut. 2018, 35(6), 1366–1371), for purporting to discredit the Neutral Theory of Molecular Evolution (Kimura 1968). Here, we address an alleged novelty in Schrider and Kern’s paper, i.e., the claim that their study involved an artificial intelligence technique called supervised machine learning (SML). SML is predicated upon the existence of a training dataset in which the correspondence between the input and output is known empirically to be true. Curiously, Schrider and Kern did not possess a training dataset of genomic segments known a priori to have evolved either neutrally or through soft or hard selective sweeps. Thus, their claim of using SML is thoroughly and utterly misleading. In the absence of legitimate training datasets, Schrider and Kern used: (1) simulations that employ many manipulatable variables and (2) a system of data cherry-picking rivaling the worst excesses in the literature. These two factors, in addition to the lack of negative controls and the irreproducibility of their results due to incomplete methodological detail, lead us to conclude that all evolutionary inferences derived from so-called SML algorithms (e.g., S/HIC) should be taken with a huge shovel of salt.
Affiliation(s)
- Eran Elhaik
- Department of Biology, Lund University, Sölvegatan 35, 22362 Lund, Sweden
- Dan Graur
- Department of Biology & Biochemistry, University of Houston, Science & Research Building 2, Suite #342, 3455 Cullen Blvd., Houston, TX 77204-5001, USA
39
Garud NR, Messer PW, Petrov DA. Detection of hard and soft selective sweeps from Drosophila melanogaster population genomic data. PLoS Genet 2021; 17:e1009373. [PMID: 33635910 PMCID: PMC7946363 DOI: 10.1371/journal.pgen.1009373] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/01/2020] [Revised: 03/10/2021] [Accepted: 01/17/2021] [Indexed: 12/12/2022] Open
Abstract
Whether hard sweeps or soft sweeps dominate adaptation has been a matter of much debate. Recently, we developed haplotype homozygosity statistics that (i) can detect both hard and soft sweeps with similar power and (ii) can classify the detected sweeps as hard or soft. The application of our method to population genomic data from a natural population of Drosophila melanogaster (DGRP) allowed us to rediscover three known cases of adaptation at the loci Ace, Cyp6g1, and CHKov1 known to be driven by soft sweeps, and detected additional candidate loci for recent and strong sweeps. Surprisingly, all of the top 50 candidates showed patterns much more consistent with soft rather than hard sweeps. Recently, Harris et al. 2018 criticized this work, suggesting that all the candidate loci detected by our haplotype statistics, including the positive controls, are unlikely to be sweeps at all and that instead these haplotype patterns can be more easily explained by complex neutral demographic models. They also claim that these neutral non-sweeps are likely to be hard instead of soft sweeps. Here, we reanalyze the DGRP data using a range of complex admixture demographic models and reconfirm our original published results suggesting that the majority of recent and strong sweeps in D. melanogaster are first likely to be true sweeps, and second, that they do appear to be soft. Furthermore, we discuss ways to take this work forward given that most demographic models employed in such analyses are necessarily too simple to capture the full demographic complexity, while more realistic models are unlikely to be inferred correctly because they require a large number of free parameters.
Affiliation(s)
- Nandita R. Garud
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, University of California, Los Angeles, California, United States of America
- Philipp W. Messer
- Department of Computational Biology, Cornell University, Ithaca, New York, United States of America
- Dmitri A. Petrov
- Department of Biology, Stanford University, Stanford, California, United States of America
40
Abstract
HIV can evolve remarkably quickly in response to antiretroviral therapies and the immune system. This evolution stymies treatment effectiveness and prevents the development of an HIV vaccine. Consequently, there has been a great interest in using population genetics to disentangle the forces that govern the HIV adaptive landscape (selection, drift, mutation, and recombination). Traditional population genetics approaches look at the current state of genetic variation and infer the processes that can generate it. However, because HIV evolves rapidly, we can also sample populations repeatedly over time and watch evolution in action. In this paper, we demonstrate how time series data can bound evolutionary parameters in a way that complements and informs traditional population genetic approaches. Specifically, we focus on our recent paper (Feder et al., 2016, eLife), in which we show that, as improved HIV drugs have led to fewer patients failing therapy due to resistance evolution, less genetic diversity has been maintained following the fixation of drug resistance mutations. Because soft sweeps of multiple drug resistance mutations spreading simultaneously have been previously documented in response to the less effective HIV therapies used early in the epidemic, we interpret the maintenance of post-sweep diversity in response to poor therapies as further evidence of soft sweeps and therefore a high population mutation rate (θ) in these intra-patient HIV populations. Because improved drugs resulted in rarer resistance evolution accompanied by lower post-sweep diversity, we suggest that both observations can be explained by decreased population mutation rates and a resultant transition to hard selective sweeps. 
A recent paper (Harris et al., 2018, PLOS Genetics) proposed an alternative interpretation: Diversity maintenance following drug resistance evolution in response to poor therapies may have been driven by recombination during slow, hard selective sweeps of single mutations. Then, if better drugs have led to faster hard selective sweeps of resistance, recombination will have less time to rescue diversity during the sweep, recapitulating the decrease in post-sweep diversity as drugs have improved. In this paper, we use time series data to show that drug resistance evolution during ineffective treatment is very fast, providing new evidence that soft sweeps drove early HIV treatment failure.
Affiliation(s)
- Alison F. Feder
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Pleuni S. Pennings
- Department of Biology, San Francisco State University, San Francisco, California, United States of America
- Dmitri A. Petrov
- Department of Biology, Stanford University, Stanford, California, United States of America
41
Ehrlich MA, Wagner DN, Oleksiak MF, Crawford DL. Polygenic Selection within a Single Generation Leads to Subtle Divergence among Ecological Niches. Genome Biol Evol 2020; 13:6031913. [PMID: 33313716 PMCID: PMC7875003 DOI: 10.1093/gbe/evaa257] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/01/2020] [Revised: 09/09/2020] [Accepted: 12/09/2020] [Indexed: 11/23/2022] Open
Abstract
Selection on standing genetic variation may be effective enough to allow for adaptation to distinct niche environments within a single generation. Minor allele frequency changes at multiple, redundant loci of small effect can produce remarkable phenotypic shifts. Yet, demonstrating rapid adaptation via polygenic selection in the wild remains challenging. Here we harness natural replicate populations that experience similar selection pressures and harbor high within-, yet negligible among-population genetic variation. Such populations can be found among the teleost Fundulus heteroclitus that inhabits marine estuaries characterized by high environmental heterogeneity. We identify 10,861 single nucleotide polymorphisms in F. heteroclitus that belong to a single, panmictic population yet reside in environmentally distinct niches (one coastal basin and three replicate tidal ponds). By sampling at two time points within a single generation, we quantify both allele frequency change within as well as spatial divergence among niche subpopulations. We observe few individually significant allele frequency changes yet find that the “number” of moderate changes exceeds the neutral expectation by 10–100%. We find allele frequency changes to be significantly concordant in both direction and magnitude among all niche subpopulations, suggestive of parallel selection. In addition, within-generation allele frequency changes generate subtle but significant divergence among niches, indicative of local adaptation. Although we cannot distinguish between selection and genotype-dependent migration as drivers of within-generation allele frequency changes, the trait/s determining fitness and/or migration likelihood appear to be polygenic. In heterogeneous environments, polygenic selection and polygenic, genotype-dependent migration offer conceivable mechanisms for within-generation, local adaptation to distinct niches.
Affiliation(s)
- Moritz A Ehrlich
- Marine Biology and Ecology, Rosenstiel School of Marine and Atmospheric Science, University of Miami, FL, USA
- Dominique N Wagner
- Marine Biology and Ecology, Rosenstiel School of Marine and Atmospheric Science, University of Miami, FL, USA
- Marjorie F Oleksiak
- Marine Biology and Ecology, Rosenstiel School of Marine and Atmospheric Science, University of Miami, FL, USA
- Douglas L Crawford
- Marine Biology and Ecology, Rosenstiel School of Marine and Atmospheric Science, University of Miami, FL, USA
42
Morales-Arce AY, Sabin SJ, Stone AC, Jensen JD. The population genomics of within-host Mycobacterium tuberculosis. Heredity (Edinb) 2020; 126:1-9. [PMID: 33060846 DOI: 10.1038/s41437-020-00377-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/29/2020] [Revised: 10/02/2020] [Accepted: 10/03/2020] [Indexed: 11/09/2022] Open
Abstract
Recent progress in genomic sequencing from patient samples has allowed for the first detailed insight into the within-host genetic diversity of Mycobacterium tuberculosis (M.TB), revealing remarkably low levels of variation. While this has often been attributed to low mutation rates, other factors have been described, including resistance evolution (i.e., selective sweeps), widespread purifying and background selection, and, more recently, progeny skew. Here we review recent findings pertaining to the processes governing the evolutionary dynamics of M.TB, discuss their implications for improving our understanding of this important human pathogen, and make recommendations for future work. Significantly, this emerging evolutionary framework involving the joint estimation of demographic, selective, and reproductive processes is forming a new paradigm for the study of within-host pathogen evolution that will be widely applicable across organisms.
Affiliation(s)
- Ana Y Morales-Arce
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- Susanna J Sabin
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- Anne C Stone
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA; School of Human Evolution and Social Change, Arizona State University, Tempe, AZ, USA
- Jeffrey D Jensen
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA; School of Life Sciences, Arizona State University, Tempe, AZ, USA
43
Abstract
It is increasingly evident that natural selection plays a prominent role in shaping patterns of diversity across the genome. The most commonly studied modes of natural selection are positive selection and negative selection, which refer to directional selection for and against derived mutations, respectively. Positive selection can result in hitchhiking events, in which a beneficial allele rapidly replaces all others in the population, creating a valley of diversity around the selected site along with characteristic skews in allele frequencies and linkage disequilibrium among linked neutral polymorphisms. Similarly, negative selection reduces variation not only at selected sites but also at linked sites, a phenomenon called background selection (BGS). Thus, discriminating between these two forces may be difficult, and one might expect efforts to detect hitchhiking to produce an excess of false positives in regions affected by BGS. Here, we examine the similarity between BGS and hitchhiking models via simulation. First, we show that BGS may somewhat resemble hitchhiking in simplistic scenarios in which a region constrained by negative selection is flanked by large stretches of unconstrained sites, echoing previous results. However, this scenario does not mirror the actual spatial arrangement of selected sites across the genome. By performing forward simulations under more realistic scenarios of BGS, modeling the locations of protein-coding and conserved noncoding DNA in real genomes, we show that the spatial patterns of variation produced by BGS rarely mimic those of hitchhiking events. Indeed, BGS is not substantially more likely than neutrality to produce false signatures of hitchhiking. This holds for simulations modeled after both humans and Drosophila, and for several different demographic histories. These results demonstrate that appropriately designed scans for hitchhiking need not consider BGS's impact on false-positive rates. 
However, we do find evidence that BGS increases the false-negative rate for hitchhiking, an observation that demands further investigation.
Affiliation(s)
- Daniel R Schrider
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27514
44
Mughal MR, Koch H, Huang J, Chiaromonte F, DeGiorgio M. Learning the properties of adaptive regions with functional data analysis. PLoS Genet 2020; 16:e1008896. [PMID: 32853200 PMCID: PMC7480868 DOI: 10.1371/journal.pgen.1008896] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/13/2019] [Revised: 09/09/2020] [Accepted: 05/29/2020] [Indexed: 12/12/2022] Open
Abstract
Identifying regions of positive selection in genomic data remains a challenge in population genetics. Most current approaches rely on comparing values of summary statistics calculated in windows. We present an approach termed SURFDAWave, which translates measures of genetic diversity calculated in genomic windows to functional data. By transforming our discrete data points to be outputs of continuous functions defined over genomic space, we are able to learn the features of these functions that signify selection. This enables us to confidently identify complex modes of natural selection, including adaptive introgression. We are also able to predict important selection parameters that are responsible for shaping the inferred selection events. By applying our model to human population-genomic data, we recapitulate previously identified regions of selective sweeps, such as OCA2 in Europeans, and predict that its beneficial mutation reached a frequency of 0.02 before it swept 1,802 generations ago, a time when humans were relatively new to Europe. In addition, we identify BNC2 in Europeans as a target of adaptive introgression, and predict that it harbors a beneficial mutation that arose in an archaic human population that split from modern humans within the hypothesized modern human-Neanderthal divergence range.
Affiliation(s)
- Mehreen R. Mughal
- Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Hillary Koch
- Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Jinguo Huang
- Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Francesca Chiaromonte
- Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
45
Abstract
One of the most useful models in population genetics is that of a selective sweep and the consequent hitch-hiking of linked neutral alleles. While variations on this model typically assume constant population size, many instances of strong selection and rapid adaptation in nature may co-occur with complex demography. Here, we extend the hitch-hiking model to evolutionary rescue, where adaptation and demography not only co-occur but are intimately entwined. Our results show how this feedback between demography and evolution determines, and restricts, the genetic signatures of evolutionary rescue, and how these differ from the signatures of sweeps in populations of constant size. In particular, we find rescue to harden sweeps from standing variance or new mutation (but not from migration), reduce genetic diversity both at the selected site and genome-wide, and increase the range of observed Tajima's D values. For a given initial rate of population decline, the feedback between demography and evolution makes all of these differences more dramatic under weaker selection, where bottlenecks are prolonged. Nevertheless, it is likely difficult to infer the co-incident timing of the sweep and bottleneck from these simple signatures, never mind a feedback between them. Temporal samples spanning contemporary rescue events may offer one way forward.
Affiliation(s)
- Matthew M Osmond
- Center for Population Biology and Department of Evolution and Ecology, University of California, Davis, California 95616
- Graham Coop
- Center for Population Biology and Department of Evolution and Ecology, University of California, Davis, California 95616
46
Abstract
First inspired by the seminal work of Lewontin and Krakauer (1973. Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics 74(1):175-195.) and Maynard Smith and Haigh (1974. The hitch-hiking effect of a favourable gene. Genet Res. 23(1):23-35.), genomic scans for positive selection remain a widely utilized tool in modern population genomic analysis. Yet, the relative frequency and genomic impact of selective sweeps have remained a contentious point in the field for decades, largely owing to an inability to accurately identify their presence and quantify their effects, with current methodologies generally being characterized by low true-positive rates and/or high false-positive rates under many realistic demographic models. Most of these approaches are based on Wright-Fisher assumptions and the Kingman coalescent and generally rely on detecting outlier regions which do not conform to these neutral expectations. However, previous theoretical results have demonstrated that selective sweeps are well characterized by an alternative class of model known as the multiple-merger coalescent. Taken together, this suggests the possibility of not simply identifying regions which reject the Kingman, but rather explicitly testing the relative fit of a genomic window to the multiple-merger coalescent. We describe the advantages of such an approach, which owe to the branching structure differentiating selective and neutral models, and demonstrate improved power under certain demographic scenarios relative to a commonly used approach. However, regions of the demographic parameter space continue to exist in which neither this approach nor existing methodologies have sufficient power to detect selective sweeps.
47
Abstract
Over the past few years several methodological and data-driven advances have greatly improved our ability to robustly detect genomic signatures of selection in humans. New methods applied to large samples of present-day genomes provide increased power, while ancient DNA allows precise estimation of timing and tempo. However, despite these advances, we are still limited in our ability to translate these signatures into understanding about which traits were actually under selection, and why. Combining information from different populations and timescales may allow interpretation of selective sweeps. Other modes of selection have proved more difficult to detect. In particular, despite strong evidence of the polygenicity of most human traits, evidence for polygenic selection is weak, and its importance in recent human evolution remains unclear. Balancing selection and archaic introgression seem important for the maintenance of potentially adaptive immune diversity, but perhaps less so for other traits.
Affiliation(s)
- Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, United States
48
Johri P, Charlesworth B, Jensen JD. Toward an Evolutionarily Appropriate Null Model: Jointly Inferring Demography and Purifying Selection. Genetics 2020; 215:173-192. [PMID: 32152045 PMCID: PMC7198275 DOI: 10.1534/genetics.119.303002] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 12/18/2019] [Accepted: 03/05/2020] [Indexed: 01/27/2023] Open
Abstract
The question of the relative evolutionary roles of adaptive and nonadaptive processes has been a central debate in population genetics for nearly a century. While advances have been made in the theoretical development of the underlying models, and statistical methods for estimating their parameters from large-scale genomic data, a framework for an appropriate null model remains elusive. A model incorporating evolutionary processes known to be in constant operation, genetic drift (as modulated by the demographic history of the population) and purifying selection, is lacking. Without such a null model, the role of adaptive processes in shaping within- and between-population variation may not be accurately assessed. Here, we investigate how population size changes and the strength of purifying selection affect patterns of variation at "neutral" sites near functional genomic components. We propose a novel statistical framework for jointly inferring the contribution of the relevant selective and demographic parameters. By means of extensive performance analyses, we quantify the utility of the approach, identify the most important statistics for parameter estimation, and compare the results with existing methods. Finally, we reanalyze genome-wide population-level data from a Zambian population of Drosophila melanogaster, and find that it has experienced a much slower rate of population growth than was inferred when the effects of purifying selection were neglected. Our approach represents an appropriate null model, against which the effects of positive selection can be assessed.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona 85287
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona 85287
|
49
|
Abstract
A major research goal in evolutionary genetics is to uncover loci experiencing positive selection. One approach involves finding 'selective sweeps' patterns, which can either be 'hard sweeps' formed by de novo mutation, or 'soft sweeps' arising from recurrent mutation or existing standing variation. Existing theory generally assumes outcrossing populations, and it is unclear how dominance affects soft sweeps. We consider how arbitrary dominance and inbreeding via self-fertilization affect hard and soft sweep signatures. With increased self-fertilization, they are maintained over longer map distances due to reduced effective recombination and faster beneficial allele fixation times. Dominance can affect sweep patterns in outcrossers if the derived variant originates from either a single novel allele, or from recurrent mutation. These models highlight the challenges in distinguishing hard and soft sweeps, and propose methods to differentiate between scenarios.
Affiliation(s)
- Matthew Hartfield
- Department of Ecology and Evolutionary Biology, University of Toronto, Ontario M5S 3B2, Canada
- Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
- Institute of Evolutionary Biology, The University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
|
50
|
Apata M, Pfeifer SP. Recent population genomic insights into the genetic basis of arsenic tolerance in humans: the difficulties of identifying positively selected loci in strongly bottlenecked populations. Heredity (Edinb) 2020; 124:253-262. [PMID: 31776483 PMCID: PMC6972707 DOI: 10.1038/s41437-019-0285-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/06/2019] [Revised: 10/22/2019] [Accepted: 11/13/2019] [Indexed: 02/06/2023] Open
Abstract
Recent advances in genomics have enabled researchers to shed light on the evolutionary processes driving human adaptation, by revealing the genetic architectures underlying traits ranging from lactase persistence, to skin pigmentation, to hypoxic response, to arsenic tolerance. Complicating the identification of targets of positive selection in modern human populations is their complex demographic history, characterized by population bottlenecks and expansions, population structure, migration, and admixture. In particular, founder effects and recent strong population size reductions, such as those experienced by the indigenous peoples of the Americas, have severe impacts on genetic variation that can lead to the accumulation of large allele frequency differences between populations due to genetic drift rather than natural selection. While distinguishing the effects of demographic history from selection remains challenging, neglecting neutral processes can lead to the incorrect identification of candidate loci. Here we review the recent population genomic insights into the genetic basis of arsenic tolerance in Andean populations, and use this example to highlight both the difficulties of identifying local adaptations in strongly bottlenecked populations and the importance of controlling for demographic history in selection scans.
Affiliation(s)
- Mario Apata
- Center for Evolution & Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, 85821, USA
- Susanne P Pfeifer
- Center for Evolution & Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, 85821, USA.
|