1
Amin MR, Hasan M, DeGiorgio M. Digital Image Processing to Detect Adaptive Evolution. Mol Biol Evol 2024; 41:msae242. [PMID: 39565932 PMCID: PMC11631197 DOI: 10.1093/molbev/msae242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 10/28/2024] [Accepted: 11/13/2024] [Indexed: 11/22/2024] Open
Abstract
In recent years, advances in image processing and machine learning have fueled a paradigm shift in detecting genomic regions under natural selection. Early machine learning techniques employed population-genetic summary statistics as features, which focus on specific genomic patterns expected by adaptive and neutral processes. Though such engineered features are important when training data are limited, the ease with which simulated data can now be generated has led to the recent development of approaches that take in image representations of haplotype alignments and automatically extract important features using convolutional neural networks. Digital image processing methods termed α-molecules are a class of techniques for multiscale representation of objects that can extract a diverse set of features from images. One such α-molecule method, termed wavelet decomposition, lends greater control over high-frequency components of images. Another α-molecule method, termed curvelet decomposition, is an extension of the wavelet concept that considers events occurring along curves within images. We show that application of these α-molecule techniques to extract features from image representations of haplotype alignments yields a high true positive rate and accuracy in detecting hard and soft selective sweep signatures from genomic data with both linear and nonlinear machine learning classifiers. Moreover, we find that such models are easy to visualize and interpret, with performance rivaling that of contemporary deep learning approaches for detecting sweeps.
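To make the wavelet idea concrete, the following is a minimal sketch of a single-level 2D Haar decomposition applied to a toy haplotype alignment. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which wavelet family, how many levels, or which image construction was used, and the names `haar_1d`, `haar_2d`, and `alignment` are hypothetical.

```python
def haar_1d(values):
    """One level of the Haar transform: pairwise averages, then pairwise half-differences."""
    if len(values) % 2 != 0:
        raise ValueError("length must be even")
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    approx = [(a + b) / 2 for a, b in pairs]   # low-frequency part
    detail = [(a - b) / 2 for a, b in pairs]   # high-frequency part
    return approx + detail


def haar_2d(matrix):
    """Apply haar_1d to every row, then to every column of the result."""
    row_pass = [haar_1d(list(row)) for row in matrix]
    col_pass = [haar_1d(list(col)) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]  # transpose back


# Toy 4x4 alignment (invented for illustration): rows are haplotypes,
# columns are sites, 1 marks the derived allele.
alignment = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
coefficients = haar_2d(alignment)
```

In the result, the top-left quadrant holds the 2x2 block averages (the coarse image), and the other three quadrants hold detail coefficients along sites, along haplotypes, and diagonally. Coefficients of this kind, flattened into a vector, are the sort of features that could be passed to a linear or nonlinear classifier.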
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA

2
Carvajal-Rodríguez A. iHDSel software: The price equation and the population stability index to detect genomic patterns compatible with selective sweeps. An example with SARS-CoV-2. Biol Methods Protoc 2024; 9:bpae089. [PMID: 39679303 PMCID: PMC11646571 DOI: 10.1093/biomethods/bpae089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2024] [Revised: 11/19/2024] [Accepted: 11/25/2024] [Indexed: 12/17/2024] Open
Abstract
A large number of methods have been developed and continue to evolve for detecting the signatures of selective sweeps in genomes. Significant advances have been made, including the combination of different statistical strategies and the incorporation of artificial intelligence (machine learning) methods. Despite these advances, several common problems persist, such as the unknown null distribution of the statistics used, necessitating simulations and resampling to assign significance to the statistics. Additionally, it is not always clear how deviations from the specific assumptions of each method might affect the results. In this work, allelic classes of haplotypes are used along with the informational interpretation of the Price equation to design a statistic with a known distribution that can detect genomic patterns caused by selective sweeps. The statistic consists of Jeffreys divergence, also known as the population stability index, applied to the distribution of allelic classes of haplotypes in two samples. Results with simulated data show optimal performance of the statistic in detecting divergent selection. Analysis of real severe acute respiratory syndrome coronavirus 2 genome data also shows that some of the sites playing key roles in the virus's fitness and immune escape capability are detected by the method. The new statistic, called JHAC, is incorporated into the iHDSel (informed HacDivSel) software available at https://acraaj.webs.uvigo.es/iHDSel.html.
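As a rough illustration of the quantity named in the abstract, Jeffreys divergence between two discrete distributions P and Q is the symmetrised Kullback-Leibler divergence, J(P, Q) = Σ (p_i − q_i) ln(p_i / q_i). The sketch below computes only this generic divergence from two vectors of counts. It does not reproduce how iHDSel builds allelic classes or assesses significance; the function name, the `eps` floor for empty classes, and the example counts are assumptions made for illustration.

```python
import math


def jeffreys_divergence(counts_a, counts_b, eps=1e-12):
    """Jeffreys divergence (population stability index) between two count vectors.

    Counts are normalised to frequencies; `eps` is an assumed floor that keeps
    the logarithm finite when a class is empty in one sample.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both samples must use the same classes")
    total_a, total_b = sum(counts_a), sum(counts_b)
    divergence = 0.0
    for a, b in zip(counts_a, counts_b):
        p = max(a / total_a, eps)
        q = max(b / total_b, eps)
        divergence += (p - q) * math.log(p / q)
    return divergence


# Hypothetical counts of haplotypes per allelic class in two samples.
sample_1 = [50, 50]
sample_2 = [90, 10]
j_value = jeffreys_divergence(sample_1, sample_2)
```

The divergence is zero when the two class distributions are identical, is symmetric in its arguments, and grows as the distributions separate.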
Affiliation(s)
- Antonio Carvajal-Rodríguez
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, Vigo, 36310 Spain

3
Soni V, Terbot JW, Versoza CJ, Pfeifer SP, Jensen JD. A whole-genome scan for evidence of recent positive and balancing selection in aye-ayes (Daubentonia madagascariensis) utilizing a well-fit evolutionary baseline model. bioRxiv: the preprint server for biology 2024:2024.11.08.622667. [PMID: 39605496 PMCID: PMC11601216 DOI: 10.1101/2024.11.08.622667] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 11/29/2024]
Abstract
The aye-aye (Daubentonia madagascariensis) is one of the 25 most endangered primate species in the world, maintaining amongst the lowest genetic diversity of any primate measured to date. Characterizing patterns of genetic variation within aye-aye populations, and the relative influences of neutral and selective processes in shaping that variation, is thus important for future conservation efforts. In this study, we performed the first whole-genome scans for recent positive and balancing selection in the species, utilizing high-coverage population genomic data from newly sequenced individuals. We generated null thresholds for our genomic scans by creating an evolutionarily appropriate baseline model that incorporates the demographic history of this aye-aye population, and identified a small number of candidate genes. Most notably, a suite of genes involved in olfaction - a key trait in these nocturnal primates - were identified as experiencing long-term balancing selection. We also conducted analyses to quantify the expected statistical power to detect positive and balancing selection in this population using site frequency spectrum-based inference methods, once accounting for the potentially confounding contributions of population history, recombination and mutation rate variation, and purifying and background selection. This work, presenting the first high-quality, genome-wide polymorphism data across the functional regions of the aye-aye genome, thus provides important insights into the landscape of episodic selective forces in this highly endangered species.
Affiliation(s)
- Vivak Soni
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- John W. Terbot
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Cyril J. Versoza
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Susanne P. Pfeifer
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Jeffrey D. Jensen
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA

4
Gouy A, Wang X, Kapopoulou A, Neuenschwander S, Schmid E, Excoffier L, Heckel G. Genomes of Microtus Rodents Highlight the Importance of Olfactory and Immune Systems in Their Fast Radiation. Genome Biol Evol 2024; 16:evae233. [PMID: 39445808 PMCID: PMC11579656 DOI: 10.1093/gbe/evae233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 10/02/2024] [Accepted: 10/07/2024] [Indexed: 10/25/2024] Open
Abstract
The characterization of genes and biological functions underlying functional diversification and the formation of species is a major goal of evolutionary biology. In this study, we investigated the fast radiation of Microtus voles, one of the most speciose groups of mammals, which shows strong genetic divergence despite few readily observable morphological differences. We produced an annotated reference genome for the common vole, Microtus arvalis, and resequenced the genomes of 10 different species and evolutionary lineages spanning the Microtus speciation continuum. Our full-genome sequences illustrate the recent and fast diversification of this group, and we identified genes in highly divergent genomic windows that likely have particular roles in their radiation. We found three biological functions enriched for highly divergent genes in most Microtus species and lineages: olfaction, immunity and metabolism. In particular, olfaction-related genes (mostly olfactory receptors and vomeronasal receptors) are fast evolving in all Microtus species, indicating the exceptional importance of the olfactory system in the evolution of these rodents. Of note is, for example, the shared signature among vole species on Olfr1019, which has been associated with fear responses against predator odors in rodents. Our analyses provide a genome-wide basis for the further characterization of the ecological factors and processes of natural and sexual selection that have contributed to the fast radiation of Microtus voles.
Affiliation(s)
- Alexandre Gouy
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Xuejing Wang
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Adamandia Kapopoulou
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Emanuel Schmid
- Vital-IT, Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Gerald Heckel
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland

5
Marsh JI, Johri P. Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection. Mol Biol Evol 2024; 41:msae118. [PMID: 38874402 PMCID: PMC11245712 DOI: 10.1093/molbev/msae118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/05/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024] Open
Abstract
Inferring the demographic history of populations provides fundamental insights into species dynamics and is essential for developing a null model to accurately study selective processes. However, background selection and selective sweeps can produce genomic signatures at linked sites that mimic or mask signals associated with historical population size change. While the theoretical biases introduced by the linked effects of selection have been well established, it is unclear whether ancestral recombination graph (ARG)-based approaches to demographic inference in typical empirical analyses are susceptible to misinference due to these effects. To address this, we developed highly realistic forward simulations of human and Drosophila melanogaster populations, including empirically estimated variability of gene density, mutation rates, recombination rates, purifying, and positive selection, across different historical demographic scenarios, to broadly assess the impact of selection on demographic inference using a genealogy-based approach. Our results indicate that the linked effects of selection minimally impact demographic inference for human populations, although it could cause misinference in populations with similar genome architecture and population parameters experiencing more frequent recurrent sweeps. We found that accurate demographic inference of D. melanogaster populations by ARG-based methods is compromised by the presence of pervasive background selection alone, leading to spurious inferences of recent population expansion, which may be further worsened by recurrent sweeps, depending on the proportion and strength of beneficial mutations. Caution and additional testing with species-specific simulations are needed when inferring population history with non-human populations using ARG-based approaches to avoid misinference due to the linked effects of selection.
Affiliation(s)
- Jacob I Marsh
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Parul Johri
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC 27599, USA

6
Soni V, Jensen JD. Temporal challenges in detecting balancing selection from population genomic data. G3 (Bethesda, Md.) 2024; 14:jkae069. [PMID: 38551137 DOI: 10.1093/g3journal/jkae069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 12/21/2023] [Accepted: 03/19/2024] [Indexed: 04/28/2024]
Abstract
The role of balancing selection in maintaining genetic variation remains an open question in population genetics. Recent years have seen numerous studies identifying candidate loci potentially experiencing balancing selection, most predominantly in human populations. There are however numerous alternative evolutionary processes that may leave similar patterns of variation, thereby potentially confounding inference, and the expected signatures of balancing selection additionally change in a temporal fashion. Here we use forward-in-time simulations to quantify expected statistical power to detect balancing selection using both site frequency spectrum- and linkage disequilibrium-based methods under a variety of evolutionarily realistic null models. We find that whilst site frequency spectrum-based methods have little power immediately after a balanced mutation begins segregating, power increases with time since the introduction of the balanced allele. Conversely, linkage disequilibrium-based methods have considerable power whilst the allele is young, and power dissipates rapidly as the time since introduction increases. Taken together, this suggests that site frequency spectrum-based methods are most effective at detecting long-term balancing selection (>25N generations since the introduction of the balanced allele) whilst linkage disequilibrium-based methods are effective over much shorter timescales (<1N generations), thereby leaving a large time frame over which current methods have little power to detect the action of balancing selection. Finally, we investigate the extent to which alternative evolutionary processes may mimic these patterns, and demonstrate the need for caution in attempting to distinguish the signatures of balancing selection from those of both neutral processes (e.g. population structure and admixture) as well as of alternative selective processes (e.g. partial selective sweeps).
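For readers unfamiliar with site frequency spectrum (SFS) summaries, the sketch below tabulates an unfolded SFS from a small haplotype matrix and derives two standard diversity estimators from it: nucleotide diversity π and Watterson's θ. The abstract does not name the specific SFS-based methods evaluated, so this illustrates the general kind of input such methods use rather than the study's analysis. The function names and the toy haplotypes are hypothetical.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry i counts sites where i of the n haplotypes carry the derived allele."""
    n = len(haplotypes)
    sfs = [0] * (n + 1)
    for site in zip(*haplotypes):       # iterate over alignment columns
        sfs[sum(site)] += 1
    return sfs


def diversity_estimates(sfs):
    """Return (pi, Watterson's theta) computed from an unfolded SFS."""
    n = len(sfs) - 1
    segregating = sum(sfs[1:n])                      # exclude fixed classes 0 and n
    a_n = sum(1.0 / k for k in range(1, n))          # harmonic number for sample size n
    theta_w = segregating / a_n
    pi = sum(2.0 * i * (n - i) * sfs[i] for i in range(1, n)) / (n * (n - 1))
    return pi, theta_w


# Toy sample (invented for illustration): 4 haplotypes, 3 sites, 1 = derived allele.
haplotypes = [
    [0, 1, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
sfs = site_frequency_spectrum(haplotypes)
pi, theta_w = diversity_estimates(sfs)
```

An excess of intermediate-frequency variants raises π relative to Watterson's θ, which is the direction of skew usually associated with long-term balancing selection. The same skew can arise from demographic processes, which is the confounding the abstract warns about.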
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ 85281, USA
- Jeffrey D Jensen
- School of Life Sciences, Center for Evolution & Medicine, Arizona State University, Tempe, AZ 85281, USA

7
Panigrahi M, Rajawat D, Nayak SS, Ghildiyal K, Sharma A, Jain K, Lei C, Bhushan B, Mishra BP, Dutt T. Landmarks in the history of selective sweeps. Anim Genet 2023; 54:667-688. [PMID: 37710403 DOI: 10.1111/age.13355] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/12/2023] [Accepted: 08/28/2023] [Indexed: 09/16/2023]
Abstract
Half a century ago, a seminal article on the hitchhiking effect by Smith and Haigh inaugurated the concept of the selection signature. Selective sweeps are characterised by the rapid spread of an advantageous genetic variant through a population and hence play an important role in shaping evolution and research on genetic diversity. The process by which a beneficial allele arises and becomes fixed in a population, leading to an increase in the frequency of other linked alleles, is known as genetic hitchhiking or genetic draft. Kimura's neutral theory and hitchhiking theory are complementary, with Kimura's neutral evolution as the 'null model' and positive selection as the 'signal'. Both are widely accepted in evolution, especially with genomics enabling precise measurements. Significant advances in genomic technologies, such as next-generation sequencing, high-density SNP arrays and powerful bioinformatics tools, have made it possible to systematically investigate selection signatures in a variety of species. Although the history of selection signatures is relatively recent, progress has been made in the last two decades, owing to the increasing availability of large-scale genomic data and the development of computational methods. In this review, we embark on a journey through the history of research on selective sweeps, ranging from early theoretical work to recent empirical studies that utilise genomic data.
Affiliation(s)
- Manjit Panigrahi
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Divya Rajawat
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Kanika Ghildiyal
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Anurodh Sharma
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Karan Jain
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Chuzhao Lei
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Bharat Bhushan
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
- Bishnu Prasad Mishra
- Division of Animal Biotechnology, ICAR-National Bureau of Animal Genetic Resources, Karnal, India
- Triveni Dutt
- Livestock Production and Management Section, Indian Veterinary Research Institute, Bareilly, India

8
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. Evolution 2023; 77:2113-2127. [PMID: 37395482 PMCID: PMC10547124 DOI: 10.1093/evolut/qpad120] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/20/2023] [Revised: 06/15/2023] [Accepted: 06/30/2023] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modeled by a realistic mutation rate and as part of a realistic distribution of fitness effects, as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modeled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false-positive rates are in excess of true-positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, United States

9
Mascarenhas R, Meirelles PM, Batalha-Filho H. Urbanization drives adaptive evolution in a Neotropical bird. Curr Zool 2023; 69:607-619. [PMID: 37637315 PMCID: PMC10449428 DOI: 10.1093/cz/zoac066] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/08/2022] [Accepted: 08/16/2022] [Indexed: 08/29/2023] Open
Abstract
Urbanization has dramatic impacts on natural habitats and such changes may potentially drive local adaptation of urban populations. Behavioral change has been specifically shown to facilitate the fast adaptation of birds to changing environments, but few studies have investigated the genetic mechanisms of this process. Such investigations could provide insights into questions about both evolutionary theory and management of urban populations. In this study, we investigated whether local adaptation has occurred in urban populations of a Neotropical bird species, Coereba flaveola, specifically addressing whether observed behavioral adaptations are correlated to genetic signatures of natural selection. To answer this question, we sampled 24 individuals in urban and rural environments, and searched for selected loci through a genome-scan approach based on RADseq genomic data, generated and assembled using a reference genome for the species. We recovered 46 loci as putative selection outliers, and 30 of them were identified as associated with biological processes possibly related to urban adaptation, such as the regulation of energetic metabolism, regulation of genetic expression, and changes in the immunological system. Moreover, genes involved in the development of the nervous system showed signatures of selection, suggesting a link between behavioral and genetic adaptations. Our findings, in conjunction with similar results in previous studies, support the idea that cities provide a similar selective pressure on urban populations and that behavioral plasticity may be enhanced through genetic changes in urban populations.
Affiliation(s)
- Rilquer Mascarenhas
- National Institute of Science and Technology in Interdisciplinary and Transdisciplinary Studies in Ecology and Evolution (INCT IN-TREE), Instituto de Biologia, Universidade Federal da Bahia, 40170-115 Salvador, Bahia, Brazil
- Pedro Milet Meirelles
- National Institute of Science and Technology in Interdisciplinary and Transdisciplinary Studies in Ecology and Evolution (INCT IN-TREE), Instituto de Biologia, Universidade Federal da Bahia, 40170-115 Salvador, Bahia, Brazil
- Henrique Batalha-Filho
- National Institute of Science and Technology in Interdisciplinary and Transdisciplinary Studies in Ecology and Evolution (INCT IN-TREE), Instituto de Biologia, Universidade Federal da Bahia, 40170-115 Salvador, Bahia, Brazil

10
Ben-Jemaa S, Adam G, Boussaha M, Bardou P, Klopp C, Mandonnet N, Naves M. Whole genome sequencing reveals signals of adaptive admixture in Creole cattle. Sci Rep 2023; 13:12155. [PMID: 37500674 PMCID: PMC10374910 DOI: 10.1038/s41598-023-38774-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/20/2023] [Accepted: 07/14/2023] [Indexed: 07/29/2023] Open
Abstract
The Creole cattle from Guadeloupe (GUA) are well adapted to the tropical environment. Its admixed genome likely played an important role in such adaptation. Here, we sought to detect genomic signatures of selection in the GUA genome. For this purpose, we sequenced 23 GUA individuals and combined our data with sequenced genomes of 99 animals representative of European, African and indicine groups. We detect 17,228,983 single nucleotide polymorphisms (SNPs) in the GUA genome, providing the most detailed exploration, to date, of patterns of genetic variation in this breed. We confirm the higher level of African and indicine ancestries, compared to the European ancestry and we highlight the African origin of indicine ancestry in the GUA genome. We identify five strong candidate regions showing an excess of indicine ancestry and consistently supported across the different detection methods. These regions encompass genes with adaptive roles in relation to immunity, thermotolerance and physical activity. We confirmed a previously identified horn-related gene, RXFP2, as a gene under strong selective pressure in the GUA population likely owing to human-driven (socio-cultural) pressure. Findings from this study provide insight into the genetic mechanisms associated with resilience traits in livestock.
Affiliation(s)
- Slim Ben-Jemaa
- INRAE, ASSET, 97170, Petit-Bourg, France.
- Laboratoire des Productions Animales et Fourragères, Institut National de la Recherche Agronomique de Tunisie, Université de Carthage, 2049, Ariana, Tunisia.
- Mekki Boussaha
- AgroParisTech, GABI, INRAE, Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Philippe Bardou
- GenPhySE, Ecole Nationale Vétérinaire de Toulouse (ENVT), INRA, Université de Toulouse, 24 Chemin de Borde Rouge, 31320, Castanet-Tolosan, France
- Sigenae, INRAE, 24 Chemin de Borde Rouge, 31320, Castanet-Tolosan, France
- Christophe Klopp
- Genotoul Bioinfo, BioInfoMics, MIAT UR875, Sigenae, INRAE, Castanet-Tolosan, France

11
Soni V, Johri P, Jensen JD. Evaluating power to detect recurrent selective sweeps under increasingly realistic evolutionary null models. bioRxiv: the preprint server for biology 2023:2023.06.15.545166. [PMID: 37398347 PMCID: PMC10312679 DOI: 10.1101/2023.06.15.545166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate, one that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modelled by a realistic mutation rate and as part of a realistic distribution of fitness effects (DFE), as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modelled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false positive rates are in excess of true positive across much of the evaluated parameter space, and selective sweeps are often undetectable unless the strength of selection is exceptionally strong.
Teaser Text: Outlier-based genomic scans have proven a popular approach for identifying loci that have potentially experienced recent positive selection. However, it has previously been shown that an evolutionarily appropriate baseline model that incorporates non-equilibrium population histories, purifying and background selection, and variation in mutation and recombination rates is necessary to reduce often extreme false positive rates when performing genomic scans. Here we evaluate the power to detect recurrent selective sweeps using common SFS-based and haplotype-based methods under these increasingly realistic models. We find that while these appropriate evolutionary baselines are essential to reduce false positive rates, the power to accurately detect recurrent selective sweeps is generally low across much of the biologically relevant parameter space.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Present address: Department of Biology, Department of Genetics, University of North Carolina, Chapel Hill, NC, USA

12
Nandakumar M, Lundberg M, Carlsson F, Råberg L. Balancing selection on the complement system of a wild rodent. BMC Ecol Evol 2023; 23:21. [PMID: 37231383 DOI: 10.1186/s12862-023-02122-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 05/10/2023] [Indexed: 05/27/2023] Open
Abstract
BACKGROUND: Selection pressure exerted by pathogens can influence patterns of genetic diversity in the host. In the immune system especially, numerous genes encode proteins involved in antagonistic interactions with pathogens, paving the way for coevolution that results in increased genetic diversity as a consequence of balancing selection. The complement system is a key component of innate immunity. Many complement proteins interact directly with pathogens, either by recognising pathogen molecules for complement activation, or by serving as targets of pathogen immune evasion mechanisms. Complement genes can therefore be expected to be important targets of pathogen-mediated balancing selection, but analyses of such selection on this part of the immune system have been limited.
RESULTS: Using a population sample of whole-genome resequencing data from wild bank voles (n = 31), we estimated the extent of genetic diversity and tested for signatures of balancing selection in multiple complement genes (n = 44). Complement genes showed higher values of standardised β (a statistic expected to be high under balancing selection) than the genome-wide average of protein coding genes. One complement gene, FCNA, a pattern recognition molecule that interacts directly with pathogens, was found to have a signature of balancing selection, as indicated by the Hudson-Kreitman-Aguadé (HKA) test. Scans for localised signatures of balancing selection in this gene indicated that the target of balancing selection was found in exonic regions involved in ligand binding.
CONCLUSION: The present study adds to the growing evidence that balancing selection may be an important evolutionary force on components of the innate immune system. The identified target in the complement system typifies the expectation that balancing selection acts on genes encoding proteins involved in direct interactions with pathogens.
Affiliation(s)
- Max Lundberg
- Department of Biology, Lund University, Lund, Sweden
- Lars Råberg
- Department of Biology, Lund University, Lund, Sweden

13
Terbot JW, Johri P, Liphardt SW, Soni V, Pfeifer SP, Cooper BS, Good JM, Jensen JD. Developing an appropriate evolutionary baseline model for the study of SARS-CoV-2 patient samples. PLoS Pathog 2023; 19:e1011265. [PMID: 37018331 PMCID: PMC10075409 DOI: 10.1371/journal.ppat.1011265] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 04/06/2023] Open
Abstract
Over the past 3 years, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has spread through human populations in several waves, resulting in a global health crisis. In response, genomic surveillance efforts have proliferated in the hopes of tracking and anticipating the evolution of this virus, resulting in millions of patient isolates now being available in public databases. Yet, while there is a tremendous focus on identifying newly emerging adaptive viral variants, this quantification is far from trivial. Specifically, multiple co-occurring and interacting evolutionary processes are constantly in operation and must be jointly considered and modeled in order to perform accurate inference. We here outline critical individual components of such an evolutionary baseline model (mutation rates, recombination rates, the distribution of fitness effects, infection dynamics, and compartmentalization) and describe the current state of knowledge pertaining to the related parameters of each in SARS-CoV-2. We close with a series of recommendations for future clinical sampling, model construction, and statistical analysis.
Affiliation(s)
- John W Terbot
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Parul Johri
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Schuyler W Liphardt
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Vivak Soni
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Susanne P Pfeifer
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America
- Brandon S Cooper
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Jeffrey M Good
- University of Montana, Division of Biological Sciences, Missoula, Montana, United States of America
- Jeffrey D Jensen
- Arizona State University, School of Life Sciences, Center for Evolution & Medicine, Tempe, Arizona, United States of America

14
Jensen JD. Population genetic concerns related to the interpretation of empirical outliers and the neglect of common evolutionary processes. Heredity (Edinb) 2023; 130:109-110. [PMID: 36829044 PMCID: PMC9981695 DOI: 10.1038/s41437-022-00575-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/19/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 02/26/2023] Open
Affiliation(s)
- Jeffrey D Jensen
- School of Life Science, Arizona State University, Tempe, AZ, USA.

15
Burny C, Nolte V, Dolezal M, Schlötterer C. Genome-wide selection signatures reveal widespread synergistic effects of two different stressors in Drosophila melanogaster. Proc Biol Sci 2022; 289:20221857. [PMID: 36259211 PMCID: PMC9579754 DOI: 10.1098/rspb.2022.1857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Experimental evolution combined with whole-genome sequencing (evolve and resequence (E&R)) is a powerful approach to study the adaptive architecture of selected traits. Nevertheless, so far the focus has been on the selective response triggered by a single stressor. Building on the highly parallel selection response of founder populations with reduced variation, we evaluated how the presence of a second stressor affects the genomic selection response. After 20 generations of adaptation to laboratory conditions at either 18°C or 29°C, strong genome-wide selection signatures were observed. Only 38% of the selection signatures can be attributed to laboratory adaptation (no difference between temperature regimes). The remaining selection responses are either caused by temperature-specific effects, or reflect the joint effects of temperature and laboratory adaptation (same direction, but the magnitude differs between temperatures). The allele frequency changes resulting from the combined effects of temperature and laboratory adaptation were more extreme in the hot environment for 83% of the affected genomic regions, indicating widespread synergistic effects of the two stressors. We conclude that E&R with reduced genetic variation is a powerful approach to study genome-wide fitness consequences driven by the combined effects of multiple environmental factors.
Affiliation(s)
- Claire Burny
- Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, Vienna 1210, Austria; Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna 1210, Austria
- Viola Nolte
- Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, Vienna 1210, Austria
- Marlies Dolezal
- Plattform Bioinformatik und Biostatistik, Vetmeduni Vienna, Vienna 1210, Austria
- Christian Schlötterer
- Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, Vienna 1210, Austria

16
Kumar H, Panigrahi M, Panwar A, Rajawat D, Nayak SS, Saravanan KA, Kaisa K, Parida S, Bhushan B, Dutt T. Machine-Learning Prospects for Detecting Selection Signatures Using Population Genomics Data. J Comput Biol 2022; 29:943-960. [PMID: 35639362 DOI: 10.1089/cmb.2021.0447] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022] Open
Abstract
Natural selection has been given a lot of attention because it relates to the adaptation of populations to their environments, both biotic and abiotic. An allele is selected when it is favored by natural selection. Consequently, the favored allele increases in frequency in the population and neighboring linked variation diminishes, causing so-called selective sweeps. High-throughput genomic sequencing allows one to disentangle the evolutionary forces at play in populations. With the development of high-throughput genome sequencing technologies, it has become easier to detect these selective sweeps/selection signatures. Various methods can be used to detect selective sweeps, from simple implementations using summary statistics to complex statistical approaches. One of the important problems of these statistical models is the potential to provide inaccurate results when their assumptions are violated. The use of machine learning (ML) in population genetics has been introduced as an alternative method of detecting selection by treating the problem of detecting selection signatures as a classification problem. Since the availability of population genomics data is increasing, researchers may incorporate ML into these statistical models to infer signatures of selection with higher predictive accuracy and better resolution. This article describes how ML can be used to aid in detecting and studying natural selection patterns using population genomic data.
Affiliation(s)
- Harshit Kumar
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Manjit Panigrahi
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Anuradha Panwar
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Divya Rajawat
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Sonali Sonejita Nayak
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- K A Saravanan
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Kaiho Kaisa
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Subhashree Parida
- Divisions of Pharmacology and Toxicology, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Bharat Bhushan
- Divisions of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, India
- Triveni Dutt
- Livestock Production and Management Section, ICAR-Indian Veterinary Research Institute, Izatnagar, India

17
Johri P, Aquadro CF, Beaumont M, Charlesworth B, Excoffier L, Eyre-Walker A, Keightley PD, Lynch M, McVean G, Payseur BA, Pfeifer SP, Stephan W, Jensen JD. Recommendations for improving statistical inference in population genomics. PLoS Biol 2022; 20:e3001669. [PMID: 35639797 PMCID: PMC9154105 DOI: 10.1371/journal.pbio.3001669] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Indexed: 01/29/2023] Open
Abstract
The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Charles F. Aquadro
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Mark Beaumont
- School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Berne, Berne, Switzerland
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Peter D. Keightley
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Michael Lynch
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Gil McVean
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Bret A. Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Susanne P. Pfeifer
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America

18
Abrams MB, Brem RB. Temperature-dependent genetics of thermotolerance between yeast species. Front Ecol Evol 2022; 10:859904. [PMID: 36911365 PMCID: PMC10004143 DOI: 10.3389/fevo.2022.859904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Many traits of industrial and basic biological interest arose long ago, and manifest now as fixed differences between a focal species and its reproductively isolated relatives. In these systems, extant individuals can hold clues to the mechanisms by which phenotypes evolved in their ancestors. We harnessed yeast thermotolerance as a test case for such molecular-genetic inferences. In viability experiments, we showed that extant Saccharomyces cerevisiae survived at temperatures where cultures of its sister species S. paradoxus died out. Then, focusing on loci that contribute to this difference, we found that the genetic mechanisms of high-temperature growth changed with temperature. We also uncovered an enrichment of low-frequency variants at thermotolerance loci in S. cerevisiae population sequences, suggestive of a history of non-neutral selective forces acting at these genes. We interpret these results in light of models of the evolutionary mechanisms by which the thermotolerance trait arose in the S. cerevisiae lineage. Together, our results and interpretation underscore the power of genetic approaches to explore how an ancient trait came to be.
Affiliation(s)
- Melanie B. Abrams
- UC Berkeley, Department of Plant and Microbial Biology, Berkeley, CA, USA
- Rachel B. Brem
- UC Berkeley, Department of Plant and Microbial Biology, Berkeley, CA, USA

19
Salloum PM, Santure AW, Lavery SD, de Villemereuil P. Finding the adaptive needles in a population-structured haystack: a case study in a New Zealand mollusc. J Anim Ecol 2022; 91:1209-1221. [PMID: 35318661 PMCID: PMC9311215 DOI: 10.1111/1365-2656.13692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2021] [Accepted: 03/09/2022] [Indexed: 11/30/2022]
Abstract
Genetic adaptation to future environmental conditions is crucial to help species persist as the climate changes. Genome scans are powerful tools to understand adaptive landscapes, enabling us to correlate genetic diversity with environmental gradients while disentangling neutral from adaptive variation. However, low gene flow can lead to both local adaptation and highly structured populations, and is a major confounding factor for genome scans, resulting in an inflated number of candidate loci. Here, we compared candidate locus detection in a marine mollusc (Onithochiton neglectus), taking advantage of a natural geographical contrast in the levels of genetic structure between its populations. O. neglectus is endemic to New Zealand and distributed throughout an environmental gradient from the subtropical north to the subantarctic south. Due to a brooding developmental mode, populations tend to be locally isolated. However, adult hitchhiking on rafting kelp increases connectivity among southern populations. We applied two genome scans for outliers (Bayescan and PCAdapt) and two genotype–environment association (GEA) tests (BayeScEnv and RDA). To limit issues with false positives, we combined results using the geometric mean of q‐values and performed association tests with random environmental variables. This novel approach is a compromise between stringent and relaxed approaches widely used before, and allowed us to classify candidate loci as low confidence or high confidence. Genome scans for outliers detected a large number of significant outliers in strong and moderately structured populations. No high‐confidence GEA loci were detected in the context of strong population structure. However, 86 high‐confidence loci were associated predominantly with latitudinally varying abiotic factors in the less structured southern populations. 
This suggests that the degree of connectivity driven by kelp rafting over the southern scale may be insufficient to counteract local adaptation in this species. Our study supports the expectation that genome scans may be prone to errors in highly structured populations. Nonetheless, it also empirically demonstrates that careful statistical controls enable the identification of candidate loci that invite more detailed investigations. Ultimately, genome scans are valuable tools to help guide further research aiming to determine the potential of non‐model species to adapt to future environments.
Affiliation(s)
- P M Salloum
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- A W Santure
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- S D Lavery
- School of Biological Sciences, University of Auckland, Auckland, New Zealand; Institute of Marine Science, Leigh Marine Laboratory, University of Auckland, Warkworth, New Zealand
- P de Villemereuil
- Institut de Systématique, Évolution, Biodiversité (ISYEB), École Pratique des Hautes Études, PSL, MNHN, CNRS, SU, UA, Paris, France

20
Moinet A, Schlichta F, Peischl S, Excoffier L. Strong neutral sweeps occurring during a population contraction. Genetics 2022; 220:6529544. [PMID: 35171980 PMCID: PMC8982045 DOI: 10.1093/genetics/iyac021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/24/2021] [Accepted: 01/22/2022] [Indexed: 11/14/2022] Open
Abstract
A strong reduction in diversity around a specific locus is often interpreted as a recent rapid fixation of a positively selected allele, a phenomenon called a selective sweep. Rapid fixation of neutral variants can however lead to a similar reduction in local diversity, especially when the population experiences changes in population size, e.g. bottlenecks or range expansions. The fact that demographic processes can lead to signals of nucleotide diversity very similar to signals of selective sweeps is at the core of an ongoing discussion about the roles of demography and natural selection in shaping patterns of neutral variation. Here, we quantitatively investigate the shape of such neutral valleys of diversity under a simple model of a single population size change, and we compare it to signals of a selective sweep. We analytically describe the expected shape of such "neutral sweeps" and show that selective sweep valleys of diversity are, for the same fixation time, wider than neutral valleys. On the other hand, it is always possible to parametrize our model to find a neutral valley that has the same width as a given selected valley. Our findings provide further insight into how simple demographic models can create valleys of genetic diversity similar to those attributed to positive selection.
Affiliation(s)
- Antoine Moinet
- Interfaculty Bioinformatics Unit, University of Bern, Bern 3012, Switzerland; Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland; Computational and Molecular Population Genetics Lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
- Flávia Schlichta
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland; Computational and Molecular Population Genetics Lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland
- Stephan Peischl
- Interfaculty Bioinformatics Unit, University of Bern, Bern 3012, Switzerland; Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland (corresponding author)
- Laurent Excoffier
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland; Computational and Molecular Population Genetics Lab, Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, 3012 Bern, Switzerland

21
Morales-Arce AY, Johri P, Jensen JD. Inferring the distribution of fitness effects in patient-sampled and experimental virus populations: two case studies. Heredity (Edinb) 2022; 128:79-87. [PMID: 34987185 PMCID: PMC8728706 DOI: 10.1038/s41437-021-00493-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2021] [Revised: 12/12/2021] [Accepted: 12/13/2021] [Indexed: 11/19/2022] Open
Abstract
We here propose an analysis pipeline for inferring the distribution of fitness effects (DFE) from either patient-sampled or experimentally-evolved viral populations, that explicitly accounts for non-Wright-Fisher and non-equilibrium population dynamics inherent to pathogens. We examine the performance of this approach via extensive power and performance analyses, and highlight two illustrative applications - one from an experimentally-passaged RNA virus, and the other from a clinically-sampled DNA virus. Finally, we discuss how such DFE inference may shed light on major research questions in virus evolution, ranging from a quantification of the population genetic processes governing genome size, to the role of Hill-Robertson interference in dictating adaptive outcomes, to the potential design of novel therapeutic approaches to eradicate within-patient viral populations via induced mutational meltdown.
Affiliation(s)
- Ana Y Morales-Arce
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Parul Johri
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Jeffrey D Jensen
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA.

22
Johri P, Stephan W, Jensen JD. Soft selective sweeps: Addressing new definitions, evaluating competing models, and interpreting empirical outliers. PLoS Genet 2022; 18:e1010022. [PMID: 35202407 PMCID: PMC8870509 DOI: 10.1371/journal.pgen.1010022] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 02/07/2023] Open
Abstract
The ability to accurately identify and quantify genetic signatures associated with soft selective sweeps based on patterns of nucleotide variation has remained controversial. We here provide counter viewpoints to recent publications in PLOS Genetics that have argued not only for the statistical identifiability of soft selective sweeps, but also for their pervasive evolutionary role in both Drosophila and HIV populations. We present evidence that these claims owe to a lack of consideration of competing evolutionary models, unjustified interpretations of empirical outliers, as well as to new definitions of the processes themselves. Our results highlight the dangers of fitting evolutionary models based on hypothesized and episodic processes without properly first considering common processes and, more generally, of the tendency in certain research areas to view pervasive positive selection as a foregone conclusion.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America

23
Johri P, Charlesworth B, Howell EK, Lynch M, Jensen JD. Revisiting the notion of deleterious sweeps. Genetics 2021; 219:iyab094. [PMID: 34125884 PMCID: PMC9101445 DOI: 10.1093/genetics/iyab094] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/29/2021] [Accepted: 06/08/2021] [Indexed: 11/14/2022] Open
Abstract
It has previously been shown that, conditional on its fixation, the time to fixation of a semi-dominant deleterious autosomal mutation in a randomly mating population is the same as that of an advantageous mutation. This result implies that deleterious mutations could generate selective sweep-like effects. Although their fixation probabilities greatly differ, the much larger input of deleterious relative to beneficial mutations suggests that this phenomenon could be important. We here examine how the fixation of mildly deleterious mutations affects levels and patterns of polymorphism at linked sites (both in the presence and absence of interference amongst deleterious mutations) and how this class of sites may contribute to divergence between populations and species. We find that, while deleterious fixations are unlikely to represent a significant proportion of outliers in polymorphism-based genomic scans within populations, minor shifts in the frequencies of deleterious mutations can influence the proportions of private variants and the value of FST after a recent population split. As sites subject to deleterious mutations are necessarily found in functional genomic regions, interpretations in terms of recurrent positive selection may require reconsideration.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FL, UK
- Emma K Howell
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Michael Lynch
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA

24
Charlesworth B, Jensen JD. Effects of Selection at Linked Sites on Patterns of Genetic Variability. Annu Rev Ecol Evol Syst 2021; 52:177-197. [PMID: 37089401 PMCID: PMC10120885 DOI: 10.1146/annurev-ecolsys-010621-044528] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Indexed: 01/25/2023]
Abstract
Patterns of variation and evolution at a given site in a genome can be strongly influenced by the effects of selection at genetically linked sites. In particular, the recombination rates of genomic regions correlate with their amount of within-population genetic variability, the degree to which the frequency distributions of DNA sequence variants differ from their neutral expectations, and the levels of adaptation of their functional components. We review the major population genetic processes that are thought to lead to these patterns, focusing on their effects on patterns of variability: selective sweeps, background selection, associative overdominance, and Hill–Robertson interference among deleterious mutations. We emphasize the difficulties in distinguishing among the footprints of these processes and disentangling them from the effects of purely demographic factors such as population size changes. We also discuss how interactions between selective and demographic processes can significantly affect patterns of variability within genomes.
Affiliation(s)
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona 85281, USA

25
Johri P, Riall K, Becher H, Excoffier L, Charlesworth B, Jensen JD. The Impact of Purifying and Background Selection on the Inference of Population History: Problems and Prospects. Mol Biol Evol 2021; 38:2986-3003. [PMID: 33591322 PMCID: PMC8233493 DOI: 10.1093/molbev/msab050] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Indexed: 12/17/2022] Open
Abstract
Current procedures for inferring population history generally assume complete neutrality: that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly-used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effects as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Kellen Riall
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Hannes Becher
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Berne, Berne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA

26
Johri P, Riall K, Becher H, Excoffier L, Charlesworth B, Jensen JD. The impact of purifying and background selection on the inference of population history: problems and prospects. bioRxiv 2021. [PMID: 33501439 PMCID: PMC7836109 DOI: 10.1101/2020.04.28.066365] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
Current procedures for inferring population history generally assume complete neutrality - that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly-used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects (DFE) and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increases, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the DFE as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
| | - Kellen Riall
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
| | - Hannes Becher
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
| | - Laurent Excoffier
- Institute of Ecology and Evolution, University of Berne, Berne 3012, Switzerland; Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
| | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
|
27
|
Abstract
Drosophila melanogaster, a small dipteran of African origin, represents one of the best-studied model organisms. Early work in this system has uniquely shed light on the basic principles of genetics and resulted in a versatile collection of genetic tools that allow researchers to uncover mechanistic links between genotype and phenotype. Moreover, given its worldwide distribution in diverse habitats and its moderate genome size, Drosophila has proven very powerful for population genetics inference and was one of the first eukaryotes whose genome was fully sequenced. In this book chapter, we provide a brief historical overview of research in Drosophila and then focus on recent advances during the genomic era. After describing different types and sources of genomic data, we discuss mechanisms of neutral evolution including the demographic history of Drosophila and the effects of recombination and biased gene conversion. Then, we review recent advances in detecting genome-wide signals of selection, such as soft and hard selective sweeps. We further provide a brief introduction to background selection, selection of noncoding DNA and codon usage and focus on the role of structural variants, such as transposable elements and chromosomal inversions, during the adaptive process. Finally, we discuss how genomic data helps to dissect neutral and adaptive evolutionary mechanisms that shape genetic and phenotypic variation in natural populations along environmental gradients. In summary, this book chapter serves as a starting point to Drosophila population genomics and provides an introduction to the system and an overview of data sources, important population genetic concepts and recent advances in the field.
|
28
|
Morales-Arce AY, Sabin SJ, Stone AC, Jensen JD. The population genomics of within-host Mycobacterium tuberculosis. Heredity (Edinb) 2020; 126:1-9. [PMID: 33060846 DOI: 10.1038/s41437-020-00377-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2020] [Revised: 10/02/2020] [Accepted: 10/03/2020] [Indexed: 11/09/2022] Open
Abstract
Recent progress in genomic sequencing from patient samples has allowed for the first detailed insight into the within-host genetic diversity of Mycobacterium tuberculosis (M.TB), revealing remarkably low levels of variation. While this has often been attributed to low mutation rates, other factors have been described, including resistance evolution (i.e., selective sweeps), widespread purifying and background selection, and, more recently, progeny skew. Here we review recent findings pertaining to the processes governing the evolutionary dynamics of M.TB, discuss their implications for improving our understanding of this important human pathogen, and make recommendations for future work. Significantly, this emerging evolutionary framework involving the joint estimation of demographic, selective, and reproductive processes is forming a new paradigm for the study of within-host pathogen evolution that will be widely applicable across organisms.
Affiliation(s)
- Ana Y Morales-Arce
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA.
| | - Susanna J Sabin
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
| | - Anne C Stone
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA; School of Human Evolution and Social Change, Arizona State University, Tempe, AZ, USA
| | - Jeffrey D Jensen
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA; School of Life Sciences, Arizona State University, Tempe, AZ, USA
|
29
|
Harris RB, Jensen JD. Considering Genomic Scans for Selection as Coalescent Model Choice. Genome Biol Evol 2020; 12:871-877. [PMID: 32396636 PMCID: PMC7313662 DOI: 10.1093/gbe/evaa093] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/06/2020] [Indexed: 12/17/2022] Open
Abstract
First inspired by the seminal work of Lewontin and Krakauer (1973. Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics 74(1):175-195.) and Maynard Smith and Haigh (1974. The hitch-hiking effect of a favourable gene. Genet Res. 23(1):23-35.), genomic scans for positive selection remain a widely utilized tool in modern population genomic analysis. Yet, the relative frequency and genomic impact of selective sweeps have remained a contentious point in the field for decades, largely owing to an inability to accurately identify their presence and quantify their effects, with current methodologies generally being characterized by low true-positive rates and/or high false-positive rates under many realistic demographic models. Most of these approaches are based on Wright-Fisher assumptions and the Kingman coalescent and generally rely on detecting outlier regions which do not conform to these neutral expectations. However, previous theoretical results have demonstrated that selective sweeps are well characterized by an alternative class of model known as the multiple-merger coalescent. Taken together, this suggests the possibility of not simply identifying regions which reject the Kingman coalescent, but rather explicitly testing the relative fit of a genomic window to the multiple-merger coalescent. We describe the advantages of such an approach, which owe to the branching structure differentiating selective and neutral models, and demonstrate improved power under certain demographic scenarios relative to a commonly used approach. However, regions of the demographic parameter space continue to exist in which neither this approach nor existing methodologies have sufficient power to detect selective sweeps.
|
30
|
Johri P, Charlesworth B, Jensen JD. Toward an Evolutionarily Appropriate Null Model: Jointly Inferring Demography and Purifying Selection. Genetics 2020; 215:173-192. [PMID: 32152045 PMCID: PMC7198275 DOI: 10.1534/genetics.119.303002] [Citation(s) in RCA: 107] [Impact Index Per Article: 21.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2019] [Accepted: 03/05/2020] [Indexed: 01/27/2023] Open
Abstract
The question of the relative evolutionary roles of adaptive and nonadaptive processes has been a central debate in population genetics for nearly a century. While advances have been made in the theoretical development of the underlying models, and statistical methods for estimating their parameters from large-scale genomic data, a framework for an appropriate null model remains elusive. A model incorporating evolutionary processes known to be in constant operation, genetic drift (as modulated by the demographic history of the population) and purifying selection, is lacking. Without such a null model, the role of adaptive processes in shaping within- and between-population variation may not be accurately assessed. Here, we investigate how population size changes and the strength of purifying selection affect patterns of variation at "neutral" sites near functional genomic components. We propose a novel statistical framework for jointly inferring the contribution of the relevant selective and demographic parameters. By means of extensive performance analyses, we quantify the utility of the approach, identify the most important statistics for parameter estimation, and compare the results with existing methods. Finally, we reanalyze genome-wide population-level data from a Zambian population of Drosophila melanogaster, and find that it has experienced a much slower rate of population growth than was inferred when the effects of purifying selection were neglected. Our approach represents an appropriate null model, against which the effects of positive selection can be assessed.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona 85287
| | - Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
| | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona 85287
|
31
|
Ebrahimi A, Lawson SS, McKenna JR, Jacobs DF. Morpho-Physiological and Genomic Evaluation of Juglans Species Reveals Regional Maladaptation to Cold Stress. FRONTIERS IN PLANT SCIENCE 2020; 11:229. [PMID: 32210997 PMCID: PMC7077431 DOI: 10.3389/fpls.2020.00229] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/12/2019] [Accepted: 02/14/2020] [Indexed: 05/02/2023]
Abstract
Climate change may have unpredictable effects on the cold hardiness of woody species planted outside of their range of origin. Extreme undulations in temperatures may exacerbate susceptibility to cold stress, thereby interfering with productivity and ecosystem functioning. Juglans L. and their naturally occurring interspecific F1 hybrids are distributed natively across many temperate regions, and J. regia has been extensively introduced. Cold hardiness, an environmental and genetic factor yet to be evaluated in many native and introduced Juglans species, may be a limiting factor under future climate change and following species introductions. We evaluated cold hardiness of native North American and Eastern Asian Juglans along with J. regia genotypes using field data from the Midwestern United States (Indiana), controlled freezing tests, and genome sequencing with close assessment of Juglans cold hardy genes. Many Juglans species previously screened for cold-hardiness were genotypes derived from the Midwest, California, and Europe. In 2014, despite general climate adaptation, Midwestern winter temperatures of -30°C killed J. regia originating from California; however, naturalized Midwestern J. regia survived and displayed low damage. Hybridization of J. regia with black walnut (J. nigra) and butternut (J. cinerea) produced F1s displaying greater cold tolerance than pure J. regia. Cold hardiness and growth are variable in Midwestern J. regia compared to native Juglans, East Asian Juglans, and F1 hybrids. Phylogeny analyses revealed that J. cinerea sorted with East Asian species using the nuclear genome but with North American species using the organellar genome. Investigation of selected cold hardy genes revealed that J. regia was distinct from other species and exhibited less genetic diversity than native Juglans species. Average whole genome heterozygosity and Tajima's D for cold hardy genes were low within J. regia samples and significantly higher for hybrids as well as J. nigra. We confirmed that molecular and morpho-physiological data were highly correlated and thus can be used effectively to characterize cold hardiness in Juglans species. We conclude that the genetic diversity within local J. regia populations is low and additional germplasm is needed for development of more regionally adapted J. regia varieties.
Affiliation(s)
- Aziz Ebrahimi
- Hardwood Tree Improvement and Regeneration Center, Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, United States
| | - Shaneka S. Lawson
- USDA Forest Service, Northern Research Station, Hardwood Tree Improvement and Regeneration Center, West Lafayette, IN, United States
| | - James R. McKenna
- USDA Forest Service, Northern Research Station, Hardwood Tree Improvement and Regeneration Center, West Lafayette, IN, United States
| | - Douglass F. Jacobs
- Hardwood Tree Improvement and Regeneration Center, Department of Forestry and Natural Resources, Purdue University, West Lafayette, IN, United States
|
32
|
Scossa F, Fernie AR. The evolution of metabolism: How to test evolutionary hypotheses at the genomic level. Comput Struct Biotechnol J 2020; 18:482-500. [PMID: 32180906 PMCID: PMC7063335 DOI: 10.1016/j.csbj.2020.02.009] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2019] [Revised: 02/12/2020] [Accepted: 02/13/2020] [Indexed: 01/21/2023] Open
Abstract
The origin of primordial metabolism and its expansion to form the metabolic networks extant today represent excellent systems to study the impact of natural selection and the potential adaptive role of novel compounds. Here we present the current hypotheses made on the origin of life and ancestral metabolism and present the theories and mechanisms by which the large chemical diversity of plants might have emerged over the course of evolution. In particular, we provide a survey of statistical methods that can be used to detect signatures of selection at the gene and population level, and discuss the potential and limits of these methods for investigating patterns of molecular adaptation in plant metabolism.
Affiliation(s)
- Federico Scossa
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm, Germany
- Council for Agricultural Research and Economics (CREA), Research Centre for Genomics and Bioinformatics (CREA-GB), Via Ardeatina 546, 00178 Rome, Italy
| | - Alisdair R. Fernie
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm, Germany
- Center of Plant Systems Biology and Biotechnology (CPSBB), Plovdiv, Bulgaria
|
33
|
Apata M, Pfeifer SP. Recent population genomic insights into the genetic basis of arsenic tolerance in humans: the difficulties of identifying positively selected loci in strongly bottlenecked populations. Heredity (Edinb) 2020; 124:253-262. [PMID: 31776483 PMCID: PMC6972707 DOI: 10.1038/s41437-019-0285-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2019] [Revised: 10/22/2019] [Accepted: 11/13/2019] [Indexed: 02/06/2023] Open
Abstract
Recent advances in genomics have enabled researchers to shed light on the evolutionary processes driving human adaptation, by revealing the genetic architectures underlying traits ranging from lactase persistence, to skin pigmentation, to hypoxic response, to arsenic tolerance. Complicating the identification of targets of positive selection in modern human populations is their complex demographic history, characterized by population bottlenecks and expansions, population structure, migration, and admixture. In particular, founder effects and recent strong population size reductions, such as those experienced by the indigenous peoples of the Americas, have severe impacts on genetic variation that can lead to the accumulation of large allele frequency differences between populations due to genetic drift rather than natural selection. While distinguishing the effects of demographic history from selection remains challenging, neglecting neutral processes can lead to the incorrect identification of candidate loci. We here review the recent population genomic insights into the genetic basis of arsenic tolerance in Andean populations, and utilize this example to highlight both the difficulties pertaining to the identification of local adaptations in strongly bottlenecked populations, as well as the importance of controlling for demographic history in selection scans.
Affiliation(s)
- Mario Apata
- Center for Evolution & Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, 85821, USA
| | - Susanne P Pfeifer
- Center for Evolution & Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, 85821, USA.
|
34
|
Koropoulis A, Alachiotis N, Pavlidis P. Detecting Positive Selection in Populations Using Genetic Data. Methods Mol Biol 2020; 2090:87-123. [PMID: 31975165 DOI: 10.1007/978-1-0716-0199-0_5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
High-throughput genomic sequencing makes it possible to disentangle the evolutionary forces acting in populations. Among evolutionary forces, positive selection has received a lot of attention because it is related to the adaptation of populations in their environments, both biotic and abiotic. Positive selection, also known as Darwinian selection, occurs when an allele is favored by natural selection. The frequency of the favored allele increases in the population and, due to genetic hitchhiking, neighboring linked variation diminishes, creating so-called selective sweeps. Such a process leaves traces in genomes that can be detected at a future time point. Detecting traces of positive selection in genomes is achieved by searching for signatures introduced by selective sweeps, such as regions of reduced variation, a specific shift of the site frequency spectrum, and particular linkage disequilibrium (LD) patterns in the region. A variety of approaches can be used for detecting selective sweeps, ranging from simple implementations that compute summary statistics to more advanced statistical approaches, e.g., Bayesian approaches, maximum-likelihood-based methods, and machine learning methods. In this chapter, we discuss selective sweep detection methodologies on the basis of their capacity to analyze whole genomes or just subgenomic regions, and on the specific polymorphism patterns they exploit as selective sweep signatures. We also summarize the results of comparisons among five open-source software releases (SweeD, SweepFinder, SweepFinder2, OmegaPlus, and RAiSD) regarding sensitivity, specificity, and execution times. Furthermore, we test and discuss machine learning methods and present a thorough performance analysis. In equilibrium neutral models or mild bottlenecks, most methods are able to detect selective sweeps accurately. Methods and tools that rely on linkage disequilibrium (LD) rather than single SNPs exhibit higher true positive rates than the site frequency spectrum (SFS)-based methods under the model of a single sweep or recurrent hitchhiking. However, their false positive rate is elevated when a misspecified demographic model is used to build the distribution of the statistic under the null hypothesis. Both LD and SFS-based approaches suffer from decreased accuracy in localizing the true target of selection in bottleneck scenarios. Furthermore, we present an extensive analysis of the effects of gene flow on selective sweep detection, a problem that has been understudied in selective sweep literature.
Affiliation(s)
- Angelos Koropoulis
- Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece
- Computer Science Department, University of Crete, Crete, Heraklion, Greece
| | - Nikolaos Alachiotis
- Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece
| | - Pavlos Pavlidis
- Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece.
|
35
|
Cullingham CI, Peery RM, Fortier CE, Mahon EL, Cooke JEK, Coltman DW. Linking genotype to phenotype to identify genetic variation relating to host susceptibility in the mountain pine beetle system. Evol Appl 2020; 13:48-61. [PMID: 31892943 PMCID: PMC6935584 DOI: 10.1111/eva.12773] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2018] [Revised: 01/09/2019] [Accepted: 01/13/2019] [Indexed: 12/24/2022] Open
Abstract
Identifying genetic variants responsible for phenotypic variation under selective pressure has the potential to enable productive gains in natural resource conservation and management. Despite this potential, identifying adaptive candidate loci is not trivial, and linking genotype to phenotype is a major challenge in contemporary genetics. Many of the population genetic approaches commonly used to identify adaptive candidates will simultaneously detect false positives, particularly in nonmodel species, where experimental evidence is seldom provided for putative roles of the adaptive candidates identified by outlier approaches. In this study, we use outcomes from population genetics, phenotype association, and gene expression analyses as multiple lines of evidence to validate candidate genes. Using lodgepole and jack pine as our nonmodel study species, we analyzed 17 adaptive candidate loci together with 78 putatively neutral loci at 58 locations across Canada (N > 800) to determine whether relationships could be established between these candidate loci and phenotype related to mountain pine beetle susceptibility. We identified two candidate loci that were significant across all population genetic tests, and demonstrated significant changes in transcript abundance in trees subjected to wounding or inoculation with the mountain pine beetle fungal associate Grosmannia clavigera. Both candidates are involved in central physiological processes that are likely to be invoked in a tree's response to stress. One of these two candidate loci showed a significant association with mountain pine beetle attack status in lodgepole pine. The spatial distribution of the attack-associated allele further coincides with other indicators of susceptibility in lodgepole pine. These analyses, in which population genetics was combined with laboratory and field experimental validation approaches, represent first steps toward linking genetic variation to the phenotype of mountain pine beetle susceptibility in lodgepole and jack pine, and provide a roadmap for more comprehensive analyses.
Affiliation(s)
| | - Rhiannon M. Peery
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Colleen E. Fortier
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Elizabeth L. Mahon
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Department of Wood Science, University of British Columbia, Vancouver, British Columbia, Canada
| | - Janice E. K. Cooke
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - David W. Coltman
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
|
36
|
Thornton KR. Polygenic Adaptation to an Environmental Shift: Temporal Dynamics of Variation Under Gaussian Stabilizing Selection and Additive Effects on a Single Trait. Genetics 2019; 213:1513-1530. [PMID: 31653678 PMCID: PMC6893385 DOI: 10.1534/genetics.119.302662] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2019] [Accepted: 10/21/2019] [Indexed: 11/26/2022] Open
Abstract
Predictions about the effect of natural selection on patterns of linked neutral variation are largely based on models involving the rapid fixation of unconditionally beneficial mutations. However, when phenotypes adapt to a new optimum trait value, the strength of selection on individual mutations decreases as the population adapts. Here, I use explicit forward simulations of a single trait with additive-effect mutations adapting to an "optimum shift." Detectable "hitchhiking" patterns are only apparent if (i) the optimum shifts are large with respect to equilibrium variation for the trait, (ii) mutation rates to large-effect mutations are low, and (iii) large-effect mutations rapidly increase in frequency and eventually reach fixation, which typically occurs after the population reaches the new optimum. For the parameters simulated here, partial sweeps do not appreciably affect patterns of linked variation, even when the mutations are strongly selected. The contribution of new mutations vs. standing variation to fixation depends on the mutation rate affecting trait values. Given the fixation of a strongly selected variant, patterns of hitchhiking are similar on average for the two classes of sweeps because sweeps from standing variation involving large-effect mutations are rare when the optimum shifts. The distribution of effect sizes of new mutations has little effect on the time to reach the new optimum, but reducing the mutational variance increases the magnitude of hitchhiking patterns. In general, populations reach the new optimum prior to the completion of any sweeps, and the times to fixation are longer for this model than for standard models of directional selection. The long fixation times are due to a combination of declining selection pressures during adaptation and the possibility of interference among weakly selected sites for traits with high mutation rates.
Affiliation(s)
- Kevin R Thornton
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
|
37
|
Kapopoulou A, Pfeifer SP, Jensen JD, Laurent S. The Demographic History of African Drosophila melanogaster. Genome Biol Evol 2019; 10:2338-2342. [PMID: 30169784 PMCID: PMC6363051 DOI: 10.1093/gbe/evy185] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/27/2018] [Indexed: 11/14/2022] Open
Abstract
As one of the most commonly utilized organisms in the study of local adaptation, an accurate characterization of the demographic history of Drosophila melanogaster remains an important research question. This owes both to the inherent interest in characterizing the population history of this model organism, as well as to the well-established importance of an accurate null demographic model for increasing power and decreasing false positive rates in genomic scans for positive selection. Although considerable attention has been afforded to this issue in non-African populations, less is known about the demographic history of African populations, including from the ancestral range of the species. While qualitative predictions and hypotheses have previously been forwarded, we here present a quantitative model fitting of the population history characterizing both the ancestral Zambian population range as well as the subsequently colonized West African populations, which themselves served as the source of multiple non-African colonization events. We here report the split time of the West African population at 72 kya, a date corresponding to human migration into this region as well as a period of climatic changes in the African continent. Furthermore, we have estimated population sizes at this split time. These parameter estimates thus represent an important null model for future investigations into African and non-African D. melanogaster populations alike.
Affiliation(s)
- Adamandia Kapopoulou
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Susanne P Pfeifer
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; School of Life Sciences, Center for Evolution and Medicine, Arizona State University, Tempe, Arizona
| | - Jeffrey D Jensen
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; School of Life Sciences, Center for Evolution and Medicine, Arizona State University, Tempe, Arizona
| | - Stefan Laurent
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Department of Comparative Development and Genetics, Max Planck Institute for Plant Breeding Research, Cologne, Germany
|
38
|
The population genetics of crypsis in vertebrates: recent insights from mice, hares, and lizards. Heredity (Edinb) 2019; 124:1-14. [PMID: 31399719 PMCID: PMC6906368 DOI: 10.1038/s41437-019-0257-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2019] [Revised: 07/16/2019] [Accepted: 07/25/2019] [Indexed: 12/22/2022] Open
Abstract
By combining well-established population genetic theory with high-throughput sequencing data from natural populations, major strides have recently been made in understanding how, why, and when vertebrate populations evolve crypsis. Here, we focus on background matching, a particular facet of crypsis that involves the ability of an organism to conceal itself through matching its color to the surrounding environment. While interesting in and of itself, the study of this phenotype has also provided fruitful population genetic insights into the interplay of strong positive selection with other evolutionary processes. Specifically, and predicated upon the findings of previous candidate gene association studies, a primary focus of this recent literature involves the realization that the inference of selection from DNA sequence data first requires a robust model of population demography in order to identify genomic regions which do not conform to neutral expectations. Moreover, these demographic estimates provide crucial information about the origin and timing of the onset of selective pressures associated with, for example, the colonization of a novel environment. Furthermore, such inference has revealed crypsis to be a particularly useful phenotype for investigating the interplay of migration and selection—with examples of gene flow constraining rates of adaptation, or alternatively providing the genetic variants that may ultimately sweep through the population. Here, we evaluate the underlying evidence, review the strengths and weaknesses of the many population genetic methodologies used in these studies, and discuss how these insights have aided our general understanding of the evolutionary process.
|
39
|
Kjeldsen SR, Raadsma HW, Leigh KA, Tobey JR, Phalen D, Krockenberger A, Ellis WA, Hynes E, Higgins DP, Zenger KR. Genomic comparisons reveal biogeographic and anthropogenic impacts in the koala (Phascolarctos cinereus): a dietary-specialist species distributed across heterogeneous environments. Heredity (Edinb) 2019; 122:525-544. [PMID: 30209291 PMCID: PMC6461856 DOI: 10.1038/s41437-018-0144-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/21/2018] [Revised: 06/07/2018] [Accepted: 08/01/2018] [Indexed: 02/05/2023] Open
Abstract
The Australian koala is an iconic marsupial with highly specific dietary requirements distributed across heterogeneous environments, over a large geographic range. The distribution and genetic structure of koala populations have been heavily influenced by human actions, specifically habitat modification, hunting and translocation of koalas. There is currently limited information on population diversity and gene flow at a species-wide scale, or with consideration to the potential impacts of local adaptation. Using species-wide sampling across heterogeneous environments, and high-density genome-wide markers (SNPs and PAVs), we show that most koala populations display levels of diversity comparable to other outbred species, except for those populations impacted by population reductions. Genetic clustering analysis and phylogenetic reconstruction reveal a lack of support for the current taxonomic classification of three koala subspecies, with only a single evolutionarily significant unit supported. Furthermore, ~70% of genetic variance is accounted for at the individual level. The Sydney Basin region is highlighted as a unique reservoir of genetic diversity, having higher diversity levels (i.e., Blue Mountains region; AvHecorr = 0.20, PL% = 68.6). Broad-scale population differentiation is primarily driven by an isolation by distance genetic structure model (49% of genetic variance), with clinal local adaptation corresponding to habitat bioregions. Signatures of selection were detected between bioregions, with no single region returning evidence of strong selection. The results of this study show that although the koala is widely considered to be a dietary-specialist species, this apparent specialisation has not limited the koala's ability to maintain gene flow and adapt across divergent environments as long as the required food source is available.
Affiliation(s)
- Shannon R Kjeldsen
- Centre for Sustainable Tropical Fisheries and Aquaculture, James Cook University, Townsville, QLD, 4811, Australia
- Herman W Raadsma
- Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Camden, Private Mail Bag 4003, Narellan, NSW, 2570, Australia
- Kellie A Leigh
- Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Camden, Private Mail Bag 4003, Narellan, NSW, 2570, Australia
- Science for Wildlife, PO Box 286, Cammeray, NSW, 2062, Australia
- Jennifer R Tobey
- San Diego Zoo Institute for Conservation Research, Escondido, CA, 92027, USA
- David Phalen
- Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Camden, Private Mail Bag 4003, Narellan, NSW, 2570, Australia
- Andrew Krockenberger
- Centre for Tropical Biodiversity and Climate Change, Division of Research and Innovation, James Cook University, Cairns, QLD, 4878, Australia
- William A Ellis
- School of Agriculture and Food Science, The University of Queensland, St Lucia, QLD, 4072, Australia
- Emily Hynes
- Ecoplan Australia, PO Box 968, Torquay, VIC, 3228, Australia
- Damien P Higgins
- Sydney School of Veterinary Science, Faculty of Science, The University of Sydney, Sydney, NSW, 2006, Australia
- Kyall R Zenger
- Centre for Sustainable Tropical Fisheries and Aquaculture, James Cook University, Townsville, QLD, 4811, Australia
|
40
|
Jensen JD, Payseur BA, Stephan W, Aquadro CF, Lynch M, Charlesworth D, Charlesworth B. The importance of the Neutral Theory in 1968 and 50 years on: A response to Kern and Hahn 2018. Evolution 2019; 73:111-114. [PMID: 30460993 PMCID: PMC6496948 DOI: 10.1111/evo.13650] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 10/30/2018] [Accepted: 11/09/2018] [Indexed: 01/31/2023]
Abstract
A recent article reassessing the Neutral Theory of Molecular Evolution claims that it is no longer as important as is widely believed. The authors argue that "the neutral theory was supported by unreliable theoretical and empirical evidence from the beginning, and that in light of modern, genome-scale data, we can firmly reject its universality." Claiming that "the neutral theory has been overwhelmingly rejected," they propose instead that natural selection is the major force shaping both between-species divergence and within-species variation. Although this is probably a minority view, it is important to evaluate such claims carefully in the context of current knowledge, as inaccuracies can sometimes morph into an accepted narrative for those not familiar with the underlying science. We here critically examine and ultimately reject Kern and Hahn's arguments and assessment, and instead propose that it is now abundantly clear that the foundational ideas presented five decades ago by Kimura and Ohta are indeed correct.
Affiliation(s)
- Bret A. Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin
- Wolfgang Stephan
- Leibniz-Institute for Evolution and Biodiversity Science, Berlin, Germany
- Charles F. Aquadro
- Department of Molecular Biology & Genetics, Cornell University, Ithaca, New York
- Michael Lynch
- Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona
- Deborah Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
|
41
|
Bay RA, Harrigan RJ, Buermann W, Underwood VL, Gibbs HL, Smith TB, Ruegg K. Response to Comment on "Genomic signals of selection predict climate-driven population declines in a migratory bird". Science 2018; 361:361/6401/eaat7279. [PMID: 30072513 DOI: 10.1126/science.aat7956] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/24/2018] [Accepted: 07/05/2018] [Indexed: 11/02/2022]
Abstract
Fitzpatrick et al. discuss issues that they had with analyses and interpretation in our recent manuscript on genomic correlates of climate in yellow warblers. We provide evidence that our findings would not change with different analysis and maintain that our study represents a promising direction for integrating the potential for climate adaptation as one of many tools in conservation management.
Affiliation(s)
- Rachael A Bay
- Institute for the Environment and Sustainability, University of California, Los Angeles, CA 90095, USA; Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Ryan J Harrigan
- Institute for the Environment and Sustainability, University of California, Los Angeles, CA 90095, USA
- Wolfgang Buermann
- Institute for the Environment and Sustainability, University of California, Los Angeles, CA 90095, USA; Institute for Climate and Atmospheric Science, School of Earth and Environment, University of Leeds, Leeds LS2 9JT, UK
- Vinh Le Underwood
- Institute for the Environment and Sustainability, University of California, Los Angeles, CA 90095, USA
- H Lisle Gibbs
- Department of Evolution, Ecology, and Organismal Biology and Ohio Biodiversity Conservation Partnership, Ohio State University, Columbus, OH 43210, USA
- Thomas B Smith
- Institute for the Environment and Sustainability, University of California, Los Angeles, CA 90095, USA; Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA
- Kristen Ruegg
- Institute for the Environment and Sustainability, University of California, Los Angeles, CA 90095, USA; Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA 95064, USA
|
42
|
Wang MS, Otecko NO, Wang S, Wu DD, Yang MM, Xu YL, Murphy RW, Peng MS, Zhang YP. An Evolutionary Genomic Perspective on the Breeding of Dwarf Chickens. Mol Biol Evol 2018; 34:3081-3088. [PMID: 28961939 DOI: 10.1093/molbev/msx227] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 12/27/2022] Open
Abstract
The evolutionary history of dwarfism in chickens remains an enigma. Herein, we explore the evolution of the Serama, the smallest breed of chicken. Leveraging comparative population genomics, analyses identify several genes that are potentially associated with the growth and development of bones and muscles. These genes, and in particular both POU1F1 and IGF1, are under strong positive selection. Three allopatric dwarf bantams (Serama, Yuanbao, and Daweishan) with different breeding histories form distinct clusters and exhibit unique population structures. Parallel genetic mechanisms underlie their variation in body size. These findings provide insights into the multiple and complex pathways, depending on genomic variation, that chickens can take in response to aviculture selection for dwarfism.
Affiliation(s)
- Ming-Shan Wang
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, China
- Newton O Otecko
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, China
- Sheng Wang
- Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture of China, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Dong-Dong Wu
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, China
- Min-Min Yang
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yi-Long Xu
- Xiaodu Veterinary Station in Tongnan District, Chongqing, China
- Robert W Murphy
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; Centre for Biodiversity and Conservation Biology, Royal Ontario Museum, Toronto, ON, Canada
- Min-Sheng Peng
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, China
- Ya-Ping Zhang
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, China; State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China
|
43
|
Weigand H, Leese F. Detecting signatures of positive selection in non-model species using genomic data. Zool J Linn Soc 2018. [DOI: 10.1093/zoolinnean/zly007] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Indexed: 11/14/2022]
Affiliation(s)
- Hannah Weigand
- Aquatic Ecosystem Research, University of Duisburg-Essen, Universitätsstraße, Essen, Germany
- Florian Leese
- Aquatic Ecosystem Research, University of Duisburg-Essen, Universitätsstraße, Essen, Germany
- Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstraße, Essen, Germany
|
44
|
LaBonte NR, Zhao P, Woeste K. Signatures of Selection in the Genomes of Chinese Chestnut (Castanea mollissima Blume): The Roots of Nut Tree Domestication. Front Plant Sci 2018; 9:810. [PMID: 29988533 PMCID: PMC6026767 DOI: 10.3389/fpls.2018.00810] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 02/28/2018] [Accepted: 05/25/2018] [Indexed: 05/18/2023]
Abstract
Chestnuts (Castanea) are major nut crops in East Asia and southern Europe, and are unique among temperate nut crops in that the harvested seeds are starchy rather than oily. Chestnut species have been cultivated for three millennia or more in China, so it is likely that artificial selection has affected the genome of orchard-grown chestnuts. The genetics of Chinese chestnut (Castanea mollissima Blume) domestication are also of interest to breeders of hybrid American chestnut, especially if the low-growing, branching habit of Chinese chestnut, an impediment to American chestnut restoration, is partly the result of artificial selection. We resequenced genomes of wild and orchard-derived Chinese chestnuts and identified selective sweeps based on pooled whole-genome SNP datasets. We present candidate gene loci for chestnut domestication and discuss the potential phenotypic effects of candidate loci, some of which may be useful genes for chestnut improvement in Asia and North America. Selective sweeps included predicted genes potentially related to flower phenology and development, fruit maturation, and secondary metabolism, and included some genes homologous to domestication candidates in other woody plants.
Affiliation(s)
- Nicholas R. LaBonte
- Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, IL, United States
- *Correspondence: Nicholas R. LaBonte
- Peng Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, College of Life Sciences, Northwest University, Xi'an, China
- Keith Woeste
- Hardwood Tree Improvement and Regeneration Center, Northern Research Station, USDA Forest Service, West Lafayette, IN, United States
|
45
|
|
46
|
Coalescent Processes with Skewed Offspring Distributions and Nonequilibrium Demography. Genetics 2017; 208:323-338. [PMID: 29127263 DOI: 10.1534/genetics.117.300499] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 05/12/2017] [Accepted: 10/30/2017] [Indexed: 11/18/2022] Open
Abstract
Nonequilibrium demography impacts coalescent genealogies leaving detectable, well-studied signatures of variation. However, similar genomic footprints are also expected under models of large reproductive skew, posing a serious problem when trying to make inference. Furthermore, current approaches consider only one of the two processes at a time, neglecting any genomic signal that could arise from their simultaneous effects, preventing the possibility of jointly inferring parameters relating to both offspring distribution and population history. Here, we develop an extended Moran model with exponential population growth, and demonstrate that the underlying ancestral process converges to a time-inhomogeneous psi-coalescent. However, by applying a nonlinear change of time scale, analogous to the Kingman coalescent, we find that the ancestral process can be rescaled to its time-homogeneous analog, allowing the process to be simulated quickly and efficiently. Furthermore, we derive analytical expressions for the expected site-frequency spectrum under the time-inhomogeneous psi-coalescent, and develop an approximate-likelihood framework for the joint estimation of the coalescent and growth parameters. By means of extensive simulation, we demonstrate that both can be estimated accurately from whole-genome data. In addition, not accounting for demography can lead to serious biases in the inferred coalescent model, with broad implications for genomic studies ranging from ecology to conservation biology. Finally, we use our method to analyze sequence data from Japanese sardine populations, and find evidence of high variation in individual reproductive success, but few signs of a recent demographic expansion.
|
47
|
Abstract
Relatively little is known about the evolutionary history of the African green monkey (genus Chlorocebus) due to the lack of sampled polymorphism data from wild populations. Yet, this characterization of genetic diversity is not only critical for a better understanding of their own history, but also for human biomedical research given that they are one of the most widely used primate models. Here, I analyze the demographic and selective history of the African green monkey, utilizing one of the most comprehensive catalogs of wild genetic diversity to date, consisting of 1,795,643 autosomal single nucleotide polymorphisms in 25 individuals, representing all five major populations: C. a. aethiops, C. a. cynosurus, C. a. pygerythrus, C. a. sabaeus, and C. a. tantalus. Assuming a mutation rate of 5.9 × 10⁻⁹ per base pair per generation and a generation time of 8.5 years, divergence time estimates range from 523 to 621 kya for the basal split of C. a. aethiops from the other four populations. Importantly, the resulting tree characterizing the relationship and split-times between these populations differs significantly from that presented in the original genome paper, owing to their neglect of within-population variation when calculating between-population divergence. In addition, I find that the demographic history of all five populations is well explained by a model of population fragmentation and isolation, rather than novel colonization events. Finally, utilizing these demographic models as a null, I investigate the selective history of the populations, identifying candidate regions potentially related to adaptation in response to pathogen exposure.
Affiliation(s)
- Susanne P Pfeifer
- School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; School of Life Sciences, Arizona State University, Tempe, AZ
|
48
|
Refining the Use of Linkage Disequilibrium as a Robust Signature of Selective Sweeps. Genetics 2017; 203:1807-25. [PMID: 27516617 DOI: 10.1534/genetics.115.185900] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 12/10/2015] [Accepted: 04/05/2016] [Indexed: 12/12/2022] Open
Abstract
During a selective sweep, characteristic patterns of linkage disequilibrium can arise in the genomic region surrounding a selected locus. These have been used to infer past selective sweeps. However, the recombination rate is known to vary substantially along the genome for many species. We here investigate the effectiveness of current (Kelly's [Formula: see text] and [Formula: see text]) and novel statistics at inferring hard selective sweeps based on linkage disequilibrium distortions under different conditions, including a human-realistic demographic model and recombination rate variation. When the recombination rate is constant, Kelly's [Formula: see text] offers high power, but is outperformed by a novel statistic that we test, which we call [Formula: see text]. We also find this statistic to be effective at detecting sweeps from standing variation. When recombination rate fluctuations are included, there is a considerable reduction in power for all linkage disequilibrium-based statistics. However, this can largely be reversed by appropriately controlling for expected linkage disequilibrium using a genetic map. To further test these different methods, we perform selection scans on well-characterized HapMap data, finding that all three statistics ([Formula: see text], Kelly's [Formula: see text], and [Formula: see text]) are able to replicate signals at regions previously identified as selection candidates based on population differentiation or the site frequency spectrum. While [Formula: see text] replicates most candidates when recombination map data are not available, the [Formula: see text] and [Formula: see text] statistics are more successful when recombination rate variation is controlled for. Given both this and their higher power in simulations of selective sweeps, these statistics are preferred when information on local recombination rate variation is available.
|
49
|
Pavlidis P, Alachiotis N. A survey of methods and tools to detect recent and strong positive selection. J Biol Res (Thessalon) 2017; 24:7. [PMID: 28405579 PMCID: PMC5385031 DOI: 10.1186/s40709-017-0064-0] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 08/11/2016] [Accepted: 03/29/2017] [Indexed: 01/25/2023]
Abstract
Positive selection occurs when an allele is favored by natural selection. The frequency of the favored allele increases in the population and due to genetic hitchhiking the neighboring linked variation diminishes, creating so-called selective sweeps. Detecting traces of positive selection in genomes is achieved by searching for signatures introduced by selective sweeps, such as regions of reduced variation, a specific shift of the site frequency spectrum, and particular LD patterns in the region. A variety of methods and tools can be used for detecting sweeps, ranging from simple implementations that compute summary statistics such as Tajima's D, to more advanced statistical approaches that use combinations of statistics, maximum likelihood, machine learning etc. In this survey, we present and discuss summary statistics and software tools, and classify them based on the selective sweep signature they detect, i.e., SFS-based vs. LD-based, as well as their capacity to analyze whole genomes or just subgenomic regions. Additionally, we summarize the results of comparisons among four open-source software releases (SweeD, SweepFinder, SweepFinder2, and OmegaPlus) regarding sensitivity, specificity, and execution times. In equilibrium neutral models or mild bottlenecks, both SFS- and LD-based methods are able to detect selective sweeps accurately. Methods and tools that rely on LD exhibit higher true positive rates than SFS-based ones under the model of a single sweep or recurrent hitchhiking. However, their false positive rate is elevated when a misspecified demographic model is used to represent the null hypothesis. When the correct (or similar to the correct) demographic model is used instead, the false positive rates are considerably reduced. The accuracy of detecting the true target of selection is decreased in bottleneck scenarios. In terms of execution time, LD-based methods are typically faster than SFS-based methods, due to the nature of required arithmetic.
Affiliation(s)
- Pavlos Pavlidis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, 70013 Crete, Greece
- Nikolaos Alachiotis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, 70013 Crete, Greece
|
50
|
Freedman AH, Lohmueller KE, Wayne RK. Evolutionary History, Selective Sweeps, and Deleterious Variation in the Dog. Annu Rev Ecol Evol Syst 2016. [DOI: 10.1146/annurev-ecolsys-121415-032155] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 11/09/2022]
Abstract
The dog is our oldest domesticate and has experienced a wide variety of demographic histories, including a bottleneck associated with domestication and individual bottlenecks associated with the formation of modern breeds. Admixture with gray wolves, and among dog breeds and populations, has also occurred throughout its history. Likewise, the intensity and focus of selection have varied, from an initial focus on traits enhancing cohabitation with humans, to more directed selection on specific phenotypic characteristics and behaviors. In this review, we summarize and synthesize genetic findings from genome-wide and complete genome studies that document the genomic consequences of demography and selection, including the effects on adaptive and deleterious variation. Consistent with the evolutionary history of the dog, signals of natural and artificial selection are evident in the dog genome. However, conclusions from studies of positive selection are fraught with the problem of false positives given that demographic history is often not taken into account.
Affiliation(s)
- Adam H. Freedman
- Informatics Group, Faculty of Arts and Sciences, Harvard University, Cambridge, Massachusetts 02138
- Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095
- Robert K. Wayne
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095
|