1
Sabarís G, Schuettengruber B, Papadopoulos GL, Coronado-Zamora M, Fitz-James MH, González J, Cavalli G. A mechanistic basis for genetic assimilation in natural fly populations. Proc Natl Acad Sci U S A 2025; 122:e2415982122. [PMID: 40063800 PMCID: PMC11929479 DOI: 10.1073/pnas.2415982122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2024] [Accepted: 01/22/2025] [Indexed: 03/25/2025] [Open Access]
Abstract
Genetic assimilation is a process by which a trait originally driven by the environment becomes independent of the initial cue and is expressed constitutively in a population. More than seven decades have passed since Waddington's pioneering demonstration of the acquisition of morphological traits through genetic assimilation, but the underlying mechanism remains unknown. Here, we address this gap by performing combined genomic analyses of Waddington's genetic assimilation experiments using the ectopic veins (EV) phenocopy in Drosophila as a model. Our study reveals the assimilation of EV in both outbred and inbred fly natural populations, despite their limited genetic diversity. We identified key changes in the expression of developmental genes and pinpointed selected alleles involved in EV assimilation. The assimilation of EV is mainly driven by the selection of regulatory alleles already present in the ancestral populations, including the downregulation of the receptor tyrosine kinase gene Cad96Ca by the insertion of a transposable element in its 3' untranslated region. The genetic variation at this locus in the inbred population is maintained by a large chromosomal inversion. In outbred populations, the evolution of EV results from a polygenic response shaped by the selective environment. Our results support a model in which selection for multiple preexisting alleles in the ancestral population, rather than stress-induced genetic or epigenetic variation, drives the evolution of EV in natural fly populations.
Affiliation(s)
- Gonzalo Sabarís
  - Institute of Human Genetics, CNRS, University of Montpellier, Montpellier 34396 cedex 5, France
- Bernd Schuettengruber
  - Institute of Human Genetics, CNRS, University of Montpellier, Montpellier 34396 cedex 5, France
- Giorgio L. Papadopoulos
  - Institute of Human Genetics, CNRS, University of Montpellier, Montpellier 34396 cedex 5, France
- Marta Coronado-Zamora
  - Institute of Evolutionary Biology, Agencia Estatal Consejo Superior de Investigaciones Científicas, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Josefa González
  - Institute of Evolutionary Biology, Agencia Estatal Consejo Superior de Investigaciones Científicas, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Giacomo Cavalli
  - Institute of Human Genetics, CNRS, University of Montpellier, Montpellier 34396 cedex 5, France
2
Willis S, Micheletti S, Andrews KR, Narum S. PoolParty2: An integrated pipeline for analysing pooled or indexed low-coverage whole-genome sequencing data to discover the genetic basis of diversity. Mol Ecol Resour 2023. [PMID: 37921673 DOI: 10.1111/1755-0998.13888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 09/29/2023] [Accepted: 10/19/2023] [Indexed: 11/04/2023]
Abstract
Whole-genome sequencing data allow survey of variation from across the genome, reducing the constraint of balancing genome sub-sampling with estimating recombination rates and linkage between sampled markers and target loci. As sequencing costs decrease, low-coverage whole-genome sequencing of pooled or indexed-individual samples is commonly utilized to identify loci associated with phenotypes or environmental axes in non-model organisms. There are, however, relatively few publicly available bioinformatic pipelines designed explicitly to analyse these types of data, and fewer still that process the raw sequencing data, provide useful metrics of quality control and then execute analyses. Here, we present an updated version of a bioinformatics pipeline called PoolParty2 that can effectively handle either pooled or indexed DNA samples and includes new features to improve computational efficiency. Using simulated data, we demonstrate the ability of our pipeline to recover segregating variants, estimate their allele frequencies accurately, and identify genomic regions harbouring loci under selection. Based on the simulated data set, we benchmark the efficacy of our pipeline against another bioinformatic suite, angsd, and illustrate the compatibility and complementarity of these suites using angsd to generate genotype likelihoods as input for identifying linkage outlier regions using alignment files and variants provided by PoolParty2. Finally, we apply our updated pipeline to an empirical dataset of low-coverage whole genomic data from population samples of Columbia River steelhead trout (Oncorhynchus mykiss), results from which demonstrate the genomic impacts of decades of artificial selection in a prominent hatchery stock. Thus, we not only demonstrate the utility of PoolParty2 for genomic studies that combine sequencing data from multiple individuals, but also illustrate how it complements other bioinformatics resources such as angsd.
Affiliation(s)
- Stuart Willis
  - Hagerman Genetics Lab, Columbia River Inter-Tribal Fish Commission, Hagerman, Idaho, USA
- Steven Micheletti
  - Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
- Kimberly R Andrews
  - Institute for Interdisciplinary Data Sciences, University of Idaho, Moscow, Idaho, USA
- Shawn Narum
  - Hagerman Genetics Lab, Columbia River Inter-Tribal Fish Commission, Hagerman, Idaho, USA
3
Matthews AE, Boves TJ, Percy KL, Schelsky WM, Wijeratne AJ. Population Genomics of Pooled Samples: Unveiling Symbiont Infrapopulation Diversity and Host-Symbiont Coevolution. Life (Basel) 2023; 13:2054. [PMID: 37895435 PMCID: PMC10608719 DOI: 10.3390/life13102054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 09/30/2023] [Accepted: 10/10/2023] [Indexed: 10/29/2023] [Open Access]
Abstract
Microscopic symbionts represent crucial links in biological communities. However, they present technical challenges in high-throughput sequencing (HTS) studies due to their small size and minimal high-quality DNA yields, hindering our understanding of host-symbiont coevolution at microevolutionary and macroevolutionary scales. One approach to overcome those barriers is to pool multiple individuals from the same infrapopulation (i.e., individual host) and sequence them together (Pool-Seq), but individual-level information is then compromised. To simultaneously address both issues (i.e., minimal DNA yields and loss of individual-level information), we implemented a strategic Pool-Seq approach to assess variation in sequencing performance and categorize genetic diversity (single nucleotide polymorphisms (SNPs)) at both the individual-level and infrapopulation-level for microscopic feather mites. To do so, we collected feathers harboring mites (Proctophyllodidae: Amerodectes protonotaria) from four individual Prothonotary Warblers (Parulidae: Protonotaria citrea). From each of the four hosts (i.e., four mite infrapopulations), we conducted whole-genome sequencing on three extraction pools consisting of different numbers of mites (1 mite, 5 mites, and 20 mites). We found that samples containing pools of multiple mites had more sequencing reads map to the feather mite reference genome than did the samples containing only a single mite. Mite infrapopulations were primarily genetically structured by their associated individual hosts (not pool size) and the majority of SNPs were shared by all pools within an infrapopulation. Together, these results suggest that the patterns observed are driven by evolutionary processes occurring at the infrapopulation level and are not technical signals due to pool size. 
In total, despite the challenges presented by microscopic symbionts in HTS studies, this work highlights the value of both individual-level and infrapopulation-level sequencing toward our understanding of host-symbiont coevolution at multiple evolutionary scales.
Affiliation(s)
- Alix E. Matthews
  - College of Sciences and Mathematics and Molecular Biosciences Program, Arkansas State University, Jonesboro, AR 72401, USA
  - Department of Biological Sciences, Arkansas State University, Jonesboro, AR 72401, USA
- Than J. Boves
  - Department of Biological Sciences, Arkansas State University, Jonesboro, AR 72401, USA
- Katie L. Percy
  - Audubon Delta, National Audubon Society, Baton Rouge, LA 70808, USA
  - United States Department of Agriculture, Natural Resources Conservation Service, Addis, LA 70710, USA
- Wendy M. Schelsky
  - Department of Evolution, Ecology, and Behavior, School of Integrative Biology, University of Illinois, Urbana-Champaign, Champaign, IL 61801, USA
  - Prairie Research Institute, Illinois Natural History Survey, University of Illinois, Urbana-Champaign, Champaign, IL 61820, USA
- Asela J. Wijeratne
  - Department of Biological Sciences, Arkansas State University, Jonesboro, AR 72401, USA
4
Schneider M, Shrestha A, Ballvora A, Léon J. High-throughput estimation of allele frequencies using combined pooled-population sequencing and haplotype-based data processing. Plant Methods 2022; 18:34. [PMID: 35313910 PMCID: PMC8935755 DOI: 10.1186/s13007-022-00852-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/29/2021] [Accepted: 02/07/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND In addition to heterogeneity and artificial selection, natural selection is one of the forces used to combat climate change and improve agrobiodiversity in evolutionary plant breeding. Accurate identification of the specific genomic effects of natural selection will likely accelerate transfer between populations. Thus, insights into changes in allele frequency, adequate population size, gene flow and drift are essential. However, observing such effects often involves a trade-off between costs and resolution when a large sample of genotypes for many loci is analysed. Pool genotyping approaches achieve high resolution and precision in estimating allele frequency when sequence coverage is high. Nevertheless, high-coverage pool sequencing of large genomes is expensive. RESULTS Three pool samples (n = 300, 300, 288) from a barley backcross population were generated to assess the population's allele frequency. The tested population (BC2F21) has undergone 18 generations of natural adaption to conventional farming practice. The accuracies of estimated pool-based allele frequencies and genome coverage yields were compared using three next-generation sequencing genotyping methods. To achieve accurate allele frequency estimates with low sequence coverage, we employed a haplotyping approach. Low coverage allele frequencies of closely located single polymorphisms were aggregated into a single haplotype allele frequency, yielding 2-to-271-times higher depth and increased precision. When we combined different haplotyping tactics, we found that gene and chip marker-based haplotype analyses performed equivalently or better compared with simple contig haplotype windows. Comparing multiple pool samples and referencing against an individual sequencing approach revealed that whole-genome pool re-sequencing (WGS) achieved the highest correlation with individual genotyping (≥ 0.97). 
In contrast, transcriptome-based genotyping (MACE) and genotyping by sequencing (GBS) pool replicates were significantly associated with higher error rates and lower correlations, but are still valuable to detect large allele frequency variations. CONCLUSIONS The proposed strategy identified the allele frequency of populations with high accuracy at low cost. This is particularly relevant to evolutionary plant breeding of crops with very large genomes, such as barley. Whole-genome low coverage re-sequencing at 0.03 × coverage per genotype accurately estimated the allele frequency when a loci-based haplotyping approach was applied. The implementation of annotated haplotypes capitalises on the biological background and statistical robustness.
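The aggregation step described in this abstract (combining low-coverage allele frequencies of closely located polymorphisms into a single haplotype allele frequency) can be illustrated with a minimal sketch. It is not the authors' pipeline: the `SnpCounts` structure, the `haplotype_frequency` function, and the example counts are hypothetical, and the sketch assumes that the alternative allele at every SNP in a window has been polarised to the same parental haplotype, so that read counts can simply be summed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class SnpCounts:
    """Read counts for one SNP in a pooled sample (hypothetical structure)."""
    position: int
    alt_reads: int    # reads carrying the allele polarised to one parent
    total_reads: int  # all reads covering the site


def haplotype_frequency(snps: Iterable[SnpCounts]) -> Tuple[float, int]:
    """Pool read counts over the SNPs of one window.

    Returns (aggregated allele frequency, aggregated depth).
    Assumes every alt allele in the window tags the same haplotype.
    """
    snps = list(snps)
    alt = sum(s.alt_reads for s in snps)
    depth = sum(s.total_reads for s in snps)
    if depth == 0:
        raise ValueError("no reads cover this window")
    return alt / depth, depth


# Four neighbouring SNPs, each covered by only one to three reads.
window = [
    SnpCounts(1_000, 1, 2),
    SnpCounts(1_040, 0, 1),
    SnpCounts(1_120, 2, 3),
    SnpCounts(1_200, 1, 2),
]
freq, depth = haplotype_frequency(window)
print(f"haplotype frequency = {freq:.2f} at aggregated depth {depth}")
```

Summing counts weights each SNP by its coverage; a single site with one or two reads gives a very coarse frequency estimate, while the window-level estimate draws on all reads in the window. How windows are defined (genes, chip markers, or contig windows in the abstract) is left outside this sketch.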
Affiliation(s)
- Michael Schneider
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
  - Institute for Quantitative Genetics and Genomics of Plants, University Duesseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
- Asis Shrestha
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
  - Institute for Quantitative Genetics and Genomics of Plants, University Duesseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
- Agim Ballvora
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
- Jens Léon
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
5
Kapun M, Nunez JCB, Bogaerts-Márquez M, Murga-Moreno J, Paris M, Outten J, Coronado-Zamora M, Tern C, Rota-Stabelli O, Guerreiro MPG, Casillas S, Orengo DJ, Puerma E, Kankare M, Ometto L, Loeschcke V, Onder BS, Abbott JK, Schaeffer SW, Rajpurohit S, Behrman EL, Schou MF, Merritt TJS, Lazzaro BP, Glaser-Schmitt A, Argyridou E, Staubach F, Wang Y, Tauber E, Serga SV, Fabian DK, Dyer KA, Wheat CW, Parsch J, Grath S, Veselinovic MS, Stamenkovic-Radak M, Jelic M, Buendía-Ruíz AJ, Gómez-Julián MJ, Espinosa-Jimenez ML, Gallardo-Jiménez FD, Patenkovic A, Eric K, Tanaskovic M, Ullastres A, Guio L, Merenciano M, Guirao-Rico S, Horváth V, Obbard DJ, Pasyukova E, Alatortsev VE, Vieira CP, Vieira J, Torres JR, Kozeretska I, Maistrenko OM, Montchamp-Moreau C, Mukha DV, Machado HE, Lamb K, Paulo T, Yusuf L, Barbadilla A, Petrov D, Schmidt P, Gonzalez J, Flatt T, Bergland AO. Drosophila Evolution over Space and Time (DEST): A New Population Genomics Resource. Mol Biol Evol 2021; 38:5782-5805. [PMID: 34469576 PMCID: PMC8662648 DOI: 10.1093/molbev/msab259] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 12/20/2022] [Open Access]
Abstract
Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome data sets from natural populations of this species have been published over the last years. A major challenge is the integration of disparate data sets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in >20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This data set, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental metadata. A web-based genome browser and web portal provide easy access to the SNP data set. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan data set. Our resource will enable population geneticists to analyze spatiotemporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail.
Affiliation(s)
- Martin Kapun
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Switzerland
  - Department of Cell & Developmental Biology, Center of Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Joaquin C B Nunez
  - Department of Biology, University of Virginia, Charlottesville, VA, USA
- Jesús Murga-Moreno
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona, Spain
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Margot Paris
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
- Joseph Outten
  - Department of Biology, University of Virginia, Charlottesville, VA, USA
- Courtney Tern
  - Department of Biology, University of Virginia, Charlottesville, VA, USA
- Omar Rota-Stabelli
  - Center Agriculture Food Environment, University of Trento, San Michele all'Adige, Italy
- Sònia Casillas
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona, Spain
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Dorcas J Orengo
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Eva Puerma
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Maaria Kankare
  - Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Lino Ometto
  - Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
- Banu S Onder
  - Department of Biology, Hacettepe University, Ankara, Turkey
- Stephen W Schaeffer
  - Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Subhash Rajpurohit
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  - Division of Biological and Life Sciences, School of Arts and Sciences, Ahmedabad University, Ahmedabad, India
- Emily L Behrman
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  - Janelia Research Campus, Ashburn, VA, USA
- Mads F Schou
  - Department of Biology, Aarhus University, Aarhus, Denmark
  - Department of Biology, Lund University, Lund, Sweden
- Thomas J S Merritt
  - Department of Chemistry & Biochemistry, Laurentian University, Sudbury, ON, Canada
- Brian P Lazzaro
  - Department of Entomology, Cornell University, Ithaca, NY, USA
- Amanda Glaser-Schmitt
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Eliza Argyridou
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Fabian Staubach
  - Department of Evolution and Ecology, University of Freiburg, Freiburg, Germany
- Yun Wang
  - Department of Evolution and Ecology, University of Freiburg, Freiburg, Germany
- Eran Tauber
  - Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Haifa, Israel
- Svitlana V Serga
  - Department of General and Medical Genetics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
  - State Institution National Antarctic Scientific Center, Ministry of Education and Science of Ukraine, Kyiv, Ukraine
- Daniel K Fabian
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Kelly A Dyer
  - Department of Genetics, University of Georgia, Athens, GA, USA
- John Parsch
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Sonja Grath
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Mihailo Jelic
  - Faculty of Biology, University of Belgrade, Belgrade, Serbia
- Aleksandra Patenkovic
  - Institute for Biological Research “Siniša Stanković”, National Institute of Republic of Serbia, University of Belgrade, Belgrade, Serbia
- Katarina Eric
  - Institute for Biological Research “Siniša Stanković”, National Institute of Republic of Serbia, University of Belgrade, Belgrade, Serbia
- Marija Tanaskovic
  - Institute for Biological Research “Siniša Stanković”, National Institute of Republic of Serbia, University of Belgrade, Belgrade, Serbia
- Anna Ullastres
  - Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Lain Guio
  - Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Miriam Merenciano
  - Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Sara Guirao-Rico
  - Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Vivien Horváth
  - Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Darren J Obbard
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Elena Pasyukova
  - Institute of Molecular Genetics of the National Research Centre “Kurchatov Institute”, Moscow, Russia
- Vladimir E Alatortsev
  - Institute of Molecular Genetics of the National Research Centre “Kurchatov Institute”, Moscow, Russia
- Cristina P Vieira
  - Instituto de Biologia Molecular e Celular (IBMC), Porto, Portugal
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
- Jorge Vieira
  - Instituto de Biologia Molecular e Celular (IBMC), Porto, Portugal
  - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
- Iryna Kozeretska
  - Department of General and Medical Genetics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
  - State Institution National Antarctic Scientific Center, Ministry of Education and Science of Ukraine, Kyiv, Ukraine
- Oleksandr M Maistrenko
  - Department of General and Medical Genetics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Dmitry V Mukha
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Heather E Machado
  - Department of Biology, Stanford University, Stanford, CA, USA
  - Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Keric Lamb
  - Department of Biology, University of Virginia, Charlottesville, VA, USA
- Tânia Paulo
  - Departamento de Biologia Animal, Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Leeban Yusuf
  - Center for Biological Diversity, University of St. Andrews, St Andrews, United Kingdom
- Antonio Barbadilla
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona, Spain
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona, Spain
- Dmitri Petrov
  - Department of Biology, Stanford University, Stanford, CA, USA
- Paul Schmidt
  - Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Josefa Gonzalez
  - Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Thomas Flatt
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
- Alan O Bergland
  - Department of Biology, University of Virginia, Charlottesville, VA, USA
6
Bertram J. Allele frequency divergence reveals ubiquitous influence of positive selection in Drosophila. PLoS Genet 2021; 17:e1009833. [PMID: 34591854 PMCID: PMC8509871 DOI: 10.1371/journal.pgen.1009833] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/13/2021] [Revised: 10/12/2021] [Accepted: 09/22/2021] [Indexed: 12/04/2022] [Open Access]
Abstract
Resolving the role of natural selection is a basic objective of evolutionary biology. It is generally difficult to detect the influence of selection because ubiquitous non-selective stochastic change in allele frequencies (genetic drift) degrades evidence of selection. As a result, selection scans typically only identify genomic regions that have undergone episodes of intense selection. Yet it seems likely such episodes are the exception; the norm is more likely to involve subtle, concurrent selective changes at a large number of loci. We develop a new theoretical approach that uncovers a previously undocumented genome-wide signature of selection in the collective divergence of allele frequencies over time. Applying our approach to temporally resolved allele frequency measurements from laboratory and wild Drosophila populations, we quantify the selective contribution to allele frequency divergence and find that selection has substantial effects on much of the genome. We further quantify the magnitude of the total selection coefficient (a measure of the combined effects of direct and linked selection) at a typical polymorphic locus, and find this to be large (of order 1%) even though most mutations are not directly under selection. We find that selective allele frequency divergence is substantially elevated at intermediate allele frequencies, which we argue is most parsimoniously explained by positive, not negative, selection. Thus, in these populations most mutations are far from evolving neutrally in the short term (tens of generations), including mutations with neutral fitness effects, and the result cannot be explained simply as an ongoing purging of deleterious mutations.
Affiliation(s)
- Jason Bertram
  - Environmental Resilience Institute, Indiana University, Bloomington, Indiana, United States of America
  - Department of Biology, Indiana University, Bloomington, Indiana, United States of America
7
Errbii M, Keilwagen J, Hoff KJ, Steffen R, Altmüller J, Oettler J, Schrader L. Transposable elements and introgression introduce genetic variation in the invasive ant Cardiocondyla obscurior. Mol Ecol 2021; 30:6211-6228. [PMID: 34324751 DOI: 10.1111/mec.16099] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/04/2021] [Revised: 07/22/2021] [Accepted: 07/26/2021] [Indexed: 12/11/2022]
Abstract
Introduced populations of invasive organisms have to cope with novel environmental challenges, while having reduced genetic variation caused by founder effects. The mechanisms associated with this "genetic paradox of invasive species" have received considerable attention, yet few studies have examined the genomic architecture of invasive species. Populations of the heart node ant Cardiocondyla obscurior belong to two distinct lineages, a New World lineage so far only found in Latin America and a more globally distributed Old World lineage. In the present study, we use population genomic approaches to compare populations of the two lineages with apparent divergent invasive potential. We find that the strong genetic differentiation of the two lineages began at least 40,000 generations ago and that activity of transposable elements (TEs) has contributed significantly to the divergence of both lineages, possibly linked to the very unusual genomic distribution of TEs in this species. Furthermore, we show that introgression from the Old World lineage is a dominant source of genetic diversity in the New World lineage, despite the lineages' strong genetic differentiation. Our study uncovers mechanisms underlying novel genetic variation in introduced populations of C. obscurior that could contribute to the species' adaptive potential.
Affiliation(s)
- Mohammed Errbii
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Jens Keilwagen
  - Institute for Biosafety in Plant Biotechnology, Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Quedlinburg, Germany
- Katharina J Hoff
  - Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
  - Center for Functional Genomics of Microbes, University of Greifswald, Greifswald, Germany
- Raphael Steffen
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Janine Altmüller
  - Cologne Center for Genomics, Institute of Human Genetics, University of Cologne, Cologne, Germany
  - Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Genomics, Berlin, Germany
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Jan Oettler
  - Lehrstuhl für Zoologie/Evolutionsbiologie, University Regensburg, Regensburg, Germany
- Lukas Schrader
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
8
Lou RN, Jacobs A, Wilder A, Therkildsen NO. A beginner's guide to low-coverage whole genome sequencing for population genomics. Mol Ecol 2021; 30:5966-5993. [PMID: 34250668 DOI: 10.1111/mec.16077] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Received: 12/01/2020] [Revised: 06/30/2021] [Accepted: 07/01/2021] [Indexed: 11/26/2022]
Abstract
Low-coverage whole genome sequencing (lcWGS) has emerged as a powerful and cost-effective approach for population genomic studies in both model and non-model species. However, with read depths too low to confidently call individual genotypes, lcWGS requires specialized analysis tools that explicitly account for genotype uncertainty. A growing number of such tools have become available, but it can be difficult to get an overview of what types of analyses can be performed reliably with lcWGS data, and how the distribution of sequencing effort between the number of samples analyzed and per-sample sequencing depths affects inference accuracy. In this introductory guide to lcWGS, we first illustrate how the per-sample cost for lcWGS is now comparable to RAD-seq and Pool-seq in many systems. We then provide an overview of software packages that explicitly account for genotype uncertainty in different types of population genomic inference. Next, we use both simulated and empirical data to assess the accuracy of allele frequency and genetic diversity estimation, detection of population structure, and selection scans under different sequencing strategies. Our results show that spreading a given amount of sequencing effort across more samples with lower depth per sample consistently improves the accuracy of most types of inference, with a few notable exceptions. Finally, we assess the potential for using imputation to bolster inference from lcWGS data in non-model species, and discuss current limitations and future perspectives for lcWGS-based population genomics research. With this overview, we hope to make lcWGS more approachable and stimulate its broader adoption.
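The trade-off between number of samples and per-sample depth mentioned in this abstract can be illustrated for allele frequency estimation with a small worked calculation. It is a simplified textbook-style model, not the simulation framework of the paper: it assumes diploid individuals in Hardy-Weinberg proportions, exactly `depth` error-free reads per individual at the site, and an estimator equal to the pooled fraction of reads carrying the allele. Under those assumptions the sampling variance is p(1-p)/(2n) × (1 + 1/d) for n individuals at depth d. The function name and the example budget are hypothetical.

```python
def allele_freq_variance(p: float, n_individuals: int, depth: int) -> float:
    """Sampling variance of a read-based allele frequency estimate.

    Assumed model: diploids in Hardy-Weinberg proportions, a fixed number
    of error-free reads per individual, estimator = pooled alt-read fraction.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n_individuals < 1 or depth < 1:
        raise ValueError("n_individuals and depth must be positive")
    return p * (1.0 - p) / (2 * n_individuals) * (1.0 + 1.0 / depth)


# Same total sequencing effort (individuals x depth) split four ways.
BUDGET = 200
designs = [(20, 10), (50, 4), (100, 2), (200, 1)]
variances = [allele_freq_variance(0.5, n, d) for n, d in designs]

for (n, d), v in zip(designs, variances):
    print(f"n={n:3d} depth={d:2d}x variance={v:.6f}")
```

With the total effort held constant, the variance in this model falls as reads are spread over more individuals, because sampling of individuals rather than sampling of reads dominates the error. Real data add library costs, uneven depth, and sequencing error, which this sketch ignores.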
Affiliation(s)
- Runyang Nicolas Lou
  - Department of Natural Resources and the Environment, Cornell University, Ithaca, NY, 14853, USA
- Arne Jacobs
  - Department of Natural Resources and the Environment, Cornell University, Ithaca, NY, 14853, USA
  - Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, G12 8QQ, UK
- Aryn Wilder
  - San Diego Zoo Wildlife Alliance, Escondido, CA, 92027, USA
- Nina O Therkildsen
  - Department of Natural Resources and the Environment, Cornell University, Ithaca, NY, 14853, USA
|
9
|
Paril JF, Balding DJ, Fournier-Level A. Optimizing sampling design and sequencing strategy for the genomic analysis of quantitative traits in natural populations. Mol Ecol Resour 2021; 22:137-152. [PMID: 34192415 DOI: 10.1111/1755-0998.13458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2020] [Revised: 05/02/2021] [Accepted: 06/25/2021] [Indexed: 11/27/2022]
Abstract
Mapping the genes underlying ecologically relevant traits in natural populations is fundamental to developing a molecular understanding of species adaptation. Current sequencing technologies enable the characterization of a species' genetic diversity across the landscape or even over its whole range. Capturing this genetic diversity appropriately is critical for successful genetic mapping of traits, yet there are no clear guidelines on how to achieve optimal sampling or which sequencing strategy to implement. Here we determine, through simulation, the sampling scheme that maximizes the power to map the genetic basis of a complex trait in an outbreeding species across an idealized landscape and to draw genomic predictions for the trait, comparing individual and pool sequencing strategies. Our results show that quantitative trait locus detection power and prediction accuracy are higher when more populations over the landscape are sampled, and that this is done more cost-effectively with pool sequencing than with individual sequencing. Additionally, we recommend sampling populations from areas of high genetic diversity. As progress in sequencing enables the integration of trait-based functional ecology into landscape genomics studies, these findings will guide study designs allowing direct measures of genetic effects in natural populations across the environment.
Affiliation(s)
- Jefferson F Paril
  - School of Biosciences, The University of Melbourne, Parkville, Victoria, Australia
- David J Balding
  - School of Biosciences, The University of Melbourne, Parkville, Victoria, Australia
  - Melbourne Integrative Genomics, The University of Melbourne, Parkville, Victoria, Australia
  - School of Mathematics and Statistics, The University of Melbourne, Parkville, Victoria, Australia
- Alexandre Fournier-Level
  - School of Biosciences, The University of Melbourne, Parkville, Victoria, Australia
  - Melbourne Integrative Genomics, The University of Melbourne, Parkville, Victoria, Australia
|
10
|
Machado HE, Bergland AO, Taylor R, Tilk S, Behrman E, Dyer K, Fabian DK, Flatt T, González J, Karasov TL, Kim B, Kozeretska I, Lazzaro BP, Merritt TJS, Pool JE, O'Brien K, Rajpurohit S, Roy PR, Schaeffer SW, Serga S, Schmidt P, Petrov DA. Broad geographic sampling reveals the shared basis and environmental correlates of seasonal adaptation in Drosophila. eLife 2021; 10:e67577. [PMID: 34155971 PMCID: PMC8248982 DOI: 10.7554/elife.67577] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 02/16/2021] [Accepted: 06/21/2021] [Indexed: 11/16/2022] Open
Abstract
To advance our understanding of adaptation to temporally varying selection pressures, we identified signatures of seasonal adaptation occurring in parallel among Drosophila melanogaster populations. Specifically, we estimated allele frequencies genome-wide from flies sampled early and late in the growing season from 20 widely dispersed populations. We identified parallel seasonal allele frequency shifts across North America and Europe, demonstrating that seasonal adaptation is a general phenomenon of temperate fly populations. Seasonally fluctuating polymorphisms are enriched in large chromosomal inversions, and we find a broad concordance between seasonal and spatial allele frequency change. The direction of allele frequency change at seasonally variable polymorphisms can be predicted by weather conditions in the weeks prior to sampling, linking the environment and the genomic response to selection. Our results suggest that fluctuating selection is an important evolutionary force affecting patterns of genetic variation in Drosophila.
Affiliation(s)
- Heather E Machado
  - Department of Biology, Stanford University, Stanford, United States
  - Wellcome Sanger Institute, Hinxton, United Kingdom
- Alan O Bergland
  - Department of Biology, Stanford University, Stanford, United States
  - Department of Biology, University of Virginia, Charlottesville, United States
- Ryan Taylor
  - Department of Biology, Stanford University, Stanford, United States
- Susanne Tilk
  - Department of Biology, Stanford University, Stanford, United States
- Emily Behrman
  - Department of Biology, University of Pennsylvania, Philadelphia, United States
- Kelly Dyer
  - Department of Genetics, University of Georgia, Athens, United States
- Daniel K Fabian
  - Institute of Population Genetics, Vetmeduni Vienna, Vienna, Austria
  - Centre for Pathogen Evolution, Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- Thomas Flatt
  - Institute of Population Genetics, Vetmeduni Vienna, Vienna, Austria
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
- Josefa González
  - Institute of Evolutionary Biology, CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Talia L Karasov
  - Department of Biology, University of Utah, Salt Lake City, United States
- Bernard Kim
  - Department of Biology, Stanford University, Stanford, United States
- Iryna Kozeretska
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
  - National Antarctic Scientific Centre of Ukraine, Taras Shevchenko Blvd., Kyiv, Ukraine
- Brian P Lazzaro
  - Department of Entomology, Cornell University, Ithaca, United States
- Thomas JS Merritt
  - Department of Chemistry & Biochemistry, Laurentian University, Sudbury, Canada
- John E Pool
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, United States
- Katherine O'Brien
  - Department of Biology, University of Pennsylvania, Philadelphia, United States
- Subhash Rajpurohit
  - Department of Biology, University of Pennsylvania, Philadelphia, United States
- Paula R Roy
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, United States
- Stephen W Schaeffer
  - Department of Biology, The Pennsylvania State University, University Park, United States
- Svitlana Serga
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
  - National Antarctic Scientific Centre of Ukraine, Taras Shevchenko Blvd., Kyiv, Ukraine
- Paul Schmidt
  - Department of Biology, University of Pennsylvania, Philadelphia, United States
- Dmitri A Petrov
  - Department of Biology, Stanford University, Stanford, United States
|
11
|
Guirao‐Rico S, González J. Benchmarking the performance of Pool-seq SNP callers using simulated and real sequencing data. Mol Ecol Resour 2021; 21:1216-1229. [PMID: 33534960 PMCID: PMC8251607 DOI: 10.1111/1755-0998.13343] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/24/2020] [Revised: 12/21/2020] [Accepted: 01/27/2021] [Indexed: 12/13/2022]
Abstract
Population genomics is a fast-developing discipline with promising applications in a growing number of life sciences fields. Advances in sequencing technologies and bioinformatics tools allow population genomics to exploit genome-wide information to identify the molecular variants underlying traits of interest and the evolutionary forces that modulate these variants through space and time. However, the cost of genomic analyses of multiple populations is still too high to address them through individual genome sequencing. Pooling individuals for sequencing can be a more effective strategy for Single Nucleotide Polymorphism (SNP) detection and allele frequency estimation because of a higher total coverage. However, compared to individual sequencing, SNP calling from pools has the additional difficulty of distinguishing rare variants from sequencing errors, which is often avoided by establishing a minimum threshold allele frequency for the analysis. Finding an optimal balance between minimizing information loss and reducing sequencing costs is essential to ensure the success of population genomics studies. Here, we have benchmarked the performance of SNP callers for Pool-seq data, based on different approaches, under different conditions, and using computer simulations and real data. We found that SNP caller performance varied for allele frequencies up to 0.35. We also found that SNP callers based on Bayesian (SNAPE-pooled) or maximum likelihood (MAPGD) approaches outperform the two heuristic callers tested (VarScan and PoolSNP) in terms of the balance between sensitivity and FDR, in both simulated and real sequencing data. Our results will help inform the selection of the most appropriate SNP caller not only for large-scale population studies but also in cases where the Pool-seq strategy is the only option, such as in metagenomic or polyploid studies.
Affiliation(s)
- Sara Guirao‐Rico
  - Institute of Evolutionary Biology, CSIC‐Universitat Pompeu Fabra, Barcelona, Spain
- Josefa González
  - Institute of Evolutionary Biology, CSIC‐Universitat Pompeu Fabra, Barcelona, Spain
|
12
|
Abstract
Drosophila melanogaster, a small dipteran of African origin, represents one of the best-studied model organisms. Early work in this system has uniquely shed light on the basic principles of genetics and resulted in a versatile collection of genetic tools that make it possible to uncover mechanistic links between genotype and phenotype. Moreover, given its worldwide distribution in diverse habitats and its moderate genome size, Drosophila has proven very powerful for population genetic inference and was one of the first eukaryotes whose genome was fully sequenced. In this book chapter, we provide a brief historical overview of research in Drosophila and then focus on recent advances during the genomic era. After describing different types and sources of genomic data, we discuss mechanisms of neutral evolution, including the demographic history of Drosophila and the effects of recombination and biased gene conversion. Then, we review recent advances in detecting genome-wide signals of selection, such as soft and hard selective sweeps. We further provide a brief introduction to background selection, selection on noncoding DNA and codon usage, and focus on the role of structural variants, such as transposable elements and chromosomal inversions, during the adaptive process. Finally, we discuss how genomic data help to dissect the neutral and adaptive evolutionary mechanisms that shape genetic and phenotypic variation in natural populations along environmental gradients. In summary, this book chapter serves as a starting point for Drosophila population genomics and provides an introduction to the system and an overview of data sources, important population genetic concepts and recent advances in the field.
|
13
|
Sui J, Luan S, Dai P, Fu Q, Meng X, Luo K, Cao B, Kong J. High accuracy of pooled DNA genotyping by 2b-RAD sequencing in the Pacific white shrimp, Litopenaeus vannamei. PLoS One 2020; 15:e0236343. [PMID: 32730349 PMCID: PMC7392308 DOI: 10.1371/journal.pone.0236343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/07/2020] [Accepted: 07/04/2020] [Indexed: 11/18/2022] Open
Abstract
Using pooled DNA genotyping to estimate the proportional contributions from multiple families in a pooled sample is of particular interest for selective breeding in aquaculture. We compared different pooled libraries with separate 2b-RAD sequencing of Litopenaeus vannamei individuals to assess the effect of different population structures (different numbers of individuals and families) on pooled DNA sequencing, the accuracy of parent sequencing of the DNA pools and the effect of SNP numbers on pooled DNA sequencing. We demonstrated that small pooled DNA genotyping of up to 53 individuals by 2b-RAD sequencing could provide a highly accurate assessment of population allele frequencies. The accuracy increased as the number of individuals and families increased. The allele frequencies of the parents from each pool were highly correlated with those of the pools or the corresponding individuals in the pool. We chose 500-28,000 SNPs to test the effect of SNP number on the accuracy of pooled sequencing, and no linear relationship was found between them. When the SNP number was fixed, increasing the number of individuals in the mixed pool resulted in higher accuracy of each pooled genotyping. Our data confirmed that pooled DNA genotyping by 2b-RAD sequencing could achieve higher accuracy than that of individual-based genotyping. The results will provide important information for shrimp breeding programs.
Affiliation(s)
- Juan Sui
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Sheng Luan
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Ping Dai
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Qiang Fu
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Xianhong Meng
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Kun Luo
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Baoxiang Cao
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Jie Kong
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resources, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
|
14
|
Inbar S, Cohen P, Yahav T, Privman E. Comparative study of population genomic approaches for mapping colony-level traits. PLoS Comput Biol 2020; 16:e1007653. [PMID: 32218566 PMCID: PMC7141688 DOI: 10.1371/journal.pcbi.1007653] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/03/2019] [Revised: 04/08/2020] [Accepted: 01/13/2020] [Indexed: 12/05/2022] Open
Abstract
Social insect colonies exhibit colony-level phenotypes such as social immunity and task coordination, which are produced by the individual phenotypes. Mapping the genetic basis of such phenotypes requires associating the colony-level phenotype with the genotypes in the colony. In this paper, we examine alternative approaches to DNA extraction, library construction, and sequencing for genome wide association studies (GWAS) of colony-level traits using a population sample of Cataglyphis niger ants. We evaluate the accuracy of allele frequency estimation from sequencing a pool of individuals (pool-seq) from each colony using either whole-genome sequencing or reduced representation genomic sequencing. Based on empirical measurement of the experimental noise in sequenced DNA pools, we show that reduced representation pool-seq is drastically less accurate than whole-genome pool-seq. Surprisingly, normalized pooling of samples did not result in greater accuracy than un-normalized pooling. Subsequently, we evaluate the power of the alternative approaches for detecting quantitative trait loci (QTL) of colony-level traits by using simulations that account for an environmental effect on the phenotype. Our results can inform experimental designs and enable optimizing the power of GWAS depending on budget, availability of samples and research goals. We conclude that for a given budget, sequencing un-normalized pools of individuals from each colony provides optimal QTL detection power.
Genomic mapping techniques are used to map phenotypes to genotypes. Mapping is of general interest in any biological system, including fundamental studies of biological traits, clinical studies of genetic predisposition to disease, and agro- and bio-technological studies of domesticated plants and animals. Typically, such studies associate phenotypic measurements of individuals with their genotypes. Here we evaluate methodological approaches for genomic mapping of phenotypes that are expressed at the level of a group rather than that of individuals. We demonstrate that genomic sequencing of a DNA pool from multiple samples provides increased statistical power within a limited budget. Our results facilitate more efficient use of resources in genomic mapping studies that investigate group-level phenotypes.
Affiliation(s)
- Shani Inbar
  - Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Haifa, Israel
- Pnina Cohen
  - Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Haifa, Israel
- Tal Yahav
  - Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Haifa, Israel
- Eyal Privman
  - Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Haifa, Israel
|
15
|
Accurate Allele Frequencies from Ultra-low Coverage Pool-Seq Samples in Evolve-and-Resequence Experiments. G3 (Bethesda) 2019; 9:4159-4168. [PMID: 31636085 PMCID: PMC6893198 DOI: 10.1534/g3.119.400755] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 02/07/2023]
Abstract
Evolve-and-resequence (E+R) experiments leverage next-generation sequencing technology to track the allele frequency dynamics of populations as they evolve. While previous work has shown that adaptive alleles can be detected by comparing frequency trajectories from many replicate populations, this power comes at the expense of high-coverage (>100x) sequencing of many pooled samples, which can be cost-prohibitive. Here, we show that accurate estimates of allele frequencies can be achieved with very shallow sequencing depths (<5x) via inference of known founder haplotypes in small genomic windows. This technique can be used to efficiently estimate frequencies for any number of bi-allelic SNPs in populations of any model organism founded with sequenced homozygous strains. Using both experimentally-pooled and simulated samples of Drosophila melanogaster, we show that haplotype inference can improve allele frequency accuracy by orders of magnitude for up to 50 generations of recombination, and is robust to moderate levels of missing data, as well as different selection regimes. Finally, we show that a simple linear model generated from these simulations can predict the accuracy of haplotype-derived allele frequencies in other model organisms and experimental designs. To make these results broadly accessible for use in E+R experiments, we introduce HAF-pipe, an open-source software tool for calculating haplotype-derived allele frequencies from raw sequencing data. Ultimately, by reducing sequencing costs without sacrificing accuracy, our method facilitates E+R designs with higher replication and resolution, and thereby, increased power to detect adaptive alleles.
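The core idea in this abstract, deriving SNP allele frequencies from inferred founder haplotype frequencies within a window, can be illustrated with a minimal sketch. This is not HAF-pipe's implementation: the function name `haplotype_derived_af`, the 0/1 allele coding, and the example numbers are hypothetical, and the haplotype frequencies are assumed to have been inferred already by some upstream step that is not shown.

```python
def haplotype_derived_af(founder_alleles, hap_freqs):
    """Return the alternate-allele frequency at each SNP in one genomic window.

    founder_alleles: one row per founder strain; each row lists that founder's
        allele at every SNP in the window (0 = reference, 1 = alternate).
        Founders are assumed to be homozygous, as in the abstract.
    hap_freqs: estimated frequency of each founder haplotype in the pooled
        sample for this window (assumed to be given and to sum to 1).
    """
    if len(founder_alleles) != len(hap_freqs):
        raise ValueError("need exactly one haplotype frequency per founder")
    n_snps = len(founder_alleles[0])
    if any(len(row) != n_snps for row in founder_alleles):
        raise ValueError("every founder needs an allele at every SNP")
    # The frequency of the alternate allele at a SNP is the summed frequency
    # of all founder haplotypes that carry it.
    return [
        sum(freq * row[j] for freq, row in zip(hap_freqs, founder_alleles))
        for j in range(n_snps)
    ]


# Hypothetical window: four founders, three bi-allelic SNPs.
example_founders = [
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
example_hap_freqs = [0.5, 0.25, 0.125, 0.125]
print(haplotype_derived_af(example_founders, example_hap_freqs))
# prints [0.375, 0.75, 0.5]
```

Because one haplotype frequency estimate pools the reads from every SNP in the window, each per-SNP frequency can be more precise than a direct read count at that single site, which is the intuition behind the accuracy gains reported at low coverage.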
|
16
|
Kurland S, Wheat CW, de la Paz Celorio Mancera M, Kutschera VE, Hill J, Andersson A, Rubin C, Andersson L, Ryman N, Laikre L. Exploring a Pool-seq-only approach for gaining population genomic insights in nonmodel species. Ecol Evol 2019; 9:11448-11463. [PMID: 31641485 PMCID: PMC6802065 DOI: 10.1002/ece3.5646] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/20/2019] [Revised: 08/19/2019] [Accepted: 08/20/2019] [Indexed: 12/12/2022] Open
Abstract
Developing genomic insights is challenging in nonmodel species for which resources are often scarce and prohibitively costly. Here, we explore the potential of a recently established approach using Pool-seq data to generate a de novo genome assembly for mining exons, upon which Pool-seq data are used to estimate population divergence and diversity. We do this for two pairs of sympatric populations of brown trout (Salmo trutta): one naturally sympatric set of populations and another pair of populations introduced to a common environment. We validate our approach by comparing the results to those from markers previously used to describe the populations (allozymes and individual-based single nucleotide polymorphisms [SNPs]) and from mapping the Pool-seq data to a reference genome of the closely related Atlantic salmon (Salmo salar). We find that genomic differentiation (F_ST) between the two introduced populations exceeds that of the naturally sympatric populations (F_ST = 0.13 and 0.03 between the introduced and the naturally sympatric populations, respectively), in concordance with estimates from the previously used SNPs. The same level of population divergence is found for the two genome assemblies, but estimates of average nucleotide diversity differ (π̄ ≈ 0.002 and π̄ ≈ 0.001 when mapping to S. trutta and S. salar, respectively), although the relationships between population values are largely consistent. This discrepancy might be attributed to biases when mapping to a haploid condensed assembly made of highly fragmented read data compared to using a high-quality reference assembly from a divergent species. We conclude that the Pool-seq-only approach can be suitable for detecting and quantifying genome-wide population differentiation, and for comparing genomic diversity in populations of nonmodel species where reference genomes are lacking.
Affiliation(s)
- Sara Kurland
  - Division of Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden
- Christopher W. Wheat
  - Division of Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden
- Verena E. Kutschera
  - Science for Life Laboratory and Department for Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Jason Hill
  - Division of Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden
- Anastasia Andersson
  - Division of Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden
- Carl‐Johan Rubin
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Leif Andersson
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
  - Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Nils Ryman
  - Division of Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden
- Linda Laikre
  - Division of Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden
|
17
|
Dorant Y, Benestan L, Rougemont Q, Normandeau E, Boyle B, Rochette R, Bernatchez L. Comparing Pool-seq, Rapture, and GBS genotyping for inferring weak population structure: The American lobster (Homarus americanus) as a case study. Ecol Evol 2019; 9:6606-6623. [PMID: 31236247 PMCID: PMC6580275 DOI: 10.1002/ece3.5240] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 02/11/2019] [Revised: 04/10/2019] [Accepted: 04/13/2019] [Indexed: 01/02/2023] Open
Abstract
Unraveling genetic population structure is challenging in species potentially characterized by large population size and high dispersal rates, which often result in weak genetic differentiation. Genotyping a large number of samples can improve the detection of subtle genetic structure, but this may substantially increase sequencing cost and downstream bioinformatics computational time. To overcome this challenge, alternative, cost-effective sequencing approaches, namely Pool-seq and Rapture, have been developed. We empirically measured the power of resolution and congruence of these two methods in documenting weak population structure in nonmodel species with high gene flow, compared with a conventional genotyping-by-sequencing (GBS) approach. For this, we used the American lobster (Homarus americanus) as a case study. First, we found that GBS, Rapture, and Pool-seq approaches gave similar allele frequency estimates (i.e., correlation coefficient over 0.90) and all three revealed the same weak pattern of population structure. Yet, Pool-seq data showed F_ST estimates three to five times higher than GBS and Rapture, while the latter two methods returned similar F_ST estimates, indicating that individual-based approaches provided more congruent results than Pool-seq. We conclude that despite higher costs, GBS and Rapture are more convenient approaches to use in the case of species exhibiting very weak differentiation. While both GBS and Rapture approaches provided similar results with regard to estimates of population genetic parameters, GBS remains more cost-effective in projects involving a relatively small number of genotyped individuals (e.g., <1,000). Overall, this study illustrates the complexity of estimating genetic differentiation and other summary statistics in complex biological systems characterized by large population size and migration rates.
Affiliation(s)
- Yann Dorant
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Laura Benestan
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
  - Pêches et Océans Canada, Institut Maurice‐Lamontagne, Mont‐Joli, Canada
- Quentin Rougemont
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Eric Normandeau
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Brian Boyle
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
  - Plateforme d'analyses génomiques, Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Rémy Rochette
  - Department of Biology, University of New Brunswick, Saint John, Canada
- Louis Bernatchez
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
|
18
|
Ayala D, Zhang S, Chateau M, Fouet C, Morlais I, Costantini C, Hahn MW, Besansky NJ. Association mapping desiccation resistance within chromosomal inversions in the African malaria vector Anopheles gambiae. Mol Ecol 2018; 28:1333-1342. [PMID: 30252170 DOI: 10.1111/mec.14880] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 07/02/2018] [Revised: 09/08/2018] [Accepted: 09/10/2018] [Indexed: 12/30/2022]
Abstract
Inversion polymorphisms are responsible for many ecologically important phenotypes and are often found under balancing selection. However, the same features that ensure their large role in local adaptation (especially reduced recombination between alternate arrangements) mean that uncovering the precise loci within inversions that control these phenotypes is unachievable using standard mapping approaches. Here, we take advantage of long-term balancing selection on a pair of inversions in the mosquito Anopheles gambiae to map desiccation tolerance via pool-GWAS. Two polymorphic inversions on chromosome 2 of this species (denoted 2La and 2Rb) are associated with arid and hot conditions in Africa and are maintained in spatially and temporally heterogeneous environments. After measuring thousands of wild-caught individuals for survival under desiccation stress, we used phenotypically extreme individuals homozygous for alternative arrangements at the 2La inversion to construct pools for whole-genome sequencing. Genome-wide association mapping using these pools revealed dozens of significant SNPs within both 2La and 2Rb, many of which neighboured genes controlling ion channels or related functions. Our results point to the promise of similar approaches in systems with inversions maintained by balancing selection and provide a list of candidate genes underlying the specific phenotypes controlled by the two inversions studied here.
Collapse
Affiliation(s)
- Diego Ayala
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana
| | - Simo Zhang
- Department of Computer Science, Indiana University, Bloomington, Indiana
| | - Mathieu Chateau
- Institut de Recherche pour le Développement, MIVEGEC (IRD, CNRS, Univ. Montpellier), Montpellier, France
| | - Caroline Fouet
- Institut de Recherche pour le Développement, MIVEGEC (IRD, CNRS, Univ. Montpellier), Montpellier, France
- Organisation de Coordination pour la lutte contre les Endémies en Afrique Centrale (OCAEC), Yaoundé, Cameroon
| | - Isabelle Morlais
- Institut de Recherche pour le Développement, MIVEGEC (IRD, CNRS, Univ. Montpellier), Montpellier, France
- Organisation de Coordination pour la lutte contre les Endémies en Afrique Centrale (OCAEC), Yaoundé, Cameroon
| | - Carlo Costantini
- Institut de Recherche pour le Développement, MIVEGEC (IRD, CNRS, Univ. Montpellier), Montpellier, France
- Organisation de Coordination pour la lutte contre les Endémies en Afrique Centrale (OCAEC), Yaoundé, Cameroon
| | - Matthew W Hahn
- Department of Computer Science, Indiana University, Bloomington, Indiana
- Department of Biology, Indiana University, Bloomington, Indiana
| | - Nora J Besansky
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana
|
19
|
A new approach based on targeted pooled DNA sequencing identifies novel mutations in patients with Inherited Retinal Dystrophies. Sci Rep 2018; 8:15457. [PMID: 30337596 PMCID: PMC6194132 DOI: 10.1038/s41598-018-33810-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/19/2018] [Accepted: 10/04/2018] [Indexed: 01/28/2023] Open
Abstract
Inherited retinal diseases (IRD) are a heterogeneous group of diseases that mainly affect the retina; more than 250 genes have been linked to the disease and more than 20 different clinical phenotypes have been described. This heterogeneity at both the clinical and genetic levels complicates the identification of causative mutations. Therefore, a detailed genetic characterization is important for genetic counselling and decisions regarding treatment. In this study, we developed a method consisting of pooled targeted next generation sequencing (NGS) that we applied to 316 eye disease related genes, followed by High Resolution Melting and copy number variation analysis. DNA from 115 unrelated test samples was pooled and samples with known mutations were used as positive controls to assess the sensitivity of our approach. Causal mutations for IRDs were found in 36 patients, achieving a detection rate of 31.3%. Overall, 49 likely causative mutations were identified in characterized patients, 14 of which were first described in this study (28.6%). Our study shows that this new approach is a cost-effective tool for detection of causative mutations in patients with inherited retinopathies.
|
20
|
Parker DJ, Wiberg RAW, Trivedi U, Tyukmaeva VI, Gharbi K, Butlin RK, Hoikkala A, Kankare M, Ritchie MG. Inter and Intraspecific Genomic Divergence in Drosophila montana Shows Evidence for Cold Adaptation. Genome Biol Evol 2018; 10:2086-2101. [PMID: 30010752 PMCID: PMC6107330 DOI: 10.1093/gbe/evy147] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Accepted: 07/10/2018] [Indexed: 12/25/2022] Open
Abstract
The genomes of species that are ecological specialists will likely contain signatures of genomic adaptation to their niche. However, distinguishing genes related to ecological specialism from other sources of selection and more random changes is a challenge. Here, we describe the genome of Drosophila montana, which is the most extremely cold-adapted Drosophila species known. We use branch tests to identify genes showing accelerated divergence in contrasts between cold- and warm-adapted species and identify about 250 genes that show differences, possibly driven by a lower synonymous substitution rate in cold-adapted species. We also look for evidence of accelerated divergence between D. montana and D. virilis, a previously sequenced relative, but do not find strong evidence for divergent selection on coding sequence variation. Divergent genes are involved in a variety of functions, including cuticular and olfactory processes. Finally, we also resequenced three populations of D. montana from across its ecological and geographic range. Outlier loci were more likely to be found on the X chromosome and there was a greater than expected overlap between population outliers and those genes implicated in cold adaptation between Drosophila species, implying some continuity of selective process at these different evolutionary scales.
Affiliation(s)
- Darren J Parker
- Department of Biological and Environmental Science, University of Jyväskylä, Finland
- Center for Biological Diversity, School of Biology, University of St. Andrews, Fife, United Kingdom
- Department of Ecology and Evolution, University of Lausanne, Biophore, Switzerland
| | - R Axel W Wiberg
- Center for Biological Diversity, School of Biology, University of St. Andrews, Fife, United Kingdom
| | - Urmi Trivedi
- Edinburgh Genomics, School of Biological Sciences, University of Edinburgh, United Kingdom
| | - Venera I Tyukmaeva
- Department of Biological and Environmental Science, University of Jyväskylä, Finland
| | - Karim Gharbi
- Edinburgh Genomics, School of Biological Sciences, University of Edinburgh, United Kingdom
- Earlham Institute, Norwich Research Park, Norwich, United Kingdom
| | - Roger K Butlin
- Department of Animal and Plant Sciences, The University of Sheffield, UK
- Department of Marine Sciences, University of Gothenburg, Göteborg, Sweden
| | - Anneli Hoikkala
- Department of Biological and Environmental Science, University of Jyväskylä, Finland
| | - Maaria Kankare
- Department of Biological and Environmental Science, University of Jyväskylä, Finland
| | - Michael G Ritchie
- Center for Biological Diversity, School of Biology, University of St. Andrews, Fife, United Kingdom
|
21
|
Nouhaud P, Gautier M, Gouin A, Jaquiéry J, Peccoud J, Legeai F, Mieuzet L, Smadja CM, Lemaitre C, Vitalis R, Simon JC. Identifying genomic hotspots of differentiation and candidate genes involved in the adaptive divergence of pea aphid host races. Mol Ecol 2018; 27:3287-3300. [PMID: 30010213 DOI: 10.1111/mec.14799] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 02/12/2017] [Revised: 06/01/2018] [Accepted: 06/11/2018] [Indexed: 01/01/2023]
Abstract
Identifying the genomic bases of adaptation to novel environments is a long-term objective in evolutionary biology. Because genetic differentiation is expected to increase between locally adapted populations at the genes targeted by selection, scanning the genome for elevated levels of differentiation is a first step towards deciphering the genomic architecture underlying adaptive divergence. The pea aphid Acyrthosiphon pisum is a model of choice to address this question, as it forms a large complex of plant-specialized races and cryptic species, resulting from recent adaptive radiation. Here, we characterized genomewide polymorphisms in three pea aphid races specialized on alfalfa, clover and pea crops, respectively, which we sequenced in pools (poolseq). Using a model-based approach that explicitly accounts for selection, we identified 392 genomic hotspots of differentiation spanning 47.3 Mb and 2,484 genes (respectively, 9.12% of the genome size and 8.10% of its genes). Most of these highly differentiated regions were located on the autosomes, and overall differentiation was weaker on the X chromosome. Within these hotspots, high levels of absolute divergence between races suggest that these regions experienced less gene flow than the rest of the genome, most likely by contributing to reproductive isolation. Moreover, population-specific analyses showed evidence of selection in every host race, depending on the hotspot considered. These hotspots were significantly enriched for candidate gene categories that control host-plant selection and use. These genes encode 48 salivary proteins, 14 gustatory receptors, 10 odorant receptors, five P450 cytochromes and one chemosensory protein, which represent promising candidates for the genetic basis of host-plant specialization and ecological isolation in the pea aphid complex. Altogether, our findings open new research directions towards functional studies, for validating the role of these genes on adaptive phenotypes.
Affiliation(s)
| | - Mathieu Gautier
- CBGP, Univ Montpellier, CIRAD, INRA, IRD, Montpellier SupAgro, Montpellier, France
- Institut de Biologie Computationnelle, Univ Montpellier, Montpellier, France
| | - Anaïs Gouin
- INRA, UMR 1349 IGEPP, Le Rheu, France
- Inria/IRISA GenScale, Rennes, France
| | | | - Jean Peccoud
- Laboratoire Ecologie et Biologie des Interactions, UMR CNRS 7267, Université de Poitiers, Poitiers, France
| | - Fabrice Legeai
- INRA, UMR 1349 IGEPP, Le Rheu, France
- Inria/IRISA GenScale, Rennes, France
| | | | - Carole M Smadja
- Institut des Sciences de l'Evolution (UMR 5554) - CNRS - IRD - EPHE - CIRAD -Université de Montpellier, Montpellier, France
| | | | - Renaud Vitalis
- CBGP, Univ Montpellier, CIRAD, INRA, IRD, Montpellier SupAgro, Montpellier, France
- Institut de Biologie Computationnelle, Univ Montpellier, Montpellier, France
|
22
|
Puncher GN, Cariani A, Maes GE, Van Houdt J, Herten K, Cannas R, Rodriguez-Ezpeleta N, Albaina A, Estonba A, Lutcavage M, Hanke A, Rooker J, Franks JS, Quattro JM, Basilone G, Fraile I, Laconcha U, Goñi N, Kimoto A, Macías D, Alemany F, Deguara S, Zgozi SW, Garibaldi F, Oray IK, Karakulak FS, Abid N, Santos MN, Addis P, Arrizabalaga H, Tinti F. Spatial dynamics and mixing of bluefin tuna in the Atlantic Ocean and Mediterranean Sea revealed using next-generation sequencing. Mol Ecol Resour 2018; 18:620-638. [PMID: 29405659 DOI: 10.1111/1755-0998.12764] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/10/2017] [Revised: 01/06/2018] [Accepted: 01/19/2018] [Indexed: 01/05/2023]
Abstract
The Atlantic bluefin tuna is a highly migratory species emblematic of the challenges associated with shared fisheries management. In an effort to resolve the species' stock dynamics, a genomewide search for spatially informative single nucleotide polymorphisms (SNPs) was undertaken, by way of sequencing reduced representation libraries. An allele frequency approach to SNP discovery was used, combining the data of 555 larvae and young-of-the-year (LYOY) into pools representing major geographical areas and mapping against a newly assembled genomic reference. From a set of 184,895 candidate loci, 384 were selected for validation using 167 LYOY. A highly discriminatory genotyping panel of 95 SNPs was ultimately developed by selecting loci with the most pronounced differences between western Atlantic and Mediterranean Sea LYOY. The panel was evaluated by genotyping a different set of LYOY (n = 326), and from these, 77.8% and 82.1% were correctly assigned to western Atlantic and Mediterranean Sea origins, respectively. The panel revealed temporally persistent differentiation among LYOY from the western Atlantic and Mediterranean Sea (FST = 0.008, p = .034). The composition of six mixed feeding aggregations in the Atlantic Ocean and Mediterranean Sea was characterized using genotypes from medium (n = 184) and large (n = 48) adults, applying population assignment and mixture analyses. The results provide evidence of persistent population structuring across broad geographic areas and extensive mixing in the Atlantic Ocean, particularly in the mid-Atlantic Bight and Gulf of St. Lawrence. The genomic reference and genotyping tools presented here constitute novel resources useful for future research and conservation efforts.
Affiliation(s)
- Gregory N Puncher
- Department of Biological, Geological and Environmental Sciences/Laboratory of Genetics and Genomics of Marine Resources and Environment (GenoDREAM), University of Bologna, Ravenna, Italy
- Department of Biology, Marine Biology Research Group, Ghent University, Ghent, Belgium
- Department of Biology, University of New Brunswick, Saint John, NB, Canada
| | - Alessia Cariani
- Department of Biological, Geological and Environmental Sciences/Laboratory of Genetics and Genomics of Marine Resources and Environment (GenoDREAM), University of Bologna, Ravenna, Italy
| | - Gregory E Maes
- Centre for Sustainable Tropical Fisheries and Aquaculture, Comparative Genomics Centre, College of Science and Engineering, James Cook University, Townsville, Qld, Australia
- Centre for Human Genetics, Genomics Core, KU Leuven - UZ Leuven, Leuven, Belgium
- Laboratory of Biodiversity and Evolutionary Genomics, University of Leuven (KU Leuven), Leuven, Belgium
| | - Jeroen Van Houdt
- Centre for Human Genetics, Genomics Core, KU Leuven - UZ Leuven, Leuven, Belgium
| | - Koen Herten
- Centre for Human Genetics, Genomics Core, KU Leuven - UZ Leuven, Leuven, Belgium
- Laboratory of Biodiversity and Evolutionary Genomics, University of Leuven (KU Leuven), Leuven, Belgium
| | - Rita Cannas
- Department of Life & Environmental Sciences (DISVA), University of Cagliari, Cagliari, Italy
| | | | - Aitor Albaina
- Laboratory of Genetics Faculty of Science & Technology, Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country (UPV/EHU), Leioa, Spain
- Environmental Studies Centre (CEA), Vitoria-Gasteiz City Council, Vitoria-Gasteiz, Spain
| | - Andone Estonba
- Laboratory of Genetics Faculty of Science & Technology, Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country (UPV/EHU), Leioa, Spain
| | - Molly Lutcavage
- School for the Environment and Large Pelagics Research Center, University of Massachusetts, Boston, Gloucester, MA, USA
| | - Alex Hanke
- Fisheries and Oceans Canada, St. Andrews Biological Station, St. Andrews, NB, Canada
| | - Jay Rooker
- Department of Marine Biology, Texas A&M University at Galveston, Galveston, TX, USA
- Department of Wildlife and Fisheries Sciences, Texas A&M University, College Station, TX, USA
| | - James S Franks
- Gulf Coast Research Laboratory, Center for Fisheries Research and Development, University of Southern Mississippi, Ocean Springs, MS, USA
| | - Joseph M Quattro
- Department of Biological Sciences, University of South Carolina, Columbia, SC, USA
| | - Gualtiero Basilone
- National Research Council, Institute for Marine and Coastal Environment, Detached Unit of Capo Granitola, Trapani, Italy
| | - Igaratza Fraile
- Marine Research Division, AZTI Tecnalia, Pasaia, Gipuzkoa, Spain
| | - Urtzi Laconcha
- Marine Research Division, AZTI Tecnalia, Pasaia, Gipuzkoa, Spain
- Laboratory of Genetics Faculty of Science & Technology, Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country (UPV/EHU), Leioa, Spain
| | - Nicolas Goñi
- Marine Research Division, AZTI Tecnalia, Pasaia, Gipuzkoa, Spain
| | - Ai Kimoto
- National Research Institute of Far Seas Fisheries, Shizuoka, Japan
| | - David Macías
- Instituto Español de Oceanografía, Centro Oceanográfico de Baleares, Palma, Spain
| | - Francisco Alemany
- Instituto Español de Oceanografía, Centro Oceanográfico de Baleares, Palma, Spain
| | - Simeon Deguara
- Federation of Maltese Aquaculture Producers (FMAP), Valletta, Malta
| | - Salem W Zgozi
- Marine Biology Research Center, Tripoli-Tajura, Libya
| | - Fulvio Garibaldi
- Department of Earth, Environmental and Life Sciences, University of Genoa, Genova, Italy
| | - Isik K Oray
- Faculty of Fisheries, Istanbul University, Laleli-Istanbul, Turkey
| | | | - Noureddine Abid
- National Institute of Fisheries Research, Regional Centre of Tangier, Tanger, Morocco
| | | | - Piero Addis
- Department of Life & Environmental Sciences (DISVA), University of Cagliari, Cagliari, Italy
| | | | - Fausto Tinti
- Department of Biological, Geological and Environmental Sciences/Laboratory of Genetics and Genomics of Marine Resources and Environment (GenoDREAM), University of Bologna, Ravenna, Italy
|
23
|
Selection Mapping Identifies Loci Underpinning Autumn Dormancy in Alfalfa (Medicago sativa). G3-GENES GENOMES GENETICS 2018; 8:461-468. [PMID: 29255116 PMCID: PMC5919736 DOI: 10.1534/g3.117.300099] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/31/2023]
Abstract
Autumn dormancy in alfalfa (Medicago sativa) is associated with agronomically important traits including regrowth rate, maturity, and winter survival. Historical recurrent selection experiments have been able to manipulate the dormancy response. We hypothesized that artificial selection for dormancy phenotypes in these experiments had altered allele frequencies of dormancy-related genes. Here, we follow this hypothesis and analyze allele frequency changes using genome-wide polymorphisms in the pre- and postselection populations from one historical selection experiment. We screened the nondormant cultivar CUF 101 and populations developed by three cycles of recurrent phenotypic selection for taller and shorter plants in autumn with markers derived from genotyping-by-sequencing (GBS). We validated the robustness of our GBS-derived allele frequency estimates using an empirical approach. Our results suggest that selection mapping is a powerful means of identifying genomic regions associated with traits, and that it can be exploited to provide regions on which to focus further mapping and cloning projects.
|
24
|
Neethiraj R, Hornett EA, Hill JA, Wheat CW. Investigating the genomic basis of discrete phenotypes using a Pool-Seq-only approach: New insights into the genetics underlying colour variation in diverse taxa. Mol Ecol 2017; 26:4990-5002. [PMID: 28614599 DOI: 10.1111/mec.14205] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/26/2015] [Revised: 05/09/2017] [Accepted: 05/15/2017] [Indexed: 12/11/2022]
Abstract
While large-scale genomic approaches are increasingly revealing the genetic basis of polymorphic phenotypes such as colour morphs, such approaches are almost exclusively conducted in species with high-quality genomes and annotations. Here, we use Pool-Seq data for both genome assembly and SNP frequency estimation, followed by scanning for FST outliers to identify divergent genomic regions. Using paired-end, short-read sequencing data from two groups of individuals expressing divergent phenotypes, we generate a de novo rough-draft genome, identify SNPs and calculate genomewide FST differences between phenotypic groups. As genomes generated by Pool-Seq data are highly fragmented, we also present an approach for super-scaffolding contigs using existing protein-coding data sets. Using this approach, we reanalysed genomic data from two recent studies of birds and butterflies investigating colour pattern variation and replicated their core findings, demonstrating the accuracy and power of a Pool-Seq-only approach. Additionally, we discovered new regions of high divergence and new annotations that together suggest novel parallels between birds and butterflies in the origins of their colour pattern variation.
Affiliation(s)
| | - Emily A Hornett
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Department of Zoology, University of Cambridge, Cambridge, UK
| | - Jason A Hill
- Department of Zoology, Stockholm University, Stockholm, Sweden
|
25
|
Eoche-Bosy D, Gautier M, Esquibet M, Legeai F, Bretaudeau A, Bouchez O, Fournet S, Grenier E, Montarry J. Genome scans on experimentally evolved populations reveal candidate regions for adaptation to plant resistance in the potato cyst nematode Globodera pallida. Mol Ecol 2017; 26:4700-4711. [PMID: 28734070 DOI: 10.1111/mec.14240] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 04/04/2017] [Revised: 07/13/2017] [Accepted: 07/17/2017] [Indexed: 12/30/2022]
Abstract
Improving resistance durability requires being able to predict the adaptation speed of pathogen populations. Identifying the genetic bases of pathogen adaptation to plant resistances is a useful step to better understand and anticipate this phenomenon. Globodera pallida is a major pest of potato crops for which a resistance QTL, GpaVvrn, has been identified in Solanum vernei. However, its durability is threatened as G. pallida populations are able to adapt to the resistance in a few generations. The aim of this study was to investigate the genomic regions involved in the resistance breakdown by coupling experimental evolution and high-density genome scan. We performed a whole-genome resequencing of pools of individuals (Pool-Seq) belonging to G. pallida lineages derived from two independent populations having experimentally evolved on susceptible and resistant potato cultivars. About 1.6 million SNPs were used to perform the genome scan using a recent model testing for adaptive differentiation and association to population-specific covariables. We identified 275 outliers, and 31 of them, which also showed a significant reduction in diversity in adapted lineages, were investigated for their genic environment. Some candidate genomic regions contained genes putatively encoding effectors and were enriched in SPRYSECs, known in cyst nematodes to be involved in pathogenicity and in (a)virulence. Validated candidate SNPs will provide a useful molecular tool to follow frequencies of virulence alleles in natural G. pallida populations and to define efficient strategies for using potato resistances that maximize their durability.
Affiliation(s)
- D Eoche-Bosy
- IGEPP, INRA, Agrocampus Ouest, Université de Rennes 1, Le Rheu, France
| | - M Gautier
- CBGP, INRA, IRD, CIRAD, Montpellier SupAgro, Montferrier-sur-Lez, France
- IBC, Montpellier, France
| | - M Esquibet
- IGEPP, INRA, Agrocampus Ouest, Université de Rennes 1, Le Rheu, France
| | - F Legeai
- IGEPP, BIPAA, INRA, Agrocampus Ouest, Université de Rennes 1, Rennes, France
- IRISA, GenScale, INRIA, Rennes, France
| | - A Bretaudeau
- IGEPP, BIPAA, INRA, Agrocampus Ouest, Université de Rennes 1, Rennes, France
- IRISA, GenOuest Core Facility, INRIA, Rennes, France
| | - O Bouchez
- GeT-PlaGe, Genotoul, INRA, Castanet-Tolosan, France
- GenPhySE, Université de Toulouse, INRA, INPT, ENVT, Castanet-Tolosan, France
| | - S Fournet
- IGEPP, INRA, Agrocampus Ouest, Université de Rennes 1, Le Rheu, France
| | - E Grenier
- IGEPP, INRA, Agrocampus Ouest, Université de Rennes 1, Le Rheu, France
| | - J Montarry
- IGEPP, INRA, Agrocampus Ouest, Université de Rennes 1, Le Rheu, France
|
26
|
Faria VG, Sucena É. From Nature to the Lab: Establishing Drosophila Resources for Evolutionary Genetics. Front Ecol Evol 2017. [DOI: 10.3389/fevo.2017.00061] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
|
27
|
Weldatsadik RG, Wang J, Puhakainen K, Jiao H, Jalava J, Räisänen K, Datta N, Skoog T, Vuopio J, Jokiranta TS, Kere J. Sequence analysis of pooled bacterial samples enables identification of strain variation in group A streptococcus. Sci Rep 2017; 7:45771. [PMID: 28361960 PMCID: PMC5374712 DOI: 10.1038/srep45771] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/15/2016] [Accepted: 03/02/2017] [Indexed: 12/30/2022] Open
Abstract
Knowledge of the genomic variation among different strains of a pathogenic microbial species can help in selecting optimal candidates for diagnostic assays and vaccine development. Pooled sequencing (Pool-seq) is a cost-effective approach for population-level genetic studies that require large numbers of samples, such as various strains of a microbe. To test the use of Pool-seq in identifying variation, we pooled DNA of 100 Streptococcus pyogenes strains of different emm types in two pools, each containing 50 strains. We used four variant calling tools (Freebayes, UnifiedGenotyper, SNVer, and SAMtools) and one emm1 strain, SF370, as a reference genome. In total, 63,719 SNPs and 164 INDELs were identified in the two pools concordantly by at least two of the tools. The majority of the variants (93.4%) from six individually sequenced strains used in the pools could be identified from the two pools, and 72.3% and 97.4% of the variants in the pools could be mined from the analysis of the 44 complete S. pyogenes genomes and 3,407 sequence runs deposited in the European Nucleotide Archive, respectively. We conclude that DNA sequencing of pooled samples of large numbers of bacterial strains is a robust, rapid and cost-efficient way to discover sequence variation.
Affiliation(s)
- Rigbe G Weldatsadik
- Research Programs Unit, Immunobiology, University of Helsinki, and Helsinki University Central Hospital, Helsinki, Finland
| | - Jingwen Wang
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
| | - Kai Puhakainen
- Bacterial Infections Unit, National Institute for Health and Welfare, Turku, Finland
- Department of Medical Microbiology and Immunology, University of Turku, Turku, Finland
| | - Hong Jiao
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
| | - Jari Jalava
- Bacterial Infections Unit, National Institute for Health and Welfare, Turku, Finland
| | - Kati Räisänen
- Bacterial Infections Unit, National Institute for Health and Welfare, Turku, Finland
| | - Neeta Datta
- Research Programs Unit, Immunobiology, University of Helsinki, and Helsinki University Central Hospital, Helsinki, Finland
| | - Tiina Skoog
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
| | - Jaana Vuopio
- Bacterial Infections Unit, National Institute for Health and Welfare, Turku, Finland
- Department of Medical Microbiology and Immunology, University of Turku, Turku, Finland
| | - T Sakari Jokiranta
- Research Programs Unit, Immunobiology, University of Helsinki, and Helsinki University Central Hospital, Helsinki, Finland
| | - Juha Kere
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Molecular Neurology Research Program, University of Helsinki, and Folkhälsan Institute of Genetics, Biomedicum Helsinki, Helsinki, Finland
- Department of Genetics and Molecular Medicine, King's College London, London, UK
|
28
|
Fustier MA, Brandenburg JT, Boitard S, Lapeyronnie J, Eguiarte LE, Vigouroux Y, Manicacci D, Tenaillon MI. Signatures of local adaptation in lowland and highland teosintes from whole-genome sequencing of pooled samples. Mol Ecol 2017; 26:2738-2756. [DOI: 10.1111/mec.14082] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 04/06/2016] [Accepted: 02/21/2017] [Indexed: 01/01/2023]
Affiliation(s)
- M.-A. Fustier
- Génétique Quantitative et Evolution - Le Moulon; INRA, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; Ferme du Moulon F-91190 Gif-sur-Yvette France
| | - J.-T. Brandenburg
- Génétique Quantitative et Evolution - Le Moulon; INRA, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; Ferme du Moulon F-91190 Gif-sur-Yvette France
| | - S. Boitard
- GenPhySe; Université de Toulouse, INRA, INPT, INP-ENVT; 24 chemin de Borde-Rouge - Auzeville Tolosane; F-31326 Castanet Tolosan France
| | - J. Lapeyronnie
- GenPhySe; Université de Toulouse, INRA, INPT, INP-ENVT; 24 chemin de Borde-Rouge - Auzeville Tolosane; F-31326 Castanet Tolosan France
| | - L. E. Eguiarte
- Departamento de Ecología Evolutiva; Instituto de Ecología; Universidad Nacional Autónoma de México; Apartado Postal 70-275 Coyoacán 04510 México D.F. Mexico
| | - Y. Vigouroux
- Institut de Recherche pour le développement (IRD); UMR Diversité, Adaptation et Développement des plantes (DIADE); Université de Montpellier; 911 avenue Agropolis, F-34394 Montpellier Cedex 5 France
| | - D. Manicacci
- Génétique Quantitative et Evolution - Le Moulon; INRA, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; Ferme du Moulon F-91190 Gif-sur-Yvette France
| | - M. I. Tenaillon
- Génétique Quantitative et Evolution - Le Moulon; INRA, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; Ferme du Moulon F-91190 Gif-sur-Yvette France
|
29
|
Corbett-Detig R, Nielsen R. A Hidden Markov Model Approach for Simultaneously Estimating Local Ancestry and Admixture Time Using Next Generation Sequence Data in Samples of Arbitrary Ploidy. PLoS Genet 2017; 13:e1006529. [PMID: 28045893 PMCID: PMC5242547 DOI: 10.1371/journal.pgen.1006529] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Received: 07/15/2016] [Revised: 01/18/2017] [Accepted: 12/08/2016] [Indexed: 12/19/2022] Open
Abstract
Admixture, the mixing of genomes from divergent populations, is increasingly appreciated as a central process in evolution. To characterize and quantify patterns of admixture across the genome, a number of methods have been developed for local ancestry inference. However, existing approaches have a number of shortcomings. First, all local ancestry inference methods require some prior assumption about the expected ancestry tract lengths. Second, existing methods generally require genotypes, which are not feasible to obtain for many next-generation sequencing projects. Third, many methods assume samples are diploid; however, a wide variety of sequencing applications will fail to meet this assumption. To address these issues, we introduce a novel hidden Markov model for estimating local ancestry that models the read pileup data rather than genotypes, is generalized to arbitrary ploidy, and can estimate the time since admixture during local ancestry inference. We demonstrate that our method can simultaneously estimate the time since admixture and local ancestry with good accuracy, and that it performs well on samples of high ploidy, i.e., 100 or more chromosomes. As this method is very general, we expect it will be useful for local ancestry inference in a wider variety of populations than what previously has been possible. We then applied our method to pooled sequencing data derived from populations of Drosophila melanogaster on an ancestry cline on the east coast of North America. We find that regions of local recombination rates are negatively correlated with the proportion of African ancestry, suggesting that selection against foreign ancestry is the least efficient in low recombination regions. Finally, we show that clinal outlier loci are enriched for genes associated with gene regulatory functions, consistent with a role of regulatory evolution in ecological adaptation of admixed D. melanogaster populations. Our results illustrate the potential of local ancestry inference for elucidating fundamental evolutionary processes.
Affiliation(s)
- Russell Corbett-Detig
- Genomics Institute and Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, United States of America
- Department of Integrative Biology, UC Berkeley, Berkeley, CA, United States of America
| | - Rasmus Nielsen
- Department of Integrative Biology, UC Berkeley, Berkeley, CA, United States of America
- The Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
| |
Collapse
|
30
|
Gómez‐Rodríguez C, Timmermans MJTN, Crampton‐Platt A, Vogler AP. Intraspecific genetic variation in complex assemblages from mitochondrial metagenomics: comparison with DNA barcodes. Methods Ecol Evol 2016. [DOI: 10.1111/2041-210x.12667] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Affiliation(s)
- Carola Gómez‐Rodríguez
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
  - Departamento de Zoología, Facultad de Biología, Universidad de Santiago de Compostela, c/Lope Gómez de Marzoa s/n, Santiago de Compostela 15782, Spain
- Martijn J. T. N. Timmermans
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
  - Department of Natural Sciences, Middlesex University, Hendon Campus, London NW4 4BT, UK
- Alex Crampton‐Platt
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
  - Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Alfried P. Vogler
  - Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
  - Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot SL5 7PY, UK
31
Hoban S, Kelley JL, Lotterhos KE, Antolin MF, Bradburd G, Lowry DB, Poss ML, Reed LK, Storfer A, Whitlock MC. Finding the Genomic Basis of Local Adaptation: Pitfalls, Practical Solutions, and Future Directions. Am Nat 2016; 188:379-97. [PMID: 27622873 PMCID: PMC5457800 DOI: 10.1086/688018] [Citation(s) in RCA: 458] [Impact Index Per Article: 50.9] [Indexed: 12/15/2022]
Abstract
Uncovering the genetic and evolutionary basis of local adaptation is a major focus of evolutionary biology. The recent development of cost-effective methods for obtaining high-quality genome-scale data makes it possible to identify some of the loci responsible for adaptive differences among populations. Two basic approaches for identifying putatively locally adaptive loci have been developed and are broadly used: one that identifies loci with unusually high genetic differentiation among populations (differentiation outlier methods) and one that searches for correlations between local population allele frequencies and local environments (genetic-environment association methods). Here, we review the promises and challenges of these genome scan methods, including correcting for the confounding influence of a species' demographic history, biases caused by missing aspects of the genome, matching scales of environmental data with population structure, and other statistical considerations. In each case, we make suggestions for best practices for maximizing the accuracy and efficiency of genome scans to detect the underlying genetic basis of local adaptation. With attention to their current limitations, genome scan methods can be an important tool in finding the genetic basis of adaptive evolutionary change.
Affiliation(s)
- Sean Hoban
  - Morton Arboretum, Lisle, Illinois 60532; and National Institute for Mathematical and Biological Synthesis (NIMBioS), Knoxville, Tennessee 37966
- Joanna L. Kelley
  - School of Biological Sciences, Washington State University, Pullman, Washington 99164
- Katie E. Lotterhos
  - Department of Marine and Environmental Sciences, Northeastern University Marine Science Center, Nahant, Massachusetts 01908
- Michael F. Antolin
  - Department of Biology, Colorado State University, Fort Collins, Colorado 80523
- Gideon Bradburd
  - Museum of Vertebrate Zoology and Department of Environmental Science, Policy, and Management, University of California, Berkeley, California 94720
- David B. Lowry
  - Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824
- Mary L. Poss
  - Department of Biology and Veterinary and Biomedical Sciences, Penn State University, University Park, Pennsylvania 16802
- Laura K. Reed
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, Alabama 35406
- Andrew Storfer
  - School of Biological Sciences, Washington State University, Pullman, Washington 99164
32
Estimating the Effective Population Size from Temporal Allele Frequency Changes in Experimental Evolution. Genetics 2016; 204:723-735. [PMID: 27542959 PMCID: PMC5068858 DOI: 10.1534/genetics.116.191197] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 05/04/2016] [Accepted: 07/30/2016] [Indexed: 01/22/2023]
Abstract
The effective population size (Ne) is a major factor determining allele frequency changes in natural and experimental populations. Temporal methods provide a powerful and simple approach to estimate short-term Ne. They use allele frequency shifts between temporal samples to calculate the standardized variance, which is directly related to Ne. Here we focus on experimental evolution studies that often rely on repeated sequencing of samples in pools (Pool-seq). Pool-seq is cost-effective and often outperforms individual-based sequencing in estimating allele frequencies, but it is associated with atypical sampling properties: in addition to sampling individuals, sequencing DNA in pools leads to a second round of sampling, which increases the variance of allele frequency estimates. We propose a new estimator of Ne, which relies on allele frequency changes in temporal data and corrects for the variance in both sampling steps. In simulations, we obtain accurate Ne estimates, as long as the drift variance is not too small compared to the sampling and sequencing variance. In addition to genome-wide Ne estimates, we extend our method using a recursive partitioning approach to estimate Ne locally along the chromosome. Since the type I error is controlled, our method permits the identification of genomic regions that differ significantly in their Ne estimates. We present an application to Pool-seq data from experimental evolution with Drosophila and provide recommendations for whole-genome data. The estimator is computationally efficient and available as an R package at https://github.com/ThomasTaus/Nest.
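The relationship between the standardized variance and Ne described in this abstract can be illustrated with a minimal Python sketch of the classic moment-based temporal method. This is not the estimator proposed in the paper or implemented in the Nest R package. The function name `temporal_ne`, its parameters, and the simple additive read-depth term are assumptions made for illustration only.

```python
def temporal_ne(p0, pt, generations, n0, nt, depth0=None, deptht=None):
    """Moment-based temporal estimate of Ne from two sets of SNP allele frequencies.

    p0, pt          allele frequencies at the same SNPs at the first and last time point
    generations     number of generations separating the two samples
    n0, nt          number of diploid individuals sampled at each time point
    depth0, deptht  optional mean read depth per SNP (assumed, simplified
                    first-order term for the second, Pool-seq sampling step)
    """
    fc_values = []
    for x, y in zip(p0, pt):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip loci fixed for the same allele in both samples
            fc_values.append((x - y) ** 2 / denom)
    if not fc_values:
        raise ValueError("no polymorphic loci to estimate from")

    # Standardized variance of the allele frequency change, averaged over loci.
    fc = sum(fc_values) / len(fc_values)

    # Expected contribution of sampling individuals at both time points.
    noise = 1 / (2 * n0) + 1 / (2 * nt)
    # Assumed extra contribution of sampling reads from each pool.
    if depth0 is not None and deptht is not None:
        noise += 1 / depth0 + 1 / deptht

    drift = fc - noise
    if drift <= 0:
        # Sampling noise explains all of the observed change: Ne is unbounded.
        return float("inf")
    return generations / (2 * drift)
```

For example, `temporal_ne([0.5], [0.6], generations=10, n0=50, nt=50)` gives an estimate of about 250. The sketch also shows why the abstract warns about small drift variance: once the noise term approaches the observed variance, the estimate becomes unstable or unbounded.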
33
Günther T, Lampei C, Barilar I, Schmid KJ. Genomic and phenotypic differentiation of Arabidopsis thaliana along altitudinal gradients in the North Italian Alps. Mol Ecol 2016; 25:3574-92. [PMID: 27220345 DOI: 10.1111/mec.13705] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 07/22/2015] [Revised: 04/19/2016] [Accepted: 05/02/2016] [Indexed: 12/25/2022]
Abstract
Altitudinal gradients in mountain regions are short-range clines of different environmental parameters such as temperature or radiation. We investigated genomic and phenotypic signatures of adaptation to such gradients in five Arabidopsis thaliana populations from the North Italian Alps that originated from 580 to 2350 m altitude by resequencing pools of 19-29 individuals from each population. The sample includes two pairs of low- and high-altitude populations from two different valleys. High-altitude populations showed a lower nucleotide diversity and negative Tajima's D values and were more closely related to each other than to low-altitude populations from the same valley. Despite their close geographic proximity, demographic analysis revealed that low- and high-altitude populations split between 260 000 and 15 000 years before present. Single nucleotide polymorphisms whose allele frequencies were highly differentiated between low- and high-altitude populations identified genomic regions of up to 50 kb length where patterns of genetic diversity are consistent with signatures of local selective sweeps. These regions harbour multiple genes involved in stress response. Variation among populations in two putative adaptive phenotypic traits, frost tolerance and response to light/UV stress was not correlated with altitude. Taken together, the spatial distribution of genetic diversity reflects a potentially adaptive differentiation between low- and high-altitude populations, whereas the phenotypic differentiation in the two traits investigated does not. It may resemble an interaction between adaptation to the local microhabitat and demographic history influenced by historical glaciation cycles, recent seed dispersal and genetic drift in local populations.
Affiliation(s)
- Torsten Günther
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
  - Department of Evolutionary Biology, EBC, Uppsala University, Uppsala, Sweden
- Christian Lampei
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
- Ivan Barilar
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
- Karl J Schmid
  - Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
34
Stam R, Scheikl D, Tellier A. Pooled Enrichment Sequencing Identifies Diversity and Evolutionary Pressures at NLR Resistance Genes within a Wild Tomato Population. Genome Biol Evol 2016; 8:1501-15. [PMID: 27189991 PMCID: PMC4898808 DOI: 10.1093/gbe/evw094] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Accepted: 04/16/2016] [Indexed: 12/13/2022]
Abstract
Nod-like receptors (NLRs) are nucleotide-binding domain and leucine-rich repeats containing proteins that are important in plant resistance signaling. Many of the known pathogen resistance (R) genes in plants are NLRs and they can recognize pathogen molecules directly or indirectly. As such, divergence and copy number variants at these genes are found to be high between species. Within populations, positive and balancing selection are to be expected if plants coevolve with their pathogens. In order to understand the complexity of R-gene coevolution in wild nonmodel species, it is necessary to identify the full range of NLRs and infer their evolutionary history. Here we investigate and reveal polymorphism occurring at 220 NLR genes within one population of the partially selfing wild tomato species Solanum pennellii. We use a combination of enrichment sequencing and pooling ten individuals, to specifically sequence NLR genes in a resource and cost-effective manner. We focus on the effects which different mapping and single nucleotide polymorphism calling software and settings have on calling polymorphisms in customized pooled samples. Our results are accurately verified using Sanger sequencing of polymorphic gene fragments. Our results indicate that some NLRs, namely 13 out of 220, have maintained polymorphism within our S. pennellii population. These genes show a wide range of πN/πS ratios and differing site frequency spectra. We compare our observed rate of heterozygosity with expectations for this selfing and bottlenecked population. We conclude that our method enables us to pinpoint NLR genes which have experienced natural selection in their habitat.
Affiliation(s)
- Remco Stam
  - Section of Population Genetics, Technische Universität München, Freising, Germany
- Daniela Scheikl
  - Section of Population Genetics, Technische Universität München, Freising, Germany
- Aurélien Tellier
  - Section of Population Genetics, Technische Universität München, Freising, Germany
35
Pfenninger M, Patel S, Arias-Rodriguez L, Feldmeyer B, Riesch R, Plath M. Unique evolutionary trajectories in repeated adaptation to hydrogen sulphide-toxic habitats of a neotropical fish (Poecilia mexicana). Mol Ecol 2016; 24:5446-59. [PMID: 26405850 DOI: 10.1111/mec.13397] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 08/18/2015] [Revised: 09/05/2015] [Accepted: 09/22/2015] [Indexed: 12/20/2022]
Abstract
Replicated ecological gradients are prime systems to study processes of molecular evolution underlying ecological divergence. Here, we investigated the repeated adaptation of the neotropical fish Poecilia mexicana to habitats containing toxic hydrogen sulphide (H2S) and compared two population pairs of sulphide-adapted and ancestral fish by sequencing population pools of >200 individuals (Pool-Seq). We inferred the evolutionary processes shaping divergence and tested the hypothesis of increase of parallelism from SNPs to molecular pathways. Coalescence analyses showed that the divergence occurred in the face of substantial bidirectional gene flow. Population divergence involved many short, widely dispersed regions across the genome. Analyses of allele frequency spectra suggest that differentiation at most loci was driven by divergent selection, followed by a selection-mediated reduction of gene flow. Reconstructing allelic state changes suggested that selection acted mainly upon de novo mutations in the sulphide-adapted populations. Using a corrected Jaccard index to quantify parallel evolution, we found a negligible proportion of statistically significant parallel evolution of Jcorr = 0.0032 at the level of SNPs, divergent genome regions (Jcorr = 0.0061) and genes therein (Jcorr = 0.0091). At the level of metabolic pathways, the overlap was Jcorr = 0.2545, indicating increasing parallelism with increasing level of biological integration. The majority of pathways contained positively selected genes in both sulphide populations. Hence, adaptation to sulphidic habitats necessitated adjustments throughout the genome. The largely unique evolutionary trajectories may be explained by a high proportion of de novo mutations driving the divergence. Our findings favour Gould's view that evolution is often the unrepeatable result of stochastic events with highly contingent effects.
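The overlap statistic mentioned in this abstract builds on the Jaccard index, which a short Python sketch can make concrete. The sketch computes only the plain, uncorrected index; the correction behind the reported Jcorr values is not described in the abstract and is not reproduced here. The function name `jaccard_index` and the example identifiers are hypothetical.

```python
def jaccard_index(set_a, set_b):
    """Plain Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        # Two empty sets share nothing; report zero overlap by convention here.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical outlier genes detected in two independent population pairs.
pair_1 = {"gene1", "gene2", "gene3"}
pair_2 = {"gene2", "gene3", "gene4"}
overlap = jaccard_index(pair_1, pair_2)  # 2 shared of 4 distinct genes
```

Values near 0 indicate that the two population pairs share almost none of their outliers, while values near 1 indicate that nearly the same set was detected in both.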
Affiliation(s)
- Markus Pfenninger
  - Molecular Ecology Group, Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberganlage 25, D-60325 Frankfurt am Main, Hessen, Germany
- Simit Patel
  - Molecular Ecology Group, Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberganlage 25, D-60325 Frankfurt am Main, Hessen, Germany
- Lenin Arias-Rodriguez
  - División Académica de Ciencias Biológicas, Universidad Juárez Autónoma de Tabasco (UJAT), Villahermosa, C.P. 86150 Tabasco, México
- Barbara Feldmeyer
  - Molecular Ecology Group, Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberganlage 25, D-60325 Frankfurt am Main, Hessen, Germany
- Rüdiger Riesch
  - School of Biological Sciences, Centre for Ecology, Evolution and Behaviour, Royal Holloway University of London, Egham Hill, Egham, Surrey TW20 0EX, UK
- Martin Plath
  - College of Animal Science and Technology, Northwest A&F University, Xinong Road 22, 712100 Yangling, China
36
Asgharian H, Chang PL, Lysenkov S, Scobeyeva VA, Reisen WK, Nuzhdin SV. Evolutionary genomics of Culex pipiens: global and local adaptations associated with climate, life-history traits and anthropogenic factors. Proc Biol Sci 2016; 282:rspb.2015.0728. [PMID: 26085592 PMCID: PMC4590483 DOI: 10.1098/rspb.2015.0728] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 01/04/2023]
Abstract
We present the first genome-wide study of recent evolution in Culex pipiens species complex focusing on the genomic extent, functional targets and likely causes of global and local adaptations. We resequenced pooled samples of six populations of C. pipiens and two populations of the outgroup Culex torrentium. We used principal component analysis to systematically study differential natural selection across populations and developed a phylogenetic scanning method to analyse admixture without haplotype data. We found evidence for the prominent role of geographical distribution in shaping population structure and specifying patterns of genomic selection. Multiple adaptive events, involving genes implicated with autogeny, diapause and insecticide resistance were limited to specific populations. We estimate that about 5–20% of the genes (including several histone genes) and almost half of the annotated pathways were undergoing selective sweeps in each population. The high occurrence of sweeps in non-genic regions and in chromatin remodelling genes indicated the adaptive importance of gene expression changes. We hypothesize that global adaptive processes in the C. pipiens complex are potentially associated with South to North range expansion, requiring adjustments in chromatin conformation. Strong local signature of adaptation and emergence of hybrid bridge vectors necessitate genomic assessment of populations before specifying control agents.
Affiliation(s)
- Hosseinali Asgharian
  - Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Peter L Chang
  - Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Sergey Lysenkov
  - Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Evolution, Moscow State University, Moscow 119991, Russia
- William K Reisen
  - Center for Vectorborne Diseases, Department of Pathology, Microbiology and Immunology, School of Veterinary Medicine, University of California, Davis, CA 95616, USA
- Sergey V Nuzhdin
  - Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Evolution, Moscow State University, Moscow 119991, Russia
  - St. Petersburg State Polytechnical University, Saint Petersburg, Russia
37
Bélanger S, Esteves P, Clermont I, Jean M, Belzile F. Genotyping-by-Sequencing on Pooled Samples and its Use in Measuring Segregation Bias during the Course of Androgenesis in Barley. Plant Genome 2016; 9. [PMID: 27898767 DOI: 10.3835/plantgenome2014.10.0073] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 06/06/2023]
Abstract
Estimation of allelic frequencies is often required in breeding, but genotyping many individuals at many loci can be expensive. We have developed a genotyping-by-sequencing (GBS) approach for estimating allelic frequencies on pooled samples (Pool-GBS) and used it to examine segregation distortion in doubled haploid (DH) populations of barley (Hordeum vulgare L.). In the first phase, we genotyped each line individually and exploited these data to explore a strategy to call single nucleotide polymorphisms (SNPs) on pooled reads. We measured both the number of SNPs called and the variance of the estimated allelic frequencies at various depths of coverage on a subset of reads containing 5 to 25 million reads. We show that allelic frequencies could be cost-effectively and accurately estimated at a depth of 50 reads per SNP using 15 million reads. This Pool-GBS approach yielded 1984 SNPs whose allelic frequency estimates were highly reproducible (CV = 10.4%) and correlated (r = 0.9167) with the "true" frequency derived from analysis of individual lines. In a second phase, we used Pool-GBS to investigate segregation bias throughout androgenesis from microspores to a population of regenerated plants. No strong bias was detected among the microspores resulting from the meiotic divisions, whereas significant biases could be shown to arise during embryo formation and plant regeneration. In summary, this methodology provides an approach to estimate allelic frequencies more efficiently and on materials that are unsuitable for individual analysis. In addition, it allowed us to shed light on the process of androgenesis in barley.
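The link between read depth and the precision of a pooled allele frequency estimate can be sketched in a few lines of Python. This is an illustration under a simple binomial read-sampling assumption, not the Pool-GBS pipeline described in the abstract. The function name `pooled_allele_frequency` is hypothetical, and the sketch ignores the sampling of individuals into the pool as well as sequencing error.

```python
import math


def pooled_allele_frequency(ref_reads, alt_reads):
    """Estimate the alternate-allele frequency at one SNP from pooled read counts.

    Returns the frequency and its binomial standard error, which reflects
    read sampling only (an assumed simplification).
    """
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("no reads cover this SNP")
    freq = alt_reads / depth
    std_err = math.sqrt(freq * (1 - freq) / depth)
    return freq, std_err


# A SNP covered by 50 reads, split evenly between the two alleles.
freq, std_err = pooled_allele_frequency(ref_reads=25, alt_reads=25)
```

At a depth of 50 reads and an intermediate frequency, the standard error under this assumption is about 0.07, and it shrinks with the square root of the depth.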
38
Schrider DR, Hahn MW, Begun DJ. Parallel Evolution of Copy-Number Variation across Continents in Drosophila melanogaster. Mol Biol Evol 2016; 33:1308-16. [PMID: 26809315 DOI: 10.1093/molbev/msw014] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 12/17/2022]
Abstract
Genetic differentiation across populations that is maintained in the presence of gene flow is a hallmark of spatially varying selection. In Drosophila melanogaster, the latitudinal clines across the eastern coasts of Australia and North America appear to be examples of this type of selection, with recent studies showing that a substantial portion of the D. melanogaster genome exhibits allele frequency differentiation with respect to latitude on both continents. As of yet there has been no genome-wide examination of differentiated copy-number variants (CNVs) in these geographic regions, despite their potential importance for phenotypic variation in Drosophila and other taxa. Here, we present an analysis of geographic variation in CNVs in D. melanogaster. We also present the first genomic analysis of geographic variation for copy-number variation in the sister species, D. simulans, in order to investigate patterns of parallel evolution in these close relatives. In D. melanogaster we find hundreds of CNVs, many of which show parallel patterns of geographic variation on both continents, lending support to the idea that they are influenced by spatially varying selection. These findings support the idea that polymorphic CNVs contribute to local adaptation in D. melanogaster. In contrast, we find very few CNVs in D. simulans that are geographically differentiated in parallel on both continents, consistent with earlier work suggesting that clinal patterns are weaker in this species.
Affiliation(s)
- Matthew W Hahn
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington
- David J Begun
  - Department of Evolution and Ecology, University of California, Davis
39
Kapun M, Fabian DK, Goudet J, Flatt T. Genomic Evidence for Adaptive Inversion Clines in Drosophila melanogaster. Mol Biol Evol 2016; 33:1317-36. [PMID: 26796550 DOI: 10.1093/molbev/msw016] [Citation(s) in RCA: 106] [Impact Index Per Article: 11.8] [Indexed: 02/07/2023]
Abstract
Clines in chromosomal inversion polymorphisms, presumably driven by climatic gradients, are common, but there is surprisingly little evidence for selection acting on them. Here we address this long-standing issue in Drosophila melanogaster by using diagnostic single nucleotide polymorphism (SNP) markers to estimate inversion frequencies from 28 whole-genome Pool-seq samples collected from 10 populations along the North American east coast. Inversions In(3L)P, In(3R)Mo, and In(3R)Payne showed clear latitudinal clines, and for In(2L)t, In(2R)NS, and In(3R)Payne the steepness of the clinal slopes changed between summer and fall. Consistent with an effect of seasonality on inversion frequencies, we detected small but stable seasonal fluctuations of In(2R)NS and In(3R)Payne in a temperate Pennsylvanian population over 4 years. In support of spatially varying selection, we observed that the cline in In(3R)Payne has remained stable for >40 years and that the frequencies of In(2L)t and In(3R)Payne are strongly correlated with climatic factors that vary latitudinally, independent of population structure. To test whether these patterns are adaptive, we compared the amount of genetic differentiation of inversions versus neutral SNPs and found that the clines in In(2L)t and In(3R)Payne are maintained nonneutrally and independent of admixture. We also identified numerous clinal inversion-associated SNPs, many of which exhibit parallel differentiation along the Australian cline and reside in genes known to affect fitness-related traits. Together, our results provide strong evidence that inversion clines are maintained by spatially (and perhaps also temporally) varying selection. We interpret our data in light of current hypotheses about how inversions are established and maintained.
Affiliation(s)
- Martin Kapun
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Daniel K Fabian
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Jérôme Goudet
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Thomas Flatt
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
40
Machado HE, Bergland AO, O'Brien KR, Behrman EL, Schmidt PS, Petrov DA. Comparative population genomics of latitudinal variation in Drosophila simulans and Drosophila melanogaster. Mol Ecol 2016; 25:723-40. [PMID: 26523848 DOI: 10.1111/mec.13446] [Citation(s) in RCA: 111] [Impact Index Per Article: 12.3] [Received: 04/28/2015] [Revised: 10/26/2015] [Accepted: 10/28/2015] [Indexed: 12/15/2022]
Abstract
Examples of clinal variation in phenotypes and genotypes across latitudinal transects have served as important models for understanding how spatially varying selection and demographic forces shape variation within species. Here, we examine the selective and demographic contributions to latitudinal variation through the largest comparative genomic study to date of Drosophila simulans and Drosophila melanogaster, with genomic sequence data from 382 individual fruit flies, collected across a spatial transect of 19 degrees latitude and at multiple time points over 2 years. Consistent with phenotypic studies, we find less clinal variation in D. simulans than D. melanogaster, particularly for the autosomes. Moreover, we find that clinally varying loci in D. simulans are less stable over multiple years than comparable clines in D. melanogaster. D. simulans shows a significantly weaker pattern of isolation by distance than D. melanogaster and we find evidence for a stronger contribution of migration to D. simulans population genetic structure. While population bottlenecks and migration can plausibly explain the differences in stability of clinal variation between the two species, we also observe a significant enrichment of shared clinal genes, suggesting that the selective forces associated with climate are acting on the same genes and phenotypes in D. simulans and D. melanogaster.
Affiliation(s)
- Heather E Machado
  - Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305-5020, USA
- Alan O Bergland
  - Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305-5020, USA
- Katherine R O'Brien
  - School of Biological Sciences, University of Nebraska-Lincoln, 348 Manter Hall, Lincoln, NE, 68588, USA
  - Department of Biology, University of Pennsylvania, 102 Leidy Laboratories, Philadelphia, PA, 19104-6313, USA
- Emily L Behrman
  - Department of Biology, University of Pennsylvania, 102 Leidy Laboratories, Philadelphia, PA, 19104-6313, USA
- Paul S Schmidt
  - Department of Biology, University of Pennsylvania, 102 Leidy Laboratories, Philadelphia, PA, 19104-6313, USA
- Dmitri A Petrov
  - Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305-5020, USA
41
Abstract
High-throughput techniques based on restriction site-associated DNA sequencing (RADseq) are enabling the low-cost discovery and genotyping of thousands of genetic markers for any species, including non-model organisms, which is revolutionizing ecological, evolutionary and conservation genetics. Technical differences among these methods lead to important considerations for all steps of genomics studies, from the specific scientific questions that can be addressed, and the costs of library preparation and sequencing, to the types of bias and error inherent in the resulting data. In this Review, we provide a comprehensive discussion of RADseq methods to aid researchers in choosing among the many different approaches and avoiding erroneous scientific conclusions from RADseq data, a problem that has plagued other genetic marker types in the past.
42
Telonis-Scott M, Sgrò CM, Hoffmann AA, Griffin PC. Cross-Study Comparison Reveals Common Genomic, Network, and Functional Signatures of Desiccation Resistance in Drosophila melanogaster. Mol Biol Evol 2016; 33:1053-67. [PMID: 26733490 PMCID: PMC4776712 DOI: 10.1093/molbev/msv349] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 12/30/2022]
Abstract
Repeated attempts to map the genomic basis of complex traits often yield different outcomes because of the influence of genetic background, gene-by-environment interactions, and/or statistical limitations. However, where repeatability is low at the level of individual genes, overlap often occurs in gene ontology categories, genetic pathways, and interaction networks. Here we report on the genomic overlap for natural desiccation resistance from a Pool-genome-wide association study experiment and a selection experiment in flies collected from the same region in southeastern Australia in different years. We identified over 600 single nucleotide polymorphisms associated with desiccation resistance in flies derived from almost 1,000 wild-caught genotypes, a similar number of loci to that observed in our previous genomic study of selected lines, demonstrating the genetic complexity of this ecologically important trait. By harnessing the power of cross-study comparison, we narrowed the candidates from almost 400 genes in each study to a core set of 45 genes, enriched for stimulus, stress, and defense responses. In addition to gene-level overlap, there was higher order congruence at the network and functional levels, suggesting genetic redundancy in key stress sensing, stress response, immunity, signaling, and gene expression pathways. We also identified variants linked to different molecular aspects of desiccation physiology previously verified from functional experiments. Our approach provides insight into the genomic basis of a complex and ecologically important trait and predicts candidate genetic pathways to explore in multiple genetic backgrounds and related species within a functional framework.
Collapse
Affiliation(s)
- Marina Telonis-Scott
- School of Biological Sciences, Monash University, Clayton, Melbourne, VIC, Australia
| | - Carla M Sgrò
- School of Biological Sciences, Monash University, Clayton, Melbourne, VIC, Australia
| | - Ary A Hoffmann
- School of BioSciences, Bio21 Institute, University of Melbourne, Parkville, Melbourne, VIC, Australia
| | - Philippa C Griffin
- School of BioSciences, Bio21 Institute, University of Melbourne, Parkville, Melbourne, VIC, Australia
| |
|
43
|
Donnelly MJ, Isaacs AT, Weetman D. Identification, Validation, and Application of Molecular Diagnostics for Insecticide Resistance in Malaria Vectors. Trends Parasitol 2015; 32:197-206. [PMID: 26750864 DOI: 10.1016/j.pt.2015.12.001] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Received: 10/07/2015] [Revised: 11/27/2015] [Accepted: 12/02/2015] [Indexed: 12/20/2022]
Abstract
Insecticide resistance is a major obstacle to control of Anopheles malaria mosquitoes in sub-Saharan Africa and requires an improved understanding of the underlying mechanisms. Efforts to discover resistance genes and DNA markers have been dominated by candidate gene and quantitative trait locus studies of laboratory strains, but with greater availability of genome sequences a shift toward field-based agnostic discovery is anticipated. Mechanisms evolve continually to produce elevated resistance yielding multiplicative diagnostic markers, co-screening of which can give high predictive value. With a shift toward prospective analyses, identification and screening of resistance marker panels will boost monitoring and programmatic decision making.
Affiliation(s)
- Martin J Donnelly
- Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK; Malaria Programme, Wellcome Trust Sanger Institute, Cambridge, UK.
| | - Alison T Isaacs
- Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK
| | - David Weetman
- Department of Vector Biology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK
| |
|
44
|
Kardos M, Luikart G, Bunch R, Dewey S, Edwards W, McWilliam S, Stephenson J, Allendorf FW, Hogg JT, Kijas J. Whole‐genome resequencing uncovers molecular signatures of natural and sexual selection in wild bighorn sheep. Mol Ecol 2015; 24:5616-32. [DOI: 10.1111/mec.13415] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Received: 11/23/2014] [Revised: 09/23/2015] [Accepted: 09/28/2015] [Indexed: 12/12/2022]
Affiliation(s)
- Marty Kardos
- Division of Biological Sciences University of Montana Missoula MT 59812 USA
- Evolutionary Biology Centre Uppsala University SE‐75236 Uppsala Sweden
| | - Gordon Luikart
- Division of Biological Sciences University of Montana Missoula MT 59812 USA
- Division of Biological Sciences Flathead Lake Biological Station Fish and Wildlife Genomics Group University of Montana Polson MT 59860 USA
| | - Rowan Bunch
- CSIRO Agriculture 306 Carmody Road St Lucia Brisbane Qld 4067 Australia
| | - Sarah Dewey
- Grand Teton National Park Moose WY 83012 USA
| | - William Edwards
- Wyoming Game and Fish Department Wildlife Disease Laboratory Laramie WY 82070 USA
| | - Sean McWilliam
- CSIRO Agriculture 306 Carmody Road St Lucia Brisbane Qld 4067 Australia
| | | | - Fred W. Allendorf
- Division of Biological Sciences University of Montana Missoula MT 59812 USA
| | - John T. Hogg
- Montana Conservation Science Institute Missoula MT 59803 USA
| | - James Kijas
- CSIRO Agriculture 306 Carmody Road St Lucia Brisbane Qld 4067 Australia
| |
|
45
|
Fracassetti M, Griffin PC, Willi Y. Validation of Pooled Whole-Genome Re-Sequencing in Arabidopsis lyrata. PLoS One 2015; 10:e0140462. [PMID: 26461136 PMCID: PMC4604096 DOI: 10.1371/journal.pone.0140462] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 01/30/2015] [Accepted: 09/25/2015] [Indexed: 12/21/2022] Open
Abstract
Sequencing pooled DNA of multiple individuals from a population instead of sequencing individuals separately has become popular due to its cost-effectiveness and simple wet-lab protocol, although some criticism of this approach remains. Here we validated a protocol for pooled whole-genome re-sequencing (Pool-seq) of Arabidopsis lyrata libraries prepared with low amounts of DNA (1.6 ng per individual). The validation was based on comparing single nucleotide polymorphism (SNP) frequencies obtained by pooling with those obtained by individual-based Genotyping By Sequencing (GBS). Furthermore, we investigated the effect of sample number, sequencing depth per individual and variant caller on population SNP frequency estimates. For Pool-seq data, we compared frequency estimates from two SNP callers, VarScan and Snape; the former employs a frequentist SNP calling approach while the latter uses a Bayesian approach. Results revealed concordance correlation coefficients well above 0.8, confirming that Pool-seq is a valid method for acquiring population-level SNP frequency data. Higher accuracy was achieved by pooling more samples (25 compared to 14) and working with higher sequencing depth (4.1× per individual compared to 1.4× per individual), which increased the concordance correlation coefficient to 0.955. The Bayesian-based SNP caller produced somewhat higher concordance correlation coefficients, particularly at low sequencing depth. We recommend pooling at least 25 individuals combined with sequencing at a depth of 100× to produce satisfactory frequency estimates for common SNPs (minor allele frequency above 0.05).
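The agreement statistic reported in this abstract can be illustrated with a minimal sketch of Lin's concordance correlation coefficient, computed between pooled and individual-based allele frequency estimates. The function name and the example frequencies are hypothetical and are not taken from the study; the sketch assumes the population (1/n) form of the variances and covariance.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two equal-length sequences.

    ccc = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2),
    using population (1/n) variances and covariance.
    """
    x, y = list(x), list(y)
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    sxy = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    denom = vx + vy + (mx - my) ** 2
    if denom == 0.0:
        raise ValueError("both sequences are constant and identical")
    return 2.0 * sxy / denom


# Hypothetical allele frequencies for the same SNPs from two methods.
pool_freqs = [0.10, 0.25, 0.40, 0.55, 0.80]
individual_freqs = [0.12, 0.22, 0.43, 0.50, 0.78]
ccc = concordance_correlation(pool_freqs, individual_freqs)
```

Unlike a Pearson correlation, this coefficient is lowered by a systematic offset between the two methods, which is why it suits a validation of frequency estimates.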
Affiliation(s)
- Marco Fracassetti
- Institute of Biology, Evolutionary Botany, University of Neuchâtel, Neuchâtel, Switzerland
| | - Philippa C. Griffin
- Institute of Biology, Evolutionary Botany, University of Neuchâtel, Neuchâtel, Switzerland
- School of BioSciences, University of Melbourne, Parkville, Victoria, Australia
| | - Yvonne Willi
- Institute of Biology, Evolutionary Botany, University of Neuchâtel, Neuchâtel, Switzerland
| |
|
46
|
Favé MJ, Johnson RA, Cover S, Handschuh S, Metscher BD, Müller GB, Gopalan S, Abouheif E. Past climate change on Sky Islands drives novelty in a core developmental gene network and its phenotype. BMC Evol Biol 2015; 15:183. [PMID: 26338531 PMCID: PMC4560157 DOI: 10.1186/s12862-015-0448-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 07/13/2015] [Accepted: 08/06/2015] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND A fundamental and enduring problem in evolutionary biology is to understand how populations differentiate in the wild, yet little is known about what role organismal development plays in this process. Organismal development integrates environmental inputs with the action of gene regulatory networks to generate the phenotype. Core developmental gene networks have been highly conserved for millions of years across all animals, and therefore, organismal development may bias variation available for selection to work on. Biased variation may facilitate repeatable phenotypic responses when exposed to similar environmental inputs and ecological changes. To gain a more complete understanding of population differentiation in the wild, we integrated evolutionary developmental biology with population genetics, morphology, paleoecology and ecology. This integration was made possible by studying how populations of the ant species Monomorium emersoni respond to climatic and ecological changes across five 'Sky Islands' in Arizona, which are mountain ranges separated by vast 'seas' of desert. Sky Islands represent a replicated natural experiment allowing us to determine how repeatable is the response of M. emersoni populations to climate and ecological changes at the phenotypic, developmental, and gene network levels. RESULTS We show that a core developmental gene network and its phenotype has kept pace with ecological and climate change on each Sky Island over the last ~90,000 years before present (BP). This response has produced two types of evolutionary change within an ant species: one type is unpredictable and contingent on the pattern of isolation of Sky Island populations by climate warming, resulting in slight changes in gene expression, organ growth, and morphology.
The other type is predictable and deterministic, resulting in the repeated evolution of a novel wingless queen phenotype and its underlying gene network in response to habitat changes induced by climate warming. CONCLUSION Our findings reveal dynamics of developmental gene network evolution in wild populations. This holds important implications: (1) for understanding how phenotypic novelty is generated in the wild; (2) for providing a possible bridge between micro- and macroevolution; and (3) for understanding how development mediates the response of organisms to past, and potentially, future climate change.
Affiliation(s)
- Marie-Julie Favé
- Department of Biology, McGill University, 1205 Dr. Penfield avenue, Montréal, Québec, Canada.
| | - Robert A Johnson
- School of Life Sciences, Arizona State University, Tempe, AZ 85287-4501, USA.
| | - Stefan Cover
- Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA.
| | - Stephan Handschuh
- Department of Theoretical Biology, University of Vienna, Althanstrasse 14, Vienna, 1090, Austria.
| | - Brian D Metscher
- Department of Theoretical Biology, University of Vienna, Althanstrasse 14, Vienna, 1090, Austria.
| | - Gerd B Müller
- Department of Theoretical Biology, University of Vienna, Althanstrasse 14, Vienna, 1090, Austria.
| | - Shyamalika Gopalan
- Department of Biology, McGill University, 1205 Dr. Penfield avenue, Montréal, Québec, Canada.
| | - Ehab Abouheif
- Department of Biology, McGill University, 1205 Dr. Penfield avenue, Montréal, Québec, Canada.
| |
|
47
|
Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing. PLoS One 2015; 10:e0131572. [PMID: 26181055 PMCID: PMC4504669 DOI: 10.1371/journal.pone.0131572] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 02/14/2015] [Accepted: 06/03/2015] [Indexed: 12/18/2022] Open
Abstract
Laying performance is an important economic trait of goose production. As laying performance is of low heritability, it is of significance to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens.
Five novel genes were cloned for geese which harbored the candidate laying-related SNP, including membrane associated guanylate kinase (MAGI-1), KIAA1462, Rho GTPase activating protein 21 (ARHGAP21), acyl-CoA synthetase family member 2 (ACSF2), astrotactin 2 (ASTN2). Collectively, our data suggests that 8 SNP and 5 genes might be promising candidate markers or targets for marker-assisted selection of egg numbers in geese.
|
48
|
Kofler R, Nolte V, Schlötterer C. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour 2015; 16:118-22. [PMID: 26014582 PMCID: PMC4744716 DOI: 10.1111/1755-0998.12432] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 04/07/2015] [Revised: 05/17/2015] [Accepted: 05/22/2015] [Indexed: 11/29/2022]
Abstract
Sequencing pools of individuals (Pool‐Seq) is a cost‐effective method to determine genome‐wide allele frequency estimates. Given the importance of meta‐analyses combining data sets, we determined the influence of different genomic library preparation protocols on the consistency of allele frequency estimates. We found that typically no more than 1% of the variation in allele frequency estimates could be attributed to differences in library preparation. Also read length had only a minor effect on the consistency of allele frequency estimates. By far, the most pronounced influence could be attributed to sequence coverage. Increasing the coverage from 30‐ to 50‐fold improved the consistency of allele frequency estimates by at least 27%. We conclude that Pool‐Seq data can be easily combined across different library preparation methods, but sufficient sequence coverage is key to reliable results.
Affiliation(s)
- Robert Kofler
- Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, 1210 Wien, Austria
| | - Viola Nolte
- Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, 1210 Wien, Austria
| | - Christian Schlötterer
- Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, 1210 Wien, Austria
| |
|
49
|
Beissinger TM, Rosa GJM, Kaeppler SM, Gianola D, de Leon N. Defining window-boundaries for genomic analyses using smoothing spline techniques. Genet Sel Evol 2015; 47:30. [PMID: 25928167 PMCID: PMC4404117 DOI: 10.1186/s12711-015-0105-9] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 10/03/2014] [Accepted: 02/04/2015] [Indexed: 01/29/2023] Open
Abstract
Background High-density genomic data is often analyzed by combining information over windows of adjacent markers. Interpretation of data grouped in windows versus at individual locations may increase statistical power, simplify computation, reduce sampling noise, and reduce the total number of tests performed. However, use of adjacent marker information can result in over- or under-smoothing, undesirable window boundary specifications, or highly correlated test statistics. We introduce a method for defining windows based on statistically guided breakpoints in the data, as a foundation for the analysis of multiple adjacent data points. This method involves first fitting a cubic smoothing spline to the data and then identifying the inflection points of the fitted spline, which serve as the boundaries of adjacent windows. This technique does not require prior knowledge of linkage disequilibrium, and therefore can be applied to data collected from individual or pooled sequencing experiments. Moreover, in contrast to existing methods, an arbitrary choice of window size is not necessary, since these are determined empirically and allowed to vary along the genome. Results Simulations applying this method were performed to identify selection signatures from pooled sequencing FST data, for which allele frequencies were estimated from a pool of individuals. The relative ratio of true to false positives was twice that generated by existing techniques. A comparison of the approach to a previous study that involved pooled sequencing FST data from maize suggested that outlying windows were more clearly separated from their neighbors than when using a standard sliding window approach. Conclusions We have developed a novel technique to identify window boundaries for subsequent analysis protocols. When applied to selection studies based on FST data, this method provides a high discovery rate and minimizes false positives. 
The method is implemented in the R package GenWin, which is publicly available from CRAN.
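The boundary rule described in this abstract, placing window edges where the curvature of a smoothed signal changes sign, can be sketched as follows. This is not the GenWin implementation: to stay dependency-free it swaps the cubic smoothing spline for a simple centered moving average and uses discrete second differences in place of the spline's second derivative. All function names, the `half_width` and `tol` parameters, and the example values are hypothetical.

```python
def moving_average(values, half_width):
    """Centered moving average; the averaging window shrinks at the edges."""
    n = len(values)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def inflection_boundaries(values, half_width=2, tol=0.0):
    """Indices where the second difference of the smoothed signal changes sign."""
    s = moving_average(values, half_width)
    # second[j] approximates the curvature at position j + 1.
    second = [s[i - 1] - 2.0 * s[i] + s[i + 1] for i in range(1, len(s) - 1)]
    boundaries = []
    previous_sign = 0
    for j, d in enumerate(second):
        sign = 0 if abs(d) <= tol else (1 if d > 0 else -1)
        if sign == 0:
            continue  # flat curvature carries no sign information
        if previous_sign != 0 and sign != previous_sign:
            boundaries.append(j + 1)
        previous_sign = sign
    return boundaries


def windows(values, half_width=2, tol=0.0):
    """Half-open (start, end) index ranges delimited by inflection points."""
    cuts = [0] + inflection_boundaries(values, half_width, tol) + [len(values)]
    return list(zip(cuts, cuts[1:]))


# Hypothetical per-marker statistic: convex at first, then concave.
stat = [0, 1, 4, 9, 14, 17, 18, 17]
spans = windows(stat, half_width=0)
```

With real data a small positive `tol` helps suppress sign flips caused by floating-point noise, and the shrinking edge windows of the moving average can introduce boundaries near the ends that a spline fit would not produce.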
Affiliation(s)
| | - Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin, Madison, 53706, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, 53792, USA.
| | - Shawn M Kaeppler
- Department of Agronomy, University of Wisconsin, Madison, 53706, USA; Department of Energy Great Lakes Bioenergy Research Center, University of Wisconsin, Madison, 53706, USA.
| | - Daniel Gianola
- Department of Animal Sciences, University of Wisconsin, Madison, 53706, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, 53792, USA; Department of Dairy Science, University of Wisconsin, Madison, 53706, USA.
| | - Natalia de Leon
- Department of Agronomy, University of Wisconsin, Madison, 53706, USA; Department of Energy Great Lakes Bioenergy Research Center, University of Wisconsin, Madison, 53706, USA.
| |
|
50
|
Cheeseman IH, McDew-White M, Phyo AP, Sriprawat K, Nosten F, Anderson TJC. Pooled sequencing and rare variant association tests for identifying the determinants of emerging drug resistance in malaria parasites. Mol Biol Evol 2014; 32:1080-90. [PMID: 25534029 PMCID: PMC4379400 DOI: 10.1093/molbev/msu397] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 12/17/2022] Open
Abstract
We explored the potential of pooled sequencing to swiftly and economically identify selective sweeps due to emerging artemisinin (ART) resistance in a South-East Asian malaria parasite population. ART resistance is defined by slow parasite clearance from the blood of ART-treated patients and mutations in the kelch gene (chr. 13) have been strongly implicated to play a role. We constructed triplicate pools of 70 slow-clearing (resistant) and 70 fast-clearing (sensitive) infections collected from the Thai–Myanmar border and sequenced these to high (∼150-fold) read depth. Allele frequency estimates from pools showed almost perfect correlation (Lin’s concordance = 0.98) with allele frequencies at 93 single nucleotide polymorphisms measured directly from individual infections, giving us confidence in the accuracy of this approach. By mapping genome-wide divergence (FST) between pools of drug-resistant and drug-sensitive parasites, we identified two large (>150 kb) regions (on chrs. 13 and 14) and 17 smaller candidate genome regions. To identify individual genes within these genome regions, we resequenced an additional 38 parasite genomes (16 slow and 22 fast-clearing) and performed rare variant association tests. These confirmed kelch as a major molecular marker for ART resistance (P = 6.03 × 10⁻⁶). This two-tier approach is powerful because pooled sequencing rapidly narrows down genome regions of interest, while targeted rare variant association testing within these regions can pinpoint the genetic basis of resistance. We show that our approach is robust to recurrent mutation and the generation of soft selective sweeps, which are predicted to be common in pathogen populations with large effective population sizes, and may confound more traditional gene mapping approaches.
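The divergence scan between resistant and sensitive pools can be illustrated with a minimal sketch of a heterozygosity-based FST computed from allele frequencies in two pools. The abstract does not state which estimator was used, so the Nei-style formula below, with equal weighting of the two pools and no correction for pool size or read depth, is an assumption. Function names and the example frequencies are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity of a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def snp_fst(p1, p2):
    """Per-SNP FST = (H_T - H_S) / H_T for two equally weighted pools."""
    h_t = heterozygosity((p1 + p2) / 2.0)
    if h_t == 0.0:
        return 0.0  # site is monomorphic across both pools
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_t - h_s) / h_t


def window_fst(freqs_a, freqs_b):
    """Window-level FST as a ratio of sums over the SNPs in the window."""
    numerator = denominator = 0.0
    for p1, p2 in zip(freqs_a, freqs_b):
        h_t = heterozygosity((p1 + p2) / 2.0)
        h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
        numerator += h_t - h_s
        denominator += h_t
    return numerator / denominator if denominator > 0.0 else 0.0


# Hypothetical allele frequencies at four SNPs in one genomic window.
resistant_pool = [0.90, 0.85, 0.50, 0.10]
sensitive_pool = [0.10, 0.20, 0.45, 0.12]
fst = window_fst(resistant_pool, sensitive_pool)
```

Summing numerators and denominators across a window, rather than averaging per-SNP ratios, keeps low-diversity sites from dominating the window value.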
Affiliation(s)
| | | | - Aung Pyae Phyo
- Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Mae Sot, Thailand
| | - Kanlaya Sriprawat
- Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Mae Sot, Thailand
| | - François Nosten
- Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Mae Sot, Thailand; Centre for Tropical Medicine, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
| | | |
|