1
Karageorgi M, Lyulina AS, Bitter MC, Lappo E, Greenblum SI, Mouza ZK, Tran CT, Huynh AV, Oken H, Schmidt P, Petrov DA. Beneficial reversal of dominance maintains a resistance polymorphism under fluctuating insecticide selection. bioRxiv 2025:2024.10.23.619953. [PMID: 39554016 PMCID: PMC11566011 DOI: 10.1101/2024.10.23.619953] [Indexed: 11/19/2024]
Abstract
Large-effect functional genetic variation is commonly found in natural populations, even though natural selection should erode such variants. Theory suggests that under fluctuating selective pressures, beneficial reversal of dominance - where alleles are dominant when beneficial and recessive when deleterious - can protect these loci from selection, allowing them to persist. However, empirical evidence for this mechanism remains elusive because testing requires direct measurements of selection and dominance in natural conditions. Here, we show that insecticide-resistant alleles at the Ace locus in Drosophila melanogaster persist worldwide at intermediate frequencies and exhibit beneficial reversal of dominance. By combining laboratory and large-scale field mesocosm experiments with insecticide manipulation and mathematical modeling, we show that the benefits of the resistant Ace alleles are dominant while their fitness costs are recessive. We further show that fluctuating insecticide selection generates chromosome-scale genomic perturbations at sites linked to the resistant Ace alleles, revealing broader genomic consequences of this mechanism. Overall, our results suggest that beneficial reversal of dominance contributes to the maintenance of functional genetic variation and impacts patterns of genomic diversity via linked fluctuating selection.
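The mechanism described in this abstract can be illustrated with a deterministic single-locus recursion. The sketch below is not the authors' model. It assumes an infinite, randomly mating population, insecticide that is present in every other generation, an equal benefit and cost `s`, and hypothetical dominance parameters `h_on` and `h_off`; all values are illustrative.

```python
def next_freq(p, w_rr, w_rs, w_ss):
    """One generation of viability selection on the resistant-allele frequency p."""
    q = 1.0 - p
    mean_w = p * p * w_rr + 2 * p * q * w_rs + q * q * w_ss
    return p * (p * w_rr + q * w_rs) / mean_w


def simulate(h_on, h_off, s=0.2, p0=0.5, generations=2000):
    """Alternate insecticide-on (even) and insecticide-off (odd) generations.

    h_on  : dominance of the resistance benefit when insecticide is present
    h_off : dominance of the resistance cost when insecticide is absent
    All parameter values are illustrative assumptions, not estimates.
    """
    p = p0
    for gen in range(generations):
        if gen % 2 == 0:
            # Insecticide present: the resistant allele is beneficial.
            p = next_freq(p, 1 + s, 1 + h_on * s, 1.0)
        else:
            # Insecticide absent: the resistant allele is costly.
            p = next_freq(p, 1 - s, 1 - h_off * s, 1.0)
    return p


reversal = simulate(h_on=1.0, h_off=0.0)  # benefit dominant, cost recessive
additive = simulate(h_on=0.5, h_off=0.5)  # no reversal of dominance
print(f"reversal of dominance: p = {reversal:.3f}")
print(f"additive dominance:    p = {additive:.6f}")
```

Under these assumptions, reversal of dominance gives heterozygotes the highest geometric-mean fitness across the two environments, so the resistant allele stays at an intermediate frequency, whereas with additive effects it declines toward loss.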
2
Veregge M, Hirsch CD, Moscou MJ, Burghardt L, Tiffin P, Khokhani D. Virulence is not directly related to strain success in planta in Clavibacter nebraskensis. mSystems 2025; 10:e0135524. [PMID: 39611810 PMCID: PMC11748494 DOI: 10.1128/msystems.01355-24] [Received: 10/11/2024] [Accepted: 11/16/2024] [Indexed: 11/30/2024]
Abstract
Goss's wilt and leaf blight of maize is an economically important disease caused by the Gram-positive bacterium, Clavibacter nebraskensis (Cn). Little is known about the ecology and pathogenesis of this bacterium. Here, we used phenotypic assays and a high-throughput whole-genome sequencing approach to explore among-strain variation in virulence and multistrain reproductive success in planta. Our survey of 41 strains revealed that more recently sampled strains tended to have higher virulence than strains sampled before 2010 and tended to be more genetically divergent from the reference strain, isolated in 1971. More detailed assays with a representative sample of 13 of these strains revealed that host genotype (resistant or susceptible) did not strongly affect strain success and that strain success in planta in multi-strain communities was not closely associated with virulence in single-strain assays. Two weakly virulent strains, CIC354 and CIC370, had the greatest reproductive success, whereas the most highly virulent strains did not significantly change in frequency in any host genotype. A genomic analysis revealed candidate genes, including putative virulence factors (i.e., a secreted cellulase), responsible for among-strain variation in reproductive success. IMPORTANCE Non-pathogenic strains of many bacterial pathogens are reported to coexist with pathogenic strains in symptomatic plants. To understand the ecology and pathogenesis of the pathogen population, it is essential to study strain dynamics in the context of the host. We created a community of 13 strains exhibiting diverse virulence phenotypes and used this community to infect the host plant. We compared the strain frequency of these strains before and after the host infection. Contrary to our hypothesis of highly virulent strains being selected by the susceptible host, we found that weakly virulent strains were selected by both resistant and susceptible host lines. We identified several genes associated with strain frequency shifts, suggesting their role in strain colonization, virulence, and fitness.
Affiliation(s)
- Molly Veregge
  - Department of Plant Pathology, University of Minnesota, Twin Cities, Minnesota, USA
- Cory D. Hirsch
  - Department of Plant Pathology, University of Minnesota, Twin Cities, Minnesota, USA
- Matthew J. Moscou
  - Department of Plant Pathology, University of Minnesota, Twin Cities, Minnesota, USA
  - USDA-ARS Cereal Disease Laboratory, University of Minnesota, St. Paul, Minnesota, USA
- Liana Burghardt
  - Department of Plant Science, Pennsylvania State University, Center Valley, Pennsylvania, USA
- Peter Tiffin
  - Department of Plant and Microbial Biology, University of Minnesota, Twin Cities, Minnesota, USA
- Devanshi Khokhani
  - Department of Plant Pathology, University of Minnesota, Twin Cities, Minnesota, USA
3
Bitter MC, Greenblum S, Rajpurohit S, Bergland AO, Hemker JA, Betancourt NJ, Tilk S, Berardi S, Oken H, Schmidt P, Petrov DA. Pervasive fitness trade-offs revealed by rapid adaptation in large experimental populations of Drosophila melanogaster. bioRxiv 2024:2024.10.28.620721. [PMID: 39554054 PMCID: PMC11565731 DOI: 10.1101/2024.10.28.620721] [Indexed: 11/19/2024]
Abstract
Life-history trade-offs are an inherent feature of organismal biology that evolutionary theory posits play a key role in patterns of divergence within and between species. Efforts to quantify trade-offs are largely confined to phenotypic measurements and the identification of negative genetic-correlations among fitness-relevant traits. Here, we use time-series genomic data collected during experimental evolution in large, genetically diverse populations of Drosophila melanogaster to directly measure the manifestation of trade-offs in response to temporally fluctuating selection pressures on ecological timescales. Specifically, we quantify the genome-wide signal of antagonistic pleiotropy suggestive of trade-offs between reproduction and stress tolerance. We further identify a putative role of two cosmopolitan inversions in these trade-offs, and show that loci responding to selection during lab-based, reproduction selection exhibit signals of fluctuating selection in an outdoor mesocosm exposed to natural environmental conditions. Our results demonstrate the utility of time-series genomic data in revealing the presence and genomic architecture underlying fitness trade-offs, and add credence to models positing a role of generic life history trade-offs in the maintenance of variation in natural populations.
Affiliation(s)
- M C Bitter
  - Department of Biology, Stanford University, Stanford, CA, USA
- S Greenblum
  - Department of Biology, Stanford University, Stanford, CA, USA
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- S Rajpurohit
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  - Division of Biological and Life Sciences, School of Arts and Sciences, Ahmedabad University, Gujarat, India
- A O Bergland
  - Department of Biology, Stanford University, Stanford, CA, USA
  - Department of Biology, University of Virginia, Charlottesville, VA, USA
- J A Hemker
  - Department of Biology, Stanford University, Stanford, CA, USA
- N J Betancourt
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- S Tilk
  - Department of Biology, Stanford University, Stanford, CA, USA
- S Berardi
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- H Oken
  - Department of Biology, Stanford University, Stanford, CA, USA
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
  - Division of Biological and Life Sciences, School of Arts and Sciences, Ahmedabad University, Gujarat, India
  - Department of Biology, University of Virginia, Charlottesville, VA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- P Schmidt
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- D A Petrov
  - Department of Biology, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
4
Bitter MC, Berardi S, Oken H, Huynh A, Lappo E, Schmidt P, Petrov DA. Continuously fluctuating selection reveals fine granularity of adaptation. Nature 2024; 634:389-396. [PMID: 39143223 DOI: 10.1038/s41586-024-07834-x] [Received: 01/23/2024] [Accepted: 07/16/2024] [Indexed: 08/16/2024]
Abstract
Temporally fluctuating environmental conditions are a ubiquitous feature of natural habitats. Yet, how finely natural populations adaptively track fluctuating selection pressures via shifts in standing genetic variation is unknown [1,2]. Here we generated genome-wide allele frequency data every 1-2 generations from a genetically diverse population of Drosophila melanogaster in extensively replicated field mesocosms from late June to mid-December (a period of approximately 12 total generations). Adaptation throughout the fundamental ecological phases of population expansion, peak density and collapse was underpinned by extremely rapid, parallel changes in genomic variation across replicates. Yet, the dominant direction of selection fluctuated repeatedly, even within each of these ecological phases. Comparing patterns of change in allele frequency to an independent dataset procured from the same experimental system demonstrated that the targets of selection are predictable across years. In concert, our results reveal a fitness relevance of standing variation that is likely to be masked by inference approaches based on static population sampling or insufficiently resolved time-series data. We propose that such fine-scaled, temporally fluctuating selection may be an important force contributing to the maintenance of functional genetic variation in natural populations and an important stochastic force impacting genome-wide patterns of diversity at linked neutral sites, akin to genetic draft.
Affiliation(s)
- M C Bitter
  - Department of Biology, Stanford University, Stanford, CA, USA
- S Berardi
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- H Oken
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- A Huynh
  - Department of Biology, Stanford University, Stanford, CA, USA
- Egor Lappo
  - Department of Biology, Stanford University, Stanford, CA, USA
- P Schmidt
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- D A Petrov
  - Department of Biology, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
5
Czech L, Spence JP, Expósito-Alonso M. grenedalf: population genetic statistics for the next generation of pool sequencing. Bioinformatics 2024; 40:btae508. [PMID: 39185959 PMCID: PMC11357794 DOI: 10.1093/bioinformatics/btae508] [Received: 09/15/2023] [Revised: 08/02/2024] [Accepted: 08/23/2024] [Indexed: 08/27/2024]
Abstract
SUMMARY Pool sequencing is an efficient method for capturing genome-wide allele frequencies from multiple individuals, with broad applications such as studying adaptation in Evolve-and-Resequence experiments, monitoring of genetic diversity in wild populations, and genotype-to-phenotype mapping. Here, we present grenedalf, a command line tool written in C++ that implements common population genetic statistics such as θ, Tajima's D, and FST for Pool sequencing. It is orders of magnitude faster than current tools, and is focused on providing usability and scalability, while also offering a plethora of input file formats and convenience options. AVAILABILITY AND IMPLEMENTATION grenedalf is published under the GPL-3, and freely available at github.com/lczech/grenedalf.
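To make the statistics named in this summary more concrete, the sketch below computes per-site heterozygosity and a Hudson-style FST from allele read counts in two pools. It is a simplified illustration, not grenedalf's implementation or command-line interface: it treats read counts as if they were sampled individuals and omits the pool-size and sequencing-noise corrections that pool-seq estimators apply. The function names and the input layout are hypothetical.

```python
def within_diversity(ref, alt):
    """Unbiased heterozygosity at one biallelic site from allele counts."""
    n = ref + alt
    if n < 2:
        return 0.0
    p = ref / n
    return (n / (n - 1)) * 2 * p * (1 - p)


def between_diversity(ref1, alt1, ref2, alt2):
    """Probability that one read drawn from each pool carries different alleles.

    Assumes both pools have nonzero coverage at the site.
    """
    p1 = ref1 / (ref1 + alt1)
    p2 = ref2 / (ref2 + alt2)
    return p1 * (1 - p2) + p2 * (1 - p1)


def fst(pool1, pool2):
    """FST = 1 - (mean within-pool diversity) / (between-pool diversity).

    Each pool is a list of (ref_count, alt_count) tuples, one per site.
    Numerator and denominator are summed over sites before taking the ratio.
    """
    within = 0.0
    between = 0.0
    for (r1, a1), (r2, a2) in zip(pool1, pool2):
        within += 0.5 * (within_diversity(r1, a1) + within_diversity(r2, a2))
        between += between_diversity(r1, a1, r2, a2)
    return 1.0 - within / between if between > 0 else float("nan")


# Made-up counts: one site fixed for different alleles, one shared polymorphism.
pool_a = [(10, 0), (5, 5)]
pool_b = [(0, 10), (5, 5)]
print(round(fst(pool_a, pool_b), 3))
```

Summing over sites before dividing keeps low-diversity sites from dominating the estimate; a production tool would additionally correct for the number of individuals in each pool.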
Affiliation(s)
- Lucas Czech
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, United States
  - Section for GeoGenetics, Globe Institute, University of Copenhagen, 1350 København, Denmark
- Jeffrey P Spence
  - Department of Genetics, Stanford University, Stanford, CA 94305, United States
- Moisés Expósito-Alonso
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, United States
  - Department of Biology, Stanford University, Stanford, CA 94305, United States
  - Department of Global Ecology, Carnegie Institution for Science, Stanford, CA 94305, United States
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA 94720, United States
  - Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA 94720, United States
6
Wang Y, Dutta R, Futschik A. Estimating Haplotype Structure and Frequencies: A Bayesian Approach to Unknown Design in Pooled Genomic Data. J Comput Biol 2024; 31:708-726. [PMID: 38957993 DOI: 10.1089/cmb.2023.0211] [Indexed: 07/04/2024]
Abstract
The estimation of haplotype structure and frequencies provides crucial information about the composition of genomes. Techniques, such as single-individual haplotyping, aim to reconstruct individual haplotypes from diploid genome sequencing data. However, our focus is distinct. We address the challenge of reconstructing haplotype structure and frequencies from pooled sequencing samples where multiple individuals are sequenced simultaneously. A frequentist method to address this issue has recently been proposed. In contrast to this and other methods that compute point estimates, our proposed Bayesian hierarchical model delivers a posterior that permits us to also quantify uncertainty. Since matching permutations in both haplotype structure and corresponding frequency matrix lead to the same reconstruction of their product, we introduce an order-preserving shrinkage prior that ensures identifiability with respect to permutations. For inference, we introduce a blocked Gibbs sampler that enforces the required constraints. In a simulation study, we assessed the performance of our method. Furthermore, by using our approach on two distinct sets of real data, we demonstrate that our Bayesian approach can reconstruct the dominant haplotypes in a challenging, high-dimensional set-up.
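The identifiability issue mentioned in this abstract can be checked directly. The sketch below assumes a layout that the abstract does not spell out: a binary haplotype-structure matrix `S` (sites by haplotypes) and a frequency matrix `F` (haplotypes by pooled samples), whose product gives the expected allele frequencies. The matrices and helper names are made up for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def permute_columns(m, perm):
    """Reorder the columns of m according to perm."""
    return [[row[k] for k in perm] for row in m]


def permute_rows(m, perm):
    """Reorder the rows of m according to perm."""
    return [m[k] for k in perm]


# Hypothetical example: 4 sites, 3 haplotypes, 2 pooled samples.
S = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
F = [[Fraction(1, 2), Fraction(1, 10)],
     [Fraction(3, 10), Fraction(3, 5)],
     [Fraction(1, 5), Fraction(3, 10)]]  # each column sums to 1

perm = [2, 0, 1]  # relabel the haplotypes
A = matmul(S, F)
A_permuted = matmul(permute_columns(S, perm), permute_rows(F, perm))
print(A == A_permuted)
```

Exact fractions are used so that the comparison does not depend on floating-point summation order. Because relabeled pairs of matrices give the same product, some constraint on the ordering, such as the order-preserving prior the abstract describes, is needed to single out one labeling.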
Affiliation(s)
- Yuexuan Wang
  - Department of Applied Statistics, Johannes Kepler University, Linz, Austria
- Ritabrata Dutta
  - Department of Statistics, University of Warwick, Coventry, United Kingdom
- Andreas Futschik
  - Department of Applied Statistics, Johannes Kepler University, Linz, Austria
7
Zheng X, Zheng Y, Chen T, Hou C, Zhou L, Liu C, Zheng J, Hu R. Effect of Laryngopharyngeal Reflux and Potassium-Competitive Acid Blocker (P-CAB) on the Microbiological Comprise of the Laryngopharynx. Otolaryngol Head Neck Surg 2024; 170:1380-1390. [PMID: 38385787 DOI: 10.1002/ohn.682] [Received: 08/30/2023] [Revised: 12/27/2023] [Accepted: 01/21/2024] [Indexed: 02/23/2024]
Abstract
OBJECTIVE To probe the microbiota composition progressing from healthy individuals to those with laryngopharyngeal reflux disease (LPRD) and subsequently undergoing potassium-competitive acid inhibitor (P-CAB) therapy. STUDY DESIGN Prospective case-control study. SETTING Academic Medical Center. METHODS Forty patients with LPRD and 51 patients without LPRD were recruited. An 8-week P-CAB therapy was initiated (post-T-LPRD), and 39 had return visits. In total, 130 laryngopharyngeal saliva samples were collected and sequenced by targeting the V3-V4 region of the 16S ribosomal RNA (rRNA) gene using an Illumina MiSeq. Amplicon sequence variants (ASVs) and clinical indices were analyzed. RESULTS Alpha and beta diversities were compared among the non-LPRD, LPRD, and post-T-LPRD groups, and the Observed_ASVs were not significantly different. At the same time, the Shannon and Simpson indices, unweighted UniFrac, weighted UniFrac, and binary Jaccard distance were significantly different between non-LPRD and LPRD groups. In addition, significant differences were found in the abundance of Streptococcus, Prevotella, and Prevotellaceae in the LPRD versus non-LPRD groups, and Neisseria, Leptotrichia, and Alloprevotella in the LPRD versus post-T-LPRD groups. The genera model was used to distinguish patients with LPRD from those without, and a better receiver operating characteristic curve was formed after combining the clinical indices of reflux symptom index, reflux finding score, and pepsin, with an area under the curve of 0.960. CONCLUSION Laryngopharyngeal microbial communities changed after laryngopharyngeal reflux and were modified further after P-CAB treatment, which provides a potential diagnostic value for LPRD, especially when combined with clinical indices.
Affiliation(s)
- Xiaowei Zheng
  - Department of Otorhinolaryngology-Head and Neck Surgery, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
- Yujin Zheng
  - Department of Otorhinolaryngology-Head and Neck Surgery, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
- Ting Chen
  - Department of Otorhinolaryngology-Head and Neck Surgery, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
- Chenjie Hou
  - Department of Otorhinolaryngology-Head and Neck Surgery, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
- Liqun Zhou
  - Department of Otorhinolaryngology-Head and Neck Surgery, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
- Chaofeng Liu
  - Department of Otorhinolaryngology-Head and Neck Surgery, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
- Jingyi Zheng
  - Department of Otorhinolaryngology-Head and Neck Surgery, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
- Renyou Hu
  - Chongqing Jinshan Science & Technology (Group) Co. Ltd., Chongqing, China
8
Bitter MC, Berardi S, Oken H, Huynh A, Schmidt P, Petrov DA. Continuously fluctuating selection reveals extreme granularity and parallelism of adaptive tracking. bioRxiv 2024:2023.10.16.562586. [PMID: 37904939 PMCID: PMC10614893 DOI: 10.1101/2023.10.16.562586] [Indexed: 11/01/2023]
Abstract
Temporally fluctuating environmental conditions are a ubiquitous feature of natural habitats. Yet, how finely natural populations adaptively track fluctuating selection pressures via shifts in standing genetic variation is unknown. We generated high-frequency, genome-wide allele frequency data from a genetically diverse population of Drosophila melanogaster in extensively replicated field mesocosms from late June to mid-December, a period of ∼12 generations. Adaptation throughout the fundamental ecological phases of population expansion, peak density, and collapse was underpinned by extremely rapid, parallel changes in genomic variation across replicates. Yet, the dominant direction of selection fluctuated repeatedly, even within each of these ecological phases. Comparing patterns of allele frequency change to an independent dataset procured from the same experimental system demonstrated that the targets of selection are predictable across years. In concert, our results reveal fitness-relevance of standing variation that is likely to be masked by inference approaches based on static population sampling, or insufficiently resolved time-series data. We propose such fine-scaled temporally fluctuating selection may be an important force maintaining functional genetic variation in natural populations and an important stochastic force affecting levels of standing genetic variation genome-wide.
9
Delomas TA, Willis SC. Estimating microhaplotype allele frequencies from low-coverage or pooled sequencing data. BMC Bioinformatics 2023; 24:415. [PMID: 37923981 PMCID: PMC10623847 DOI: 10.1186/s12859-023-05554-z] [Received: 07/25/2022] [Accepted: 10/30/2023] [Indexed: 11/06/2023]
Abstract
BACKGROUND Microhaplotypes have the potential to be more cost-effective than SNPs for applications that require genetic panels of highly variable loci. However, development of microhaplotype panels is hindered by a lack of methods for estimating microhaplotype allele frequency from low-coverage whole genome sequencing or pooled sequencing (pool-seq) data. RESULTS We developed new methods for estimating microhaplotype allele frequency from low-coverage whole genome sequence and pool-seq data. We validated these methods using datasets from three non-model organisms. These methods allowed estimation of allele frequency and expected heterozygosity at depths routinely achieved from pooled sequencing. CONCLUSIONS These new methods will allow microhaplotype panels to be designed using low-coverage WGS and pool-seq data to discover and evaluate candidate loci. The python script implementing the two methods and documentation are available at https://www.github.com/delomast/mhFromLowDepSeq .
Affiliation(s)
- Thomas A Delomas
  - Agricultural Research Service, United States Department of Agriculture, National Cold Water Marine Aquaculture Center, 483 CBLS, 120 Flagg Road, Kingston, RI, 02881, USA
- Stuart C Willis
  - Hagerman Genetics Laboratory, Columbia River Inter-Tribal Fish Commission, Hagerman, ID, USA
10
Tavares H, Readshaw A, Kania U, de Jong M, Pasam RK, McCulloch H, Ward S, Shenhav L, Forsyth E, Leyser O. Artificial selection reveals complex genetic architecture of shoot branching and its response to nitrate supply in Arabidopsis. PLoS Genet 2023; 19:e1010863. [PMID: 37616321 PMCID: PMC10482290 DOI: 10.1371/journal.pgen.1010863] [Received: 06/18/2022] [Revised: 09/06/2023] [Accepted: 07/08/2023] [Indexed: 08/26/2023]
Abstract
Quantitative traits may be controlled by many loci, many alleles at each locus, and subject to genotype-by-environment interactions, making them difficult to map. One example of such a complex trait is shoot branching in the model plant Arabidopsis, and its plasticity in response to nitrate. Here, we use artificial selection under contrasting nitrate supplies to dissect the genetic architecture of this complex trait, where loci identified by association mapping failed to explain heritability estimates. We found a consistent response to selection for high branching, with correlated responses in other traits such as plasticity and flowering time. Genome-wide scans for selection and simulations suggest that at least tens of loci control this trait, with a distinct genetic architecture between low and high nitrate treatments. While signals of selection could be detected in the populations selected for high branching on low nitrate, there was very little overlap in the regions selected in three independent populations. Thus the regulatory network controlling shoot branching can be tuned in different ways to give similar phenotypes.
Affiliation(s)
- Hugo Tavares
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Anne Readshaw
  - Department of Biology, University of York, York, United Kingdom
- Urszula Kania
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Maaike de Jong
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Raj K. Pasam
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Hayley McCulloch
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Sally Ward
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
  - Department of Biology, University of York, York, United Kingdom
- Liron Shenhav
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Elizabeth Forsyth
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
- Ottoline Leyser
  - Sainsbury Laboratory, University of Cambridge, Cambridge, United Kingdom
  - Department of Biology, University of York, York, United Kingdom
11
Linder RA, Zabanavar B, Majumder A, Hoang HCS, Delgado VG, Tran R, La VT, Leemans SW, Long AD. Adaptation in Outbred Sexual Yeast is Repeatable, Polygenic and Favors Rare Haplotypes. Mol Biol Evol 2022; 39:msac248. [PMID: 36366952 PMCID: PMC9728589 DOI: 10.1093/molbev/msac248] [Indexed: 11/13/2022]
Abstract
We carried out a 200-generation Evolve and Resequence (E&R) experiment initiated from an outbred diploid recombined 18-way synthetic base population. Replicate populations were evolved at large effective population sizes (>10^5 individuals), exposed to several different chemical challenges over 12 weeks of evolution, and whole-genome resequenced. Weekly forced outcrossing resulted in an average between adjacent-gene per cell division recombination rate of ∼0.0008. Despite attempts to force weekly sex, roughly half of our populations evolved cheaters and appear to be evolving asexually. Focusing on seven chemical stressors and 55 total evolved populations that remained sexual, we observed large fitness gains and highly repeatable patterns of genome-wide haplotype change within chemical challenges, with limited levels of repeatability across chemical treatments. Adaptation appears highly polygenic with almost the entire genome showing significant and consistent patterns of haplotype change with little evidence for long-range linkage disequilibrium in a subset of populations for which we sequenced haploid clones. That is, almost the entire genome is under selection or drafting with selected sites. At any given locus adaptation was almost always dominated by one of the 18 founders' alleles, with that allele varying spatially and between treatments, suggesting that selection acts primarily on rare variants private to a founder or haplotype blocks harboring multiple mutations.
Affiliation(s)
- Robert A Linder
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
- Behzad Zabanavar
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
- Arundhati Majumder
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
- Hannah Chiao-Shyan Hoang
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
- Vanessa Genesaret Delgado
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
- Ryan Tran
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
- Vy Thoai La
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
- Simon William Leemans
  - Department of Biomedical Engineering, School of Engineering, University of California, Irvine
- Anthony D Long
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine
12
Czech L, Exposito-Alonso M. grenepipe: a flexible, scalable and reproducible pipeline to automate variant calling from sequence reads. Bioinformatics 2022; 38:4809-4811. [PMID: 36053180 PMCID: PMC10424805 DOI: 10.1093/bioinformatics/btac600] [Received: 03/31/2022] [Revised: 07/27/2022] [Accepted: 09/05/2022] [Indexed: 11/14/2022]
Abstract
SUMMARY We developed grenepipe, an all-in-one Snakemake workflow to streamline the data processing from raw high-throughput sequencing data of individuals or populations to genotype variant calls. Our pipeline offers a range of popular software tools within a single configuration file, automatically installs software dependencies, is highly optimized for scalability in cluster environments and runs with a single command. AVAILABILITY AND IMPLEMENTATION grenepipe is published under the GPLv3 and freely available at github.com/moiexpositoalonsolab/grenepipe.
Affiliation(s)
- Lucas Czech
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
- Moises Exposito-Alonso
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
  - Department of Global Ecology, Carnegie Institution for Science, Stanford, CA 94305, USA
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
13
Busch JW, Bodbyl‐Roels S, Tusuubira S, Kelly JK. Pollinator loss causes rapid adaptive evolution of selfing and dramatically reduces genome-wide genetic variability. Evolution 2022; 76:2130-2144. [PMID: 35852008 PMCID: PMC9543508 DOI: 10.1111/evo.14572] [Received: 11/12/2021] [Revised: 03/23/2022] [Accepted: 04/20/2022] [Indexed: 01/22/2023]
Abstract
Although selfing populations harbor little genetic variation limiting evolutionary potential, the causes are unclear. We experimentally evolved large, replicate populations of Mimulus guttatus for nine generations in greenhouses with or without pollinating bees and studied DNA polymorphism in descendants. Populations without bees adapted to produce more selfed seed yet exhibited striking reductions in DNA polymorphism despite large population sizes. Importantly, the genome-wide pattern of variation cannot be explained by a simple reduction in effective population size, but instead reflects the complicated interaction between selection, linkage, and inbreeding. Simulations demonstrate that the spread of favored alleles at few loci depresses neutral variation genome wide in large populations containing fully selfing lineages. It also generates greater heterogeneity among chromosomes than expected with neutral evolution in small populations. Genome-wide deviations from neutrality were documented in populations with bees, suggesting widespread influences of background selection. After applying outlier tests to detect loci under selection, two genome regions were found in populations with bees, yet no adaptive loci were otherwise mapped. Large amounts of stochastic change in selfing populations compromise evolutionary potential and undermine outlier tests for selection. This occurs because genetic draft in highly selfing populations makes even the largest changes in allele frequency unremarkable.
Affiliation(s)
- Jeremiah W. Busch
  - School of Biological Sciences, Washington State University, Pullman, Washington 99164
- Sarah Bodbyl‐Roels
  - Trefny Innovative Instruction Center, Colorado School of Mines, Golden, Colorado 80401
- Sharif Tusuubira
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66045
- John K. Kelly
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66045
14
Burghardt LT, Epstein B, Hoge M, Trujillo DI, Tiffin P. Host-Associated Rhizobial Fitness: Dependence on Nitrogen, Density, Community Complexity, and Legume Genotype. Appl Environ Microbiol 2022; 88:e0052622. [PMID: 35852362 PMCID: PMC9361818 DOI: 10.1128/aem.00526-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Accepted: 06/24/2022] [Indexed: 11/20/2022] Open
Abstract
The environmental context of the nitrogen-fixing mutualism between leguminous plants and rhizobial bacteria varies over space and time. Variation in resource availability, population density, and composition likely affect the ecology and evolution of rhizobia and their symbiotic interactions with hosts. We examined how host genotype, nitrogen addition, rhizobial density, and community complexity affected selection on 68 rhizobial strains in the Sinorhizobium meliloti-Medicago truncatula mutualism. As expected, host genotype had a substantial effect on the size, number, and strain composition of root nodules (the symbiotic organ). The understudied environmental variable of rhizobial density had a stronger effect on nodule strain frequency than the addition of low nitrogen levels. Higher inoculum density resulted in a nodule community that was less diverse and more beneficial but only in the context of the more selective host genotype. Higher density resulted in more diverse and less beneficial nodule communities with the less selective host. Density effects on strain composition deserve additional scrutiny as they can create feedback between ecological and evolutionary processes. Finally, we found that relative strain rankings were stable across increasing community complexity (2, 3, 8, or 68 strains). This unexpected result suggests that higher-order interactions between strains are rare in the context of nodule formation and development. Our work highlights the importance of examining mechanisms of density-dependent strain fitness and developing theoretical predictions that incorporate density dependence. Furthermore, our results have translational relevance for overcoming establishment barriers in bioinoculants and motivating breeding programs that maintain beneficial plant-microbe interactions across diverse agroecological contexts. 
IMPORTANCE Legume crops establish beneficial associations with rhizobial bacteria that perform biological nitrogen fixation, providing nitrogen to plants without the economic and greenhouse gas emission costs of chemical nitrogen inputs. Here, we examine the influence of three environmental factors that vary in agricultural fields on strain relative fitness in nodules. In addition to manipulating nitrogen, we also use two biotic variables that have rarely been examined: the rhizobial community's density and complexity. Taken together, our results suggest that (i) breeding legume varieties that select beneficial strains despite environmental variation is possible, (ii) changes in rhizobial population densities that occur routinely in agricultural fields could drive evolutionary changes in rhizobial populations, and (iii) the lack of higher-order interactions between strains will allow the high-throughput assessments of rhizobia winners and losers during plant interactions.
Affiliation(s)
- Liana T. Burghardt
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
  - Plant Science Department, The Pennsylvania State University, University Park, Pennsylvania, USA
- Brendan Epstein
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
- Michelle Hoge
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
- Diana I. Trujillo
  - Department of Plant Pathology, University of Minnesota, St. Paul, Minnesota, USA
- Peter Tiffin
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
15
Batstone RT, Burghardt LT, Heath KD. Phenotypic and genomic signatures of interspecies cooperation and conflict in naturally occurring isolates of a model plant symbiont. Proc Biol Sci 2022; 289:20220477. [PMID: 35858063 PMCID: PMC9277234 DOI: 10.1098/rspb.2022.0477] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/25/2022] Open
Abstract
Given the need to predict the outcomes of (co)evolution in host-associated microbiomes, whether microbial and host fitnesses tend to trade-off, generating conflict, remains a pressing question. Examining the relationships between host and microbe fitness proxies at both the phenotypic and genomic levels can illuminate the mechanisms underlying interspecies cooperation and conflict. We examined naturally occurring genetic variation in 191 strains of the model microbial symbiont Sinorhizobium meliloti, paired with each of two host Medicago truncatula genotypes in single- or multi-strain experiments to determine how multiple proxies of microbial and host fitness were related to one another and test key predictions about mutualism evolution at the genomic scale, while also addressing the challenge of measuring microbial fitness. We found little evidence for interspecies fitness conflict; loci tended to have concordant effects on both microbe and host fitnesses, even in environments with multiple co-occurring strains. Our results emphasize the importance of quantifying microbial relative fitness for understanding microbiome evolution and thus harnessing microbiomes to improve host fitness. Additionally, we find that mutualistic coevolution between hosts and microbes acts to maintain, rather than erode, genetic diversity, potentially explaining why variation in mutualism traits persists in nature.
Affiliation(s)
- Rebecca T. Batstone
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL 61801, USA
- Liana T. Burghardt
  - Department of Plant Science, The Pennsylvania State University, 103 Tyson Building, University Park, PA, 16802 USA
- Katy D. Heath
  - Department of Plant Biology, University of Illinois at Urbana-Champaign, 286 Morrill Hall, 505 South Goodwin Avenue, Urbana, IL 61801, USA
16
Epstein B, Burghardt LT, Heath KD, Grillo MA, Kostanecki A, Hämälä T, Young ND, Tiffin P. Combining GWAS and population genomic analyses to characterize coevolution in a legume-rhizobia symbiosis. Mol Ecol 2022. [PMID: 35793264 DOI: 10.1111/mec.16602] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/04/2022] [Revised: 06/03/2022] [Accepted: 07/04/2022] [Indexed: 11/28/2022]
Abstract
The mutualism between legumes and rhizobia is clearly the product of past coevolution. However, the nature of ongoing evolution between these partners is less clear. To characterize the nature of recent coevolution between legumes and rhizobia, we used population genomic analysis to characterize selection on functionally annotated symbiosis genes as well as on symbiosis gene candidates identified through a two-species association analysis. For the association analysis, we inoculated each of 202 accessions of the legume host Medicago truncatula with a community of 88 Sinorhizobia (Ensifer) meliloti strains. Multistrain inoculation, which better reflects the ecological reality of rhizobial selection in nature than single-strain inoculation, allows strains to compete for nodulation opportunities and host resources and for hosts to preferentially form nodules and provide resources to some strains. We found extensive host by symbiont, that is, genotype-by-genotype, effects on rhizobial fitness and some annotated rhizobial genes bear signatures of recent positive selection. However, neither genes responsible for this variation nor annotated host symbiosis genes are enriched for signatures of either positive or balancing selection. This result suggests that stabilizing selection dominates selection acting on symbiotic traits and that variation in these traits is under mutation-selection balance. Consistent with the lack of positive selection acting on host genes, we found that among-host variation in growth was similar whether plants were grown with rhizobia or N-fertilizer, suggesting that the symbiosis may not be a major driver of variation in plant growth in multistrain contexts.
Affiliation(s)
- Brendan Epstein
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
- Liana T Burghardt
  - Department of Plant Sciences, The University of Pennsylvania, University Park, Pennsylvania, USA
- Katy D Heath
  - Department of Plant Biology, University of Illinois, Urbana, Illinois, USA
  - Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, USA
- Michael A Grillo
  - Department of Biology, Loyola University Chicago, Chicago, Illinois, USA
- Adam Kostanecki
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
- Tuomas Hämälä
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
  - School of Life Sciences, University of Nottingham, Nottingham, UK
- Nevin D Young
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
  - Department of Plant Pathology, University of Minnesota, St. Paul, Minnesota, USA
- Peter Tiffin
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
17
Commensal Pseudomonas strains facilitate protective response against pathogens in the host plant. Nat Ecol Evol 2022; 6:383-396. [PMID: 35210578 PMCID: PMC8986537 DOI: 10.1038/s41559-022-01673-7] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 04/09/2021] [Accepted: 12/07/2021] [Indexed: 12/31/2022]
Abstract
The community structure in the plant-associated microbiome depends collectively on host–microbe, microbe–microbe and host–microbe–microbe interactions. The ensemble of interactions between the host and microbial consortia may lead to outcomes that are not easily predicted from pairwise interactions. Plant–microbe–microbe interactions are important to plant health but could depend on both host and microbe strain variation. Here we study interactions between groups of naturally co-existing commensal and pathogenic Pseudomonas strains in the Arabidopsis thaliana phyllosphere. We find that commensal Pseudomonas prompt a host response that leads to selective inhibition of a specific pathogenic lineage, resulting in plant protection. The extent of protection depends on plant genotype, supporting that these effects are host-mediated. Strain-specific effects are also demonstrated by one individual Pseudomonas isolate eluding the plant protection provided by commensals. Our work highlights how within-species genetic differences in both hosts and microbes can affect host–microbe–microbe dynamics. The authors conduct competition experiments with multiple strains of Pseudomonas (some pathogenic and some commensal) in the phyllosphere microbiome of Arabidopsis plants, showing that both the host and the commensal strains interact to inhibit the pathogenic strains.
18
Schneider M, Shrestha A, Ballvora A, Léon J. High-throughput estimation of allele frequencies using combined pooled-population sequencing and haplotype-based data processing. Plant Methods 2022; 18:34. [PMID: 35313910 PMCID: PMC8935755 DOI: 10.1186/s13007-022-00852-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/29/2021] [Accepted: 02/07/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND In addition to heterogeneity and artificial selection, natural selection is one of the forces used to combat climate change and improve agrobiodiversity in evolutionary plant breeding. Accurate identification of the specific genomic effects of natural selection will likely accelerate transfer between populations. Thus, insights into changes in allele frequency, adequate population size, gene flow and drift are essential. However, observing such effects often involves a trade-off between costs and resolution when a large sample of genotypes for many loci is analysed. Pool genotyping approaches achieve high resolution and precision in estimating allele frequency when sequence coverage is high. Nevertheless, high-coverage pool sequencing of large genomes is expensive.
RESULTS Three pool samples (n = 300, 300, 288) from a barley backcross population were generated to assess the population's allele frequency. The tested population (BC2F21) has undergone 18 generations of natural adaption to conventional farming practice. The accuracies of estimated pool-based allele frequencies and genome coverage yields were compared using three next-generation sequencing genotyping methods. To achieve accurate allele frequency estimates with low sequence coverage, we employed a haplotyping approach. Low coverage allele frequencies of closely located single polymorphisms were aggregated into a single haplotype allele frequency, yielding 2-to-271-times higher depth and increased precision. When we combined different haplotyping tactics, we found that gene and chip marker-based haplotype analyses performed equivalently or better compared with simple contig haplotype windows. Comparing multiple pool samples and referencing against an individual sequencing approach revealed that whole-genome pool re-sequencing (WGS) achieved the highest correlation with individual genotyping (≥ 0.97). In contrast, transcriptome-based genotyping (MACE) and genotyping by sequencing (GBS) pool replicates were significantly associated with higher error rates and lower correlations, but are still valuable to detect large allele frequency variations.
CONCLUSIONS The proposed strategy identified the allele frequency of populations with high accuracy at low cost. This is particularly relevant to evolutionary plant breeding of crops with very large genomes, such as barley. Whole-genome low coverage re-sequencing at 0.03 × coverage per genotype accurately estimated the allele frequency when a loci-based haplotyping approach was applied. The implementation of annotated haplotypes capitalises on the biological background and statistical robustness.
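The haplotype aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes every SNP in a block has already been polarised to the same parental (donor) allele, which is a simplification chosen here for illustration; the class and function names are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class SnpCounts:
    donor_reads: int  # reads carrying the donor-parent allele (assumed polarisation)
    total_reads: int  # all reads covering this SNP in the pool


def haplotype_frequency(snps):
    """Pool read counts over all SNPs in one haplotype block.

    Returns (frequency, depth); frequency is None when no read covers the block.
    """
    donor = sum(s.donor_reads for s in snps)
    depth = sum(s.total_reads for s in snps)
    if depth == 0:
        return None, 0
    return donor / depth, depth


# Made-up low-coverage counts for four neighbouring SNPs in one block.
block = [
    SnpCounts(donor_reads=1, total_reads=3),
    SnpCounts(donor_reads=0, total_reads=2),
    SnpCounts(donor_reads=2, total_reads=5),
    SnpCounts(donor_reads=1, total_reads=4),
]
freq, depth = haplotype_frequency(block)
print(f"haplotype allele frequency = {freq:.3f} at pooled depth {depth}")
```

Summing counts before dividing weights each SNP by its own coverage, so the block-level estimate draws on more reads than any single-SNP estimate; this is the sense in which aggregation raises effective depth.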
Affiliation(s)
- Michael Schneider
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
  - Institute for Quantitative Genetics and Genomics of Plants, University Duesseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
- Asis Shrestha
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
  - Institute for Quantitative Genetics and Genomics of Plants, University Duesseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
- Agim Ballvora
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
- Jens Léon
  - Institute of Crop Science and Resource Conservation, University of Bonn, Plant Breeding, Katzenburgweg 5, 53115, Bonn, Germany
19
Macdonald SJ, Cloud-Richardson KM, Sims-West DJ, Long AD. Powerful, efficient QTL mapping in Drosophila melanogaster using bulked phenotyping and pooled sequencing. Genetics 2022; 220:iyab238. [PMID: 35100395 PMCID: PMC8893256 DOI: 10.1093/genetics/iyab238] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2021] [Accepted: 12/19/2021] [Indexed: 01/22/2024] Open
Abstract
Despite the value of recombinant inbred lines for the dissection of complex traits, large panels can be difficult to maintain, distribute, and phenotype. An attractive alternative to recombinant inbred lines for many traits leverages selecting phenotypically extreme individuals from a segregating population, and subjecting pools of selected and control individuals to sequencing. Under a bulked or extreme segregant analysis paradigm, genomic regions contributing to trait variation are revealed as frequency differences between pools. Here, we describe such an extreme quantitative trait locus (X-QTL) mapping strategy that builds on an existing multiparental population, the Drosophila Synthetic Population Resource, and involves phenotyping and genotyping a population derived by mixing hundreds of Drosophila Synthetic Population Resource recombinant inbred lines. Simulations demonstrate that challenging, yet experimentally tractable X-QTL designs (≥4 replicates, ≥5,000 individuals/replicate, and selecting the 5-10% most extreme animals) yield at least the same power as traditional recombinant inbred line-based quantitative trait loci mapping and can localize variants with sub-centimorgan resolution. We empirically demonstrate the effectiveness of the approach using a 4-fold replicated X-QTL experiment that identifies 7 quantitative trait loci for caffeine resistance. Two of the mapped X-QTL replicate loci previously identified in recombinant inbred lines, 6/7 are associated with excellent candidate genes, and RNAi knock-downs support the involvement of 4 genes in the genetic control of trait variation. For many traits of interest to drosophilists, a bulked phenotyping/genotyping X-QTL design has considerable advantages.
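As a rough illustration of how a bulked design reveals a locus as a frequency difference between pools, the sketch below compares selected and control pools across paired replicates at one position. The function name, the input layout, and the example numbers are all hypothetical; the published analysis works with founder haplotype frequencies and formal statistical tests, which this sketch does not attempt to reproduce.

```python
from statistics import mean


def pool_frequency_shift(selected, control):
    """Return the mean selected-minus-control frequency difference and
    whether every replicate shifts in the same direction.

    selected, control: per-replicate allele frequencies at one position,
    paired by replicate (hypothetical input layout).
    """
    if len(selected) != len(control) or not selected:
        raise ValueError("need the same non-zero number of replicates in both pools")
    diffs = [s - c for s, c in zip(selected, control)]
    consistent = all(d > 0 for d in diffs) or all(d < 0 for d in diffs)
    return mean(diffs), consistent


# Made-up frequencies for four replicates at a single position.
shift, consistent = pool_frequency_shift(
    selected=[0.30, 0.28, 0.33, 0.31],
    control=[0.20, 0.22, 0.21, 0.19],
)
print(f"mean shift = {shift:.3f}, same direction in all replicates: {consistent}")
```

Requiring the same sign in every replicate is only a crude consistency check, included to show why replication matters in such designs; it is not a substitute for a proper test statistic.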
Affiliation(s)
- Stuart J Macdonald
  - Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66045, USA
  - Center for Computational Biology, University of Kansas, Lawrence, KS 66047, USA
- Dylan J Sims-West
  - Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66045, USA
- Anthony D Long
  - Department of Ecology and Evolutionary Biology, University of California at Irvine, Irvine, CA 92697, USA
20
Dumartinet T, Ravel S, Roussel V, Perez-Vicente L, Aguayo J, Abadie C, Carlier J. Complex adaptive architecture underlies adaptation to quantitative host resistance in a fungal plant pathogen. Mol Ecol 2021; 31:1160-1179. [PMID: 34845779 DOI: 10.1111/mec.16297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Revised: 11/12/2021] [Accepted: 11/17/2021] [Indexed: 11/26/2022]
Abstract
Plant pathogens often adapt to plant genetic resistance so characterization of the architecture underlying such an adaptation is required to understand the adaptive potential of pathogen populations. Erosion of banana quantitative resistance to a major leaf disease caused by polygenic adaptation of the causal agent, the fungus Pseudocercospora fijiensis, was recently identified in the northern Caribbean region. Genome scan and quantitative genetics approaches were combined to investigate the adaptive architecture underlying this adaptation. Thirty-two genomic regions showing host selection footprints were identified by pool sequencing of isolates collected from seven plantation pairs of two cultivars with different levels of quantitative resistance. Individual sequencing and phenotyping of isolates from one pair revealed significant and variable levels of correlation between haplotypes in 17 of these regions with a quantitative trait of pathogenicity (the diseased leaf area). The multilocus pattern of haplotypes detected in the 17 regions was found to be highly variable across all the population pairs studied. These results suggest complex adaptive architecture underlying plant pathogen adaptation to quantitative resistance with a polygenic basis, redundancy, and a low level of parallel evolution between pathogen populations. Candidate genes involved in quantitative pathogenicity and host adaptation of P. fijiensis were identified in genomic regions by combining annotation analysis with available biological data.
Affiliation(s)
- Thomas Dumartinet
  - CIRAD, UMR PHIM, Montpellier, France
  - PHIM, Univ Montpellier, INRAe, CIRAD, Montpellier SupAgro, Montpellier, France
- Sébastien Ravel
  - CIRAD, UMR PHIM, Montpellier, France
  - PHIM, Univ Montpellier, INRAe, CIRAD, Montpellier SupAgro, Montpellier, France
- Véronique Roussel
  - CIRAD, UMR PHIM, Montpellier, France
  - PHIM, Univ Montpellier, INRAe, CIRAD, Montpellier SupAgro, Montpellier, France
- Jaime Aguayo
  - ANSES, Laboratoire de la Santé des Végétaux (LSV), Unité de Mycologie, Malzéville, France
- Catherine Abadie
  - CIRAD, UMR PHIM, Montpellier, France
  - PHIM, Univ Montpellier, INRAe, CIRAD, Montpellier SupAgro, Montpellier, France
- Jean Carlier
  - CIRAD, UMR PHIM, Montpellier, France
  - PHIM, Univ Montpellier, INRAe, CIRAD, Montpellier SupAgro, Montpellier, France
21
Weller CA, Tilk S, Rajpurohit S, Bergland AO. Accurate, ultra-low coverage genome reconstruction and association studies in Hybrid Swarm mapping populations. G3-Genes Genomes Genetics 2021; 11:6156828. [PMID: 33677482 PMCID: PMC8759814 DOI: 10.1093/g3journal/jkab062] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/11/2020] [Accepted: 02/19/2021] [Indexed: 11/27/2022]
Abstract
Genetic association studies seek to uncover the link between genotype and phenotype, and often utilize inbred reference panels as a replicable source of genetic variation. However, inbred reference panels can differ substantially from wild populations in their genotypic distribution, patterns of linkage-disequilibrium, and nucleotide diversity. As a result, associations discovered using inbred reference panels may not reflect the genetic basis of phenotypic variation in natural populations. To address this problem, we evaluated a mapping population design where dozens to hundreds of inbred lines are outbred for few generations, which we call the Hybrid Swarm. The Hybrid Swarm approach has likely remained underutilized relative to pre-sequenced inbred lines due to the costs of genome-wide genotyping. To reduce sequencing costs and make the Hybrid Swarm approach feasible, we developed a computational pipeline that reconstructs accurate whole genomes from ultra-low-coverage (0.05X) sequence data in Hybrid Swarm populations derived from ancestors with phased haplotypes. We evaluate reconstructions using genetic variation from the Drosophila Genetic Reference Panel as well as variation from neutral simulations. We compared the power and precision of Genome-Wide Association Studies using the Hybrid Swarm, inbred lines, recombinant inbred lines (RILs), and highly outbred populations across a range of allele frequencies, effect sizes, and genetic architectures. Our simulations show that these different mapping panels vary in their power and precision, largely depending on the architecture of the trait. The Hybrid Swarm and RILs outperform inbred lines for quantitative traits, but not for monogenic ones. Taken together, our results demonstrate the feasibility of the Hybrid Swarm as a cost-effective method of fine-scale genetic mapping.
Affiliation(s)
- Cory A Weller
  - Department of Biology, University of Virginia, Charlottesville, VA 22904, USA
- Susanne Tilk
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
- Subhash Rajpurohit
  - Department of Biological and Life Sciences, Ahmedabad University, Ahmedabad 380009, India
- Alan O Bergland
  - Department of Biology, University of Virginia, Charlottesville, VA 22904, USA
22
Pelizzola M, Behr M, Li H, Munk A, Futschik A. Multiple haplotype reconstruction from allele frequency data. Nature Computational Science 2021; 1:262-271. [PMID: 38217170 DOI: 10.1038/s43588-021-00056-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/14/2020] [Accepted: 03/12/2021] [Indexed: 01/15/2024]
Abstract
Because haplotype information is of widespread interest in biomedical applications, effort has been put into their reconstruction. Here, we propose an efficient method, called haploSep, that is able to accurately infer major haplotypes and their frequencies just from multiple samples of allele frequency data. Even the accuracy of experimentally obtained allele frequencies can be improved by re-estimating them from our reconstructed haplotypes. From a methodological point of view, we model our problem as a multivariate regression problem where both the design matrix and the coefficient matrix are unknown. Compared to other methods, haploSep is very fast, with linear computational complexity in the haplotype length. We illustrate our method on simulated and real data focusing on experimental evolution and microbial data.
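One way to write down the regression formulation mentioned in this abstract is sketched here. The notation is assumed for illustration, since the abstract itself does not fix any symbols: Y holds the observed allele frequencies for N SNPs across T samples, S is the unknown 0/1 allele structure of m major haplotypes, Ω holds their unknown frequencies in each sample, and ε is noise.

```latex
Y = S\,\Omega + \varepsilon, \qquad
Y \in [0,1]^{N \times T}, \quad
S \in \{0,1\}^{N \times m}, \quad
\Omega \in [0,1]^{m \times T}
```

Under this reading, S and Ω play the roles of the unknown design and coefficient matrices, and both are estimated from Y alone.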
Affiliation(s)
- Marta Pelizzola
  - Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vienna, Austria
- Merle Behr
  - University of California, Berkeley, CA, USA
- Housen Li
  - University of Göttingen, Göttingen, Germany
  - Cluster of Excellence 'Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells' (MBExC), University of Göttingen, Göttingen, Germany
- Axel Munk
  - University of Göttingen, Göttingen, Germany
  - Cluster of Excellence 'Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells' (MBExC), University of Göttingen, Göttingen, Germany
  - Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
23
Zhao H, Wang S, Yuan X. Detection of Pathogenic Microbe Composition Using Next-Generation Sequencing Data. Front Genet 2020; 11:603093. [PMID: 33329748 PMCID: PMC7734255 DOI: 10.3389/fgene.2020.603093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2020] [Accepted: 10/21/2020] [Indexed: 11/23/2022] Open
Abstract
Next-generation sequencing (NGS) technologies have provided great opportunities to analyze pathogenic microbes with high-resolution data. The main goal is to accurately detect microbial composition and abundances in a sample. However, high similarity among sequences from different species and the existence of sequencing errors pose various challenges. Numerous methods have been developed for quantifying microbial composition and abundance, but they are not versatile enough for the analysis of samples with mixtures of noise. In this paper, we propose a new computational method, PGMicroD, for the detection of pathogenic microbial composition in a sample using NGS data. The method first filters the potentially mistakenly mapped reads and extracts multiple species-related features from the sequencing reads of 16S rRNA. Then it trains a Support Vector Machine classifier to predict the microbial composition. Finally, it groups all multiple-mapped sequencing reads into the references of the predicted species to estimate the abundance for each kind of species. The performance of PGMicroD is evaluated based on both simulation and real sequencing data and is compared with several existing methods. The results demonstrate that our proposed method achieves superior performance. The software package of PGMicroD is available at https://github.com/BDanalysis/PGMicroD.
Affiliation(s)
- Haiyong Zhao
  - School of Computer Science and Technology, Liaocheng University, Liaocheng, China
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Shuang Wang
  - School of Computer Science and Technology, Xidian University, Xi'an, China
- Xiguo Yuan
  - School of Computer Science and Technology, Xidian University, Xi'an, China
24
PhenoMIP: High-Throughput Phenotyping of Diverse Caenorhabditis elegans Populations via Molecular Inversion Probes. G3-Genes Genomes Genetics 2020; 10:3977-3990. [PMID: 32868407 PMCID: PMC7642933 DOI: 10.1534/g3.120.401656] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/29/2022]
Abstract
Whether generated within a lab setting or isolated from the wild, variant alleles continue to be an important resource for decoding gene function in model organisms such as Caenorhabditis elegans. With advances in massively parallel sequencing, multiple whole-genome sequenced (WGS) strain collections are now available to the research community. The Million Mutation Project (MMP) for instance, analyzed 2007 N2-derived, mutagenized strains. Individually, each strain averages ∼400 single nucleotide variants amounting to ∼80 protein-coding variants. The effects of these variants, however, remain largely uncharacterized and querying the breadth of these strains for phenotypic changes requires a method amenable to rapid and sensitive high-throughput analysis. Here we present a pooled competitive fitness approach to quantitatively phenotype subpopulations of sequenced collections via molecular inversion probes (PhenoMIP). We phenotyped the relative fitness of 217 mutant strains on multiple food sources and classified these into five categories. We also demonstrate on a subset of these strains, that their fitness defects can be genetically mapped. Overall, our results suggest that approximately 80% of MMP mutant strains may have a decreased fitness relative to the lab reference, N2. The costs of generating this form of analysis through WGS methods would be prohibitive while PhenoMIP analysis in this manner is accomplished at less than one-tenth of projected WGS costs. We propose methods for applying PhenoMIP to a broad range of population selection experiments in a cost-efficient manner that would be useful to the community at large.
25
Erickson PA, Weller CA, Song DY, Bangerter AS, Schmidt P, Bergland AO. Unique genetic signatures of local adaptation over space and time for diapause, an ecologically relevant complex trait, in Drosophila melanogaster. PLoS Genet 2020; 16:e1009110. [PMID: 33216740 PMCID: PMC7717581 DOI: 10.1371/journal.pgen.1009110] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/05/2019] [Revised: 12/04/2020] [Accepted: 09/10/2020] [Indexed: 02/07/2023] Open
Abstract
Organisms living in seasonally variable environments utilize cues such as light and temperature to induce plastic responses, enabling them to exploit favorable seasons and avoid unfavorable ones. Local adaptation can result in variation in seasonal responses, but the genetic basis and evolutionary history of this variation remain elusive. Many insects, including Drosophila melanogaster, are able to undergo an arrest of reproductive development (diapause) in response to unfavorable conditions. In D. melanogaster, the ability to diapause is more common in high latitude populations, where flies endure harsher winters, and in the spring, reflecting differential survivorship of overwintering populations. Using a novel hybrid swarm-based genome-wide association study, we examined the genetic basis and evolutionary history of ovarian diapause. We exposed outbred females to different temperatures and day lengths, characterized ovarian development for over 2800 flies, and reconstructed their complete, phased genomes. We found that diapause, scored at two different developmental cutoffs, has modest heritability, and we identified hundreds of SNPs associated with each of the two phenotypes. Alleles associated with one of the diapause phenotypes tend to be more common at higher latitudes, but these alleles do not show predictable seasonal variation. The collective signal of many small-effect, clinally varying SNPs can plausibly explain latitudinal variation in diapause seen in North America. Alleles associated with diapause are segregating in Zambia, suggesting that variation in diapause relies on ancestral polymorphisms, and both pro- and anti-diapause alleles have experienced selection in North America. Finally, we utilized outdoor mesocosms to track diapause under natural conditions. We found that hybrid swarms reared outdoors evolved increased propensity for diapause in late fall, whereas indoor control populations experienced no such change. Our results indicate that diapause is a complex, quantitative trait with different evolutionary patterns across time and space.
Affiliation(s)
- Priscilla A. Erickson
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
- Cory A. Weller
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
- Daniel Y. Song
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
- Alyssa S. Bangerter
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
- Paul Schmidt
  - Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Alan O. Bergland
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
|
26
|
Burghardt LT. Evolving together, evolving apart: measuring the fitness of rhizobial bacteria in and out of symbiosis with leguminous plants. The New Phytologist 2020; 228:28-34. [PMID: 31276218 DOI: 10.1111/nph.16045] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 03/19/2019] [Accepted: 06/20/2019] [Indexed: 05/11/2023]
Abstract
Most plant-microbe interactions are facultative, with microbes experiencing temporally and spatially variable selection. How this variation affects microbial evolution is poorly understood. Given its tractability and ecological and agricultural importance, the legume-rhizobia nitrogen-fixing symbiosis is a powerful model for identifying traits and genes underlying bacterial fitness. New technologies allow high-throughput measurement of the relative fitness of bacterial mutants, strains and species in mixed inocula in the host, rhizosphere and soil environments. I consider how host genetic variation (G × G), other environmental factors (G × E), and host life-cycle variation may contribute to the maintenance of genetic variation and adaptive trajectories of rhizobia - and, potentially, other facultative symbionts. Lastly, I place these findings in the context of developing beneficial inoculants in a changing climate.
Affiliation(s)
- Liana T Burghardt
  - Department of Plant and Microbial Biology, University of Minnesota, 140 Gortner Laboratory, 1479 Gortner Avenue, St Paul, MN, 55108, USA
|
27
|
Otte KA, Schlötterer C. Detecting selected haplotype blocks in evolve and resequence experiments. Mol Ecol Resour 2020; 21:93-109. [PMID: 32810339 PMCID: PMC7754423 DOI: 10.1111/1755-0998.13244] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/24/2020] [Revised: 07/30/2020] [Accepted: 08/04/2020] [Indexed: 12/15/2022]
Abstract
Shifting from the analysis of single nucleotide polymorphisms to the reconstruction of selected haplotypes greatly facilitates the interpretation of evolve and resequence (E&R) experiments. Merging highly correlated hitchhiker SNPs into haplotype blocks reduces thousands of candidates to few selected regions. Current methods of haplotype reconstruction from Pool‐seq data need a variety of data‐specific parameters that are typically defined ad hoc and require haplotype sequences for validation. Here, we introduce haplovalidate, a tool which detects selected haplotypes in Pool‐seq time series data without the need for sequenced haplotypes. Haplovalidate makes data‐driven choices of two key parameters for the clustering procedure, the minimum correlation between SNPs constituting a cluster and the window size. Applying haplovalidate to simulated E&R data reliably detects selected haplotype blocks with low false discovery rates. Importantly, our analyses identified a restriction of the haplotype block‐based approach to describe the genomic architecture of adaptation. We detected a substantial fraction of haplotypes containing multiple selection targets. These blocks were considered as one region of selection and therefore led to underestimation of the number of selection targets. We demonstrate that the separate analysis of earlier time points can significantly increase the separation of selection targets into individual haplotype blocks. We conclude that the analysis of selected haplotype blocks has great potential for the characterization of the adaptive architecture with E&R experiments.
Affiliation(s)
- Kathrin A Otte
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
|
28
|
Linder RA, Majumder A, Chakraborty M, Long A. Two Synthetic 18-Way Outcrossed Populations of Diploid Budding Yeast with Utility for Complex Trait Dissection. Genetics 2020; 215:323-342. [PMID: 32241804 PMCID: PMC7268983 DOI: 10.1534/genetics.120.303202] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/06/2020] [Accepted: 03/31/2020] [Indexed: 02/07/2023] Open
Abstract
Advanced-generation multiparent populations (MPPs) are a valuable tool for dissecting complex traits, having more power than genome-wide association studies to detect rare variants and higher resolution than F2 linkage mapping. To extend the advantages of MPPs in budding yeast, we describe the creation and characterization of two outbred MPPs derived from 18 genetically diverse founding strains. We carried out de novo assemblies of the genomes of the 18 founder strains, such that virtually all variation segregating between these strains is known, and represented those assemblies as Santa Cruz Genome Browser tracks. We discovered complex patterns of structural variation segregating among the founders, including a large deletion within the vacuolar ATPase VMA1, several different deletions within the osmosensor MSB2, a series of deletions and insertions at PRM7 and the adjacent BSC1, as well as copy number variation at the dehydrogenase ALD2. Resequenced haploid recombinant clones from the two MPPs have a median unrecombined block size of 66 kb, demonstrating that the population is highly recombined. We pool-sequenced the two MPPs to 3270× and 2226× coverage and demonstrated that we can accurately estimate local haplotype frequencies using pooled data. We further downsampled the pool-sequenced data to ∼20-40× and showed that local haplotype frequency estimates remained accurate, with median error rates of 0.8% and 0.6% at 20× and 40×, respectively. Haplotype frequencies are estimated much more accurately than SNP frequencies obtained directly from the same data. Deep sequencing of the two populations revealed that 10 or more founders are present at a detectable frequency for >98% of the genome, validating the utility of this resource for the exploration of the role of standing variation in the architecture of complex traits.
Affiliation(s)
- Robert A Linder
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
- Arundhati Majumder
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
- Mahul Chakraborty
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
- Anthony Long
  - Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
|
29
|
Barghi N, Schlötterer C. Shifting the paradigm in Evolve and Resequence studies: From analysis of single nucleotide polymorphisms to selected haplotype blocks. Mol Ecol 2020; 28:521-524. [PMID: 30793868 PMCID: PMC6850332 DOI: 10.1111/mec.14992] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/19/2018] [Revised: 12/16/2018] [Accepted: 12/18/2018] [Indexed: 12/18/2022]
Abstract
For almost a decade, the combination of whole genome sequencing with experimental evolution (Evolve and Resequence, E&R; Turner, Stewart, Fields, Rice, & Tarone, 2011) has been used to study adaptation in outcrossing organisms. However, complications caused by inversions and hitchhiking variants have prevented this powerful approach from living up to its potential. In this issue of Molecular Ecology, Michalak, Kang, Schou, Garner, and Loeschcke (2018) provide an important step ahead by using a population of Drosophila melanogaster devoid of segregating inversions to identify the genetic basis of resistance to five environmental stressors. They further address the challenge of hitchhiking variants by reconstructing selected haplotype blocks. While it is apparent that the haplotype block reconstruction needs further refinements, their work underpins the potential of E&R studies in Drosophila to address fundamental questions in evolutionary biology.
Affiliation(s)
- Neda Barghi
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
|
30
|
Burghardt LT, Trujillo DI, Epstein B, Tiffin P, Young ND. A Select and Resequence Approach Reveals Strain-Specific Effects of Medicago Nodule-Specific PLAT-Domain Genes. Plant Physiology 2020; 182:463-471. [PMID: 31653715 PMCID: PMC6945875 DOI: 10.1104/pp.19.00831] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/10/2019] [Accepted: 10/07/2019] [Indexed: 05/23/2023]
Abstract
Genetic studies of legume symbiosis with nitrogen-fixing rhizobial bacteria have traditionally focused on nodule and nitrogen-fixation phenotypes when hosts are inoculated with a single rhizobial strain. These approaches overlook the potential effect of host genes on rhizobial fitness (i.e. how many rhizobia are released from host nodules) and strain-specific effects of host genes (i.e. genome × genome interactions). Using Medicago truncatula mutants in the recently described nodule-specific PLAT domain (NPD) gene family, we show how inoculating plants with a mixed inoculum of 68 rhizobial strains (Ensifer meliloti) via a select-and-resequence approach can be used to efficiently assay host mutants for strain-specific effects of late-acting host genes on interacting bacteria. The deletion of a single NPD gene (npd2) or all five members of the NPD gene family (npd1-5) differentially altered the frequency of rhizobial strains in nodules even though npd2 mutants had no visible nodule morphology or N-fixation phenotype. Also, npd1-5 nodules were less diverse and had larger populations of colony-forming rhizobia despite their smaller size. Lastly, NPD mutations disrupt a positive correlation between strain fitness and wild-type host biomass. These changes indicate that the effects of NPD proteins are strain dependent and that NPD family members are not redundant with regard to their effects on rhizobial strains. Association analyses of the rhizobial strains in the mixed inoculation indicate that rhizobial genes involved in chromosome segregation, cell division, GABA metabolism, efflux systems, and stress tolerance play an important role in the strain-specific effects of NPD genes.
Affiliation(s)
- Liana T Burghardt
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota 55108
- Diana I Trujillo
  - Department of Plant Pathology, University of Minnesota, St. Paul, Minnesota 55108
- Brendan Epstein
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota 55108
- Peter Tiffin
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota 55108
- Nevin D Young
  - Department of Plant Pathology, University of Minnesota, St. Paul, Minnesota 55108
|
31
|
Accurate Allele Frequencies from Ultra-low Coverage Pool-Seq Samples in Evolve-and-Resequence Experiments. G3 (Bethesda, Md.) 2019; 9:4159-4168. [PMID: 31636085 PMCID: PMC6893198 DOI: 10.1534/g3.119.400755] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 02/07/2023]
Abstract
Evolve-and-resequence (E+R) experiments leverage next-generation sequencing technology to track the allele frequency dynamics of populations as they evolve. While previous work has shown that adaptive alleles can be detected by comparing frequency trajectories from many replicate populations, this power comes at the expense of high-coverage (>100x) sequencing of many pooled samples, which can be cost-prohibitive. Here, we show that accurate estimates of allele frequencies can be achieved with very shallow sequencing depths (<5x) via inference of known founder haplotypes in small genomic windows. This technique can be used to efficiently estimate frequencies for any number of bi-allelic SNPs in populations of any model organism founded with sequenced homozygous strains. Using both experimentally-pooled and simulated samples of Drosophila melanogaster, we show that haplotype inference can improve allele frequency accuracy by orders of magnitude for up to 50 generations of recombination, and is robust to moderate levels of missing data, as well as different selection regimes. Finally, we show that a simple linear model generated from these simulations can predict the accuracy of haplotype-derived allele frequencies in other model organisms and experimental designs. To make these results broadly accessible for use in E+R experiments, we introduce HAF-pipe, an open-source software tool for calculating haplotype-derived allele frequencies from raw sequencing data. Ultimately, by reducing sequencing costs without sacrificing accuracy, our method facilitates E+R designs with higher replication and resolution, and thereby, increased power to detect adaptive alleles.
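As a minimal sketch of the idea behind haplotype-derived allele frequencies (not the HAF-pipe implementation, whose internals are not described in this abstract), the frequency of an allele within a genomic window can be computed as the sum of the estimated frequencies of the founder haplotypes that carry it. The function name, the toy founder genotypes, and the haplotype frequencies below are all hypothetical.

```python
def haplotype_derived_af(hap_freqs, founder_alleles):
    """Return one allele frequency per SNP in a genomic window.

    hap_freqs: estimated frequencies of the founder haplotypes in the
        window (hypothetical input; must sum to 1).
    founder_alleles: one list per SNP; entry f is 1 if founder f carries
        the alternate allele at that SNP and 0 otherwise.
    """
    if abs(sum(hap_freqs) - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    freqs = []
    for snp in founder_alleles:
        if len(snp) != len(hap_freqs):
            raise ValueError("each SNP needs one genotype per founder")
        # The allele frequency is the total frequency of the founders
        # that carry the alternate allele.
        freqs.append(sum(h * a for h, a in zip(hap_freqs, snp)))
    return freqs


# Toy example: three founders, three bi-allelic SNPs in one window.
hap_freqs = [0.5, 0.25, 0.25]
founder_alleles = [
    [1, 0, 0],  # only founder 0 carries the alternate allele
    [0, 1, 1],  # founders 1 and 2 carry it
    [1, 1, 0],  # founders 0 and 1 carry it
]
afs = haplotype_derived_af(hap_freqs, founder_alleles)
print(afs)  # [0.5, 0.5, 0.75]
```

Because every SNP in the window shares the same haplotype frequency estimates, reads covering any SNP in the window contribute to every allele frequency in it, which is one intuition for why shallow coverage can suffice when founders are sequenced.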
|
32
|
Burghardt LT, Epstein B, Tiffin P. Legacy of prior host and soil selection on rhizobial fitness in planta. Evolution 2019; 73:2013-2023. [PMID: 31334838 DOI: 10.1111/evo.13807] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/02/2019] [Revised: 06/19/2019] [Accepted: 06/20/2019] [Indexed: 01/03/2023]
Abstract
Measuring selection acting on microbial populations in natural or even seminatural environments is challenging because many microbial populations experience variable selection. The majority of rhizobial bacteria are found in the soil. However, they also live symbiotically inside nodules of legume hosts and each nodule can release thousands of daughter cells back into the soil. We tested how past selection (i.e., legacies) by two plant genotypes and by the soil alone affected selection and genetic diversity within a population of 101 strains of Ensifer meliloti. We also identified allelic variants most strongly associated with soil- and host-dependent fitness. In addition to imposing direct selection on rhizobia populations, soil and host environments had lasting effects across host generations. Host presence and genotype during the legacy period explained 22% and 12% of the variance in the strain composition of nodule communities in the second cohort, respectively. Although strains with high host fitness in the legacy cohort tended to be enriched in the second cohort, the diversity of the strain community was greater when the second cohort was preceded by host rather than soil legacies. Our results indicate the potential importance of soil selection driving the evolution of these plant-associated microbes.
Affiliation(s)
- Liana T Burghardt
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, 55108
- Brendan Epstein
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, 55108
- Peter Tiffin
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, 55108
|
33
|
Barghi N, Tobler R, Nolte V, Jakšić AM, Mallard F, Otte KA, Dolezal M, Taus T, Kofler R, Schlötterer C. Genetic redundancy fuels polygenic adaptation in Drosophila. PLoS Biol 2019; 17:e3000128. [PMID: 30716062 PMCID: PMC6375663 DOI: 10.1371/journal.pbio.3000128] [Citation(s) in RCA: 155] [Impact Index Per Article: 25.8] [Received: 07/03/2018] [Revised: 02/14/2019] [Accepted: 01/14/2019] [Indexed: 12/31/2022] Open
Abstract
The genetic architecture of adaptive traits is of key importance to predict evolutionary responses. Most adaptive traits are polygenic (i.e., result from selection on a large number of genetic loci), but most molecularly characterized traits have a simple genetic basis. This discrepancy is best explained by the difficulty in detecting small allele frequency changes (AFCs) across many contributing loci. To resolve this, we use laboratory natural selection to detect signatures for selective sweeps and polygenic adaptation. We exposed 10 replicates of a Drosophila simulans population to a new temperature regime and uncovered a polygenic architecture of an adaptive trait with high genetic redundancy among beneficial alleles. We observed convergent responses for several phenotypes (e.g., fitness, metabolic rate, and fat content) and a strong polygenic response (99 selected alleles; mean s = 0.059). However, each of these selected alleles increased in frequency only in a subset of the evolving replicates. We discerned different evolutionary paradigms based on the heterogeneous genomic patterns among replicates. Redundancy and quantitative trait (QT) paradigms fitted the experimental data better than simulations assuming independent selective sweeps. Our results show that natural D. simulans populations harbor a vast reservoir of adaptive variation facilitating rapid evolutionary responses using multiple alternative genetic pathways converging at a new phenotypic optimum. This key property of beneficial alleles requires the modification of testing strategies in natural populations beyond the search for convergence on the molecular level.
Affiliation(s)
- Neda Barghi
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Raymond Tobler
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
- Viola Nolte
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Ana Marija Jakšić
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
- François Mallard
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Marlies Dolezal
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
  - Plattform Bioinformatik und Biostatistik, Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Thomas Taus
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
  - Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
- Robert Kofler
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
|
34
|
Abstract
Computationally inferring haplotype identities and their relative frequencies from pooled samples that are genotyped or sequenced (e.g., using next-generation sequencing), either genome-wide or segmentally, is useful for population genetics analysis. To carry out such analysis, one needs to understand the basics of using high-performance computing (HPC) facilities and the specifics of the corresponding computational tools. Here, we describe the basic knowledge and step-by-step usage of a number of tools for haplotype inference on genotyping or next-generation sequencing data.
|
35
|
Reppell M, Novembre J. Using pseudoalignment and base quality to accurately quantify microbial community composition. PLoS Comput Biol 2018; 14:e1006096. [PMID: 29659582 PMCID: PMC5945057 DOI: 10.1371/journal.pcbi.1006096] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/14/2017] [Revised: 05/10/2018] [Accepted: 03/19/2018] [Indexed: 12/31/2022] Open
Abstract
Pooled DNA from multiple unknown organisms arises in a variety of contexts, for example microbial samples from ecological or human health research. Determining the composition of pooled samples can be difficult, especially at the scale of modern sequencing data and reference databases. Here we propose a novel method for taxonomic profiling in pooled DNA that combines the speed and low-memory requirements of k-mer based pseudoalignment with a likelihood framework that uses base quality information to better resolve multiply mapped reads. We apply the method to the problem of classifying 16S rRNA reads using a reference database of known organisms, a common challenge in microbiome research. Using simulations, we show the method is accurate across a variety of read lengths, with different length reference sequences, at different sample depths, and when samples contain reads originating from organisms absent from the reference. We also assess performance in real 16S data, where we reanalyze previous genetic association data to show our method discovers a larger number of quantitative trait associations than other widely used methods. We implement our method in the software Karp, for k-mer based analysis of read pools, to provide a novel combination of speed and accuracy that is uniquely suited for enhancing discoveries in microbial studies.
Affiliation(s)
- Mark Reppell
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- John Novembre
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
|
36
|
Select and resequence reveals relative fitness of bacteria in symbiotic and free-living environments. Proc Natl Acad Sci U S A 2018; 115:2425-2430. [PMID: 29453274 DOI: 10.1073/pnas.1714246115] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Indexed: 12/16/2022] Open
Abstract
Assays to accurately estimate relative fitness of bacteria growing in multistrain communities can advance our understanding of how selection shapes diversity within a lineage. Here, we present a variant of the "evolve and resequence" approach both to estimate relative fitness and to identify genetic variants responsible for fitness variation of symbiotic bacteria in free-living and host environments. We demonstrate the utility of this approach by characterizing selection by two plant hosts and in two free-living environments (sterilized soil and liquid media) acting on synthetic communities of the facultatively symbiotic bacterium Ensifer meliloti. We find (i) selection that hosts exert on rhizobial communities depends on competition among strains, (ii) selection is stronger inside hosts than in either free-living environment, and (iii) a positive host-dependent relationship between relative strain fitness in multistrain communities and host benefits provided by strains in single-strain experiments. The greatest changes in allele frequencies in response to plant hosts are in genes associated with motility, regulation of nitrogen fixation, and host/rhizobia signaling. The approach we present provides a powerful complement to experimental evolution and forward genetic screens for characterizing selection in bacterial populations, identifying gene function, and surveying the functional importance of naturally occurring genomic variation.
|
37
|
Rode NO, Holtz Y, Loridon K, Santoni S, Ronfort J, Gay L. How to optimize the precision of allele and haplotype frequency estimates using pooled-sequencing data. Mol Ecol Resour 2017; 18:194-203. [PMID: 28977733 DOI: 10.1111/1755-0998.12723] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 02/07/2017] [Revised: 09/12/2017] [Accepted: 09/14/2017] [Indexed: 11/30/2022]
Abstract
Sequencing pools of individuals rather than individuals separately reduces the costs of estimating allele frequencies at many loci in many populations. Theoretical and empirical studies show that sequencing pools comprising a limited number of individuals (typically fewer than 50) provides reliable allele frequency estimates, provided that the DNA pooling and DNA sequencing steps are carefully controlled. Unequal contributions of different individuals to the DNA pool and the mean and variance in sequencing depth can both affect the standard error of allele frequency estimates. To our knowledge, no study has separately investigated the effect of these two factors on allele frequency estimates, so there is currently no method to estimate a priori the relative importance of unequal individual DNA contributions independently of sequencing depth. We develop a new analytical model for allele frequency estimation that explicitly distinguishes these two effects. Our model shows that the DNA pooling variance in a pooled sequencing experiment depends solely on two factors: the number of individuals within the pool and the coefficient of variation of individual DNA contributions to the pool. We present a new method to experimentally estimate this coefficient of variation when planning a pooled sequencing design where samples are either pooled before or after DNA extraction. Using this analytical and experimental framework, we provide guidelines to optimize the design of pooled sequencing experiments. Finally, we sequence replicated pools of inbred lines of the plant Medicago truncatula and show that the predictions from our model generally hold true when estimating the frequency of known multilocus haplotypes using pooled sequencing.
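The dependence of pooling variance on pool size and on the coefficient of variation (CV) of individual DNA contributions can be illustrated with a standard identity. The sketch below is not the authors' model; it rests on simplifying assumptions that are not taken from the paper: each individual carries a single allele (as for inbred lines or haploids), alleles are drawn independently with frequency p, and sequencing noise is ignored. With contribution weights w_i, the variance of the pooled frequency is p(1 - p) Σ w_i², and Σ w_i² equals (1 + CV²)/n when the CV is computed with the population variance. All function names are hypothetical.

```python
import math


def effective_pool_size(contributions):
    """Effective number of individuals, 1 / sum(w_i^2), where w_i is each
    individual's share of the DNA in the pool."""
    total = sum(contributions)
    weights = [c / total for c in contributions]
    return 1.0 / sum(w * w for w in weights)


def coefficient_of_variation(contributions):
    """CV of individual DNA contributions, using the population variance."""
    n = len(contributions)
    mean = sum(contributions) / n
    variance = sum((c - mean) ** 2 for c in contributions) / n
    return math.sqrt(variance) / mean


def pooling_variance(p, contributions):
    """Variance of the pooled allele frequency due to pooling alone,
    assuming one allele per individual drawn with frequency p."""
    return p * (1.0 - p) / effective_pool_size(contributions)


# Toy pool of four individuals, one contributing three times as much DNA.
contributions = [3.0, 1.0, 1.0, 1.0]
n = len(contributions)
cv = coefficient_of_variation(contributions)

# Both expressions give an effective pool size of about 3 instead of 4.
print(effective_pool_size(contributions))
print(n / (1.0 + cv ** 2))
print(pooling_variance(0.5, contributions))
```

With equal contributions the CV is zero and the effective pool size equals the actual number of individuals, so under these assumptions unequal pooling always costs precision.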
|
38
|
Lillie M, Sheng Z, Honaker CF, Dorshorst BJ, Ashwell CM, Siegel PB, Carlborg Ö. Genome-wide standing variation facilitates long-term response to bidirectional selection for antibody response in chickens. BMC Genomics 2017; 18:99. [PMID: 28100171 PMCID: PMC5244587 DOI: 10.1186/s12864-016-3414-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 10/05/2016] [Accepted: 12/12/2016] [Indexed: 12/13/2022] Open
Abstract
Background: Long-term selection experiments provide a powerful approach to gain empirical insights into adaptation, allowing researchers to uncover the targets of selection and infer their contributions to the mode and tempo of adaptation. Here we implement a pooled genome re-sequencing approach to investigate the consequences of 39 generations of bidirectional selection in White Leghorn chickens on a humoral immune trait: antibody response to sheep red blood cells. Results: We observed wide genome involvement in response to this selection regime. Many genomic regions were highly differentiated resulting from this experimental selection regime, an involvement of up to 20% of the chicken genome (208.8 Mb). While genetic drift has certainly contributed to this, we implement gene ontology, association analysis and population simulations to increase our confidence in candidate selective sweeps. Three strong candidate genes, MHC, SEMA5A and TGFBR2, are also presented. Conclusions: The extensive genomic changes highlight the polygenic genetic architecture of antibody response in these chicken populations, which are derived from a common founder population, demonstrating the extent of standing immunogenetic variation available at the onset of selection.
Affiliation(s)
- Mette Lillie
  - Department of Medical Biochemistry and Microbiology, Genomics, Uppsala University, Uppsala, 75123, Sweden
- Zheya Sheng
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Christa F Honaker
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
- Ben J Dorshorst
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
- Christopher M Ashwell
  - Prestage Department of Poultry Science, North Carolina State University, Raleigh, NC, 27695, USA
- Paul B Siegel
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
- Örjan Carlborg
  - Department of Medical Biochemistry and Microbiology, Genomics, Uppsala University, Uppsala, 75123, Sweden
|
39
|
Franssen SU, Barton NH, Schlötterer C. Reconstruction of Haplotype-Blocks Selected during Experimental Evolution. Mol Biol Evol 2016; 34:174-184. [PMID: 27702776 DOI: 10.1093/molbev/msw210] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 11/13/2022] Open
Abstract
The genetic analysis of experimentally evolving populations typically relies on short reads from pooled individuals (Pool-Seq). While this method provides reliable allele frequency estimates, the underlying haplotype structure remains poorly characterized. With small population sizes and adaptive variants that start from low frequencies, the interpretation of selection signatures in most Evolve and Resequencing studies remains challenging. To facilitate the characterization of selection targets, we propose a new approach that reconstructs selected haplotypes from replicated time series, using Pool-Seq data. We identify selected haplotypes through the correlated frequencies of alleles carried by them. Computer simulations indicate that selected haplotype-blocks of several Mb can be reconstructed with high confidence and low error rates, even when allele frequencies change only by 20% across three replicates. Applying this method to real data from D. melanogaster populations adapting to a hot environment, we identify a selected haplotype-block of 6.93 Mb. We confirm the presence of this haplotype-block in evolved populations by experimental haplotyping, demonstrating the power and accuracy of our haplotype reconstruction from Pool-Seq data. We propose that the combination of allele frequency estimates with haplotype information will provide the key to understanding the dynamics of adaptive alleles.
Affiliation(s)
- Nicholas H Barton
- Institute of Science and Technology Austria (IST Austria), Klosterneuburg, Austria
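The Franssen et al. abstract above describes identifying selected haplotypes through the correlated frequencies of the alleles they carry. The following Python sketch illustrates only that core idea. Everything in it is an assumption made for illustration: the function names, the greedy grouping rule, the 0.9 correlation threshold, and the toy trajectories are hypothetical and are not taken from the paper, whose actual method is more elaborate.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length frequency trajectories."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat trajectory carries no correlation signal
    return sxy / sqrt(sxx * syy)


def group_correlated_snps(trajectories, min_r=0.9):
    """Greedily group SNPs whose trajectories correlate with a group member.

    trajectories: dict mapping SNP name -> list of allele frequencies,
    concatenated over time points (and replicates, if available).
    min_r is a hypothetical threshold chosen for this sketch.
    """
    groups = []
    for name, traj in trajectories.items():
        for group in groups:
            if any(pearson(traj, trajectories[m]) >= min_r for m in group):
                group.append(name)
                break
        else:
            groups.append([name])  # no existing group matched
    return groups


# Toy data (invented): three SNPs rise together, one stays flat.
toy_trajectories = {
    "snp1": [0.10, 0.20, 0.35, 0.50],
    "snp2": [0.12, 0.22, 0.36, 0.52],
    "snp3": [0.50, 0.48, 0.51, 0.49],
    "snp4": [0.11, 0.19, 0.33, 0.49],
}
groups = group_correlated_snps(toy_trajectories)
```

On the toy data the three rising SNPs end up in one group, which is the kind of candidate block a haplotype-reconstruction method would then examine further. The greedy rule depends on input order and never merges groups, so it is a teaching device rather than a clustering method to rely on.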
|
40
|
Cao CC, Sun X. Ehapp2: Estimate haplotype frequencies from pooled sequencing data with prior database information. J Bioinform Comput Biol 2016; 14:1650017. [PMID: 27216711 DOI: 10.1142/s0219720016500177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
To reduce the cost of large-scale re-sequencing, multiple individuals are pooled together and sequenced, an approach called pooled sequencing. Pooled sequencing can provide a cost-effective alternative to sequencing individuals separately. To facilitate the application of pooled sequencing in haplotype-based disease association analysis, the critical step is to accurately estimate haplotype frequencies from pooled samples. Here we present Ehapp2 for estimating haplotype frequencies from pooled sequencing data by utilizing a database which provides prior information on known haplotypes. We first translate the problem of estimating the frequency of each haplotype into finding a sparse solution for a system of linear equations, where the NNREG algorithm is employed to obtain the solution. Simulation experiments reveal that Ehapp2 is robust to sequencing errors and able to estimate the frequencies of haplotypes with less than 3% average relative difference for pooled sequencing of a mixture of real Drosophila haplotypes with 50× total coverage, even when the sequencing error rate is as high as 0.05. Owing to the strategy that proportions for local haplotypes spanning multiple SNPs are accurately calculated first, Ehapp2 retains excellent estimation for recombinant haplotypes resulting from chromosomal crossover. Comparisons with present methods reveal that Ehapp2 is state-of-the-art for many sequencing study designs and more suitable for current massively parallel sequencing.
Affiliation(s)
- Chang-Chang Cao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, P. R. China
- Xiao Sun
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, P. R. China
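The Ehapp2 abstract above frames haplotype frequency estimation as solving a system of linear equations with a non-negative solution: each pooled allele frequency is the sum of the frequencies of the known haplotypes carrying that allele. The Python sketch below illustrates that formulation only. It does not implement NNREG or Ehapp2; a plain projected gradient descent is swapped in as the solver, and the function name, the iteration count, and the toy haplotypes are assumptions made for this example.

```python
def estimate_haplotype_frequencies(haplotypes, allele_freqs, n_iter=2000):
    """Estimate non-negative haplotype frequencies from pooled allele frequencies.

    haplotypes   : known haplotypes, each a list of 0/1 alleles (one per SNP);
                   at least one entry must be 1
    allele_freqs : observed pooled frequency of allele 1 at each SNP
    Solver: projected gradient descent on 0.5 * ||H f - a||^2 with f >= 0
    (an illustrative stand-in, not the NNREG algorithm named in the abstract).
    """
    n_hap, n_snp = len(haplotypes), len(allele_freqs)
    # trace(H^T H) bounds the largest eigenvalue of H^T H, so this step size
    # is small enough for the gradient iteration to converge.
    step = 1.0 / sum(allele ** 2 for hap in haplotypes for allele in hap)
    freqs = [1.0 / n_hap] * n_hap  # start from equal frequencies
    for _ in range(n_iter):
        # Residual between predicted and observed allele frequency per SNP.
        residual = [
            sum(freqs[k] * haplotypes[k][j] for k in range(n_hap)) - allele_freqs[j]
            for j in range(n_snp)
        ]
        # Gradient step followed by projection onto the constraint f >= 0.
        freqs = [
            max(0.0, freqs[k] - step * sum(haplotypes[k][j] * residual[j]
                                           for j in range(n_snp)))
            for k in range(n_hap)
        ]
    total = sum(freqs)
    return [f / total for f in freqs] if total > 0 else freqs


# Toy example (invented): three known haplotypes over four SNPs, mixed at
# true frequencies 0.5, 0.3 and 0.2.
known_haplotypes = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]
pooled_allele_freqs = [0.7, 0.5, 0.8, 0.2]
estimated = estimate_haplotype_frequencies(known_haplotypes, pooled_allele_freqs)
```

With noise-free input and linearly independent haplotypes, as in the toy data, the estimate recovers the mixing proportions. Real pooled data add sequencing error and unknown haplotypes, which is where the sparsity and prior-database ideas described in the abstract would matter.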
|
41
|
|
42
|
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity (Edinb) 2015; 114:431-40. [PMID: 25269380 PMCID: PMC4815507 DOI: 10.1038/hdy.2014.86] [Citation(s) in RCA: 169] [Impact Index Per Article: 16.9] [Received: 04/08/2014] [Revised: 07/01/2014] [Accepted: 07/14/2014] [Indexed: 12/20/2022] Open
Abstract
Evolve and resequence (E&R) is a new approach to investigate the genomic responses to selection during experimental evolution. By using whole genome sequencing of pools of individuals (Pool-Seq), this method can identify selected variants in controlled and replicable experimental settings. Reviewing the current state of the field, we show that E&R can be powerful enough to identify causative genes and possibly even single-nucleotide polymorphisms. We also discuss how the experimental design and the complexity of the trait could result in a large number of false positive candidates. We suggest experimental and analytical strategies to maximize the power of E&R to uncover the genotype-phenotype link and serve as an important research tool for a broad range of evolutionary questions.
Affiliation(s)
- C Schlötterer
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- R Kofler
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- E Versace
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy
- R Tobler
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Vienna Graduate School of Population Genetics, Vienna, Austria
- S U Franssen
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
|
43
|
Power analysis of artificial selection experiments using efficient whole genome simulation of quantitative traits. Genetics 2015; 199:991-1005. [PMID: 25672748 PMCID: PMC4391575 DOI: 10.1534/genetics.115.175075] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 10/13/2014] [Accepted: 02/05/2015] [Indexed: 11/18/2022] Open
Abstract
Evolve and resequence studies combine artificial selection experiments with massively parallel sequencing technology to study the genetic basis for complex traits. In these experiments, individuals are selected for extreme values of a trait, causing alleles at quantitative trait loci (QTL) to increase or decrease in frequency in the experimental population. We present a new analysis of the power of artificial selection experiments to detect and localize quantitative trait loci. This analysis uses a simulation framework that explicitly models whole genomes of individuals, quantitative traits, and selection based on individual trait values. We find that explicitly modeling QTL provides qualitatively different insights than considering independent loci with constant selection coefficients. Specifically, we observe how interference between QTL under selection affects the trajectories and lengthens the fixation times of selected alleles. We also show that a substantial portion of the genetic variance of the trait (50–100%) can be explained by detected QTL in as little as 20 generations of selection, depending on the trait architecture and experimental design. Furthermore, we show that power depends crucially on the opportunity for recombination during the experiment. Finally, we show that an increase in power is obtained by leveraging founder haplotype information to obtain allele frequency estimates.
|
44
|
Franssen SU, Nolte V, Tobler R, Schlötterer C. Patterns of linkage disequilibrium and long range hitchhiking in evolving experimental Drosophila melanogaster populations. Mol Biol Evol 2015; 32:495-509. [PMID: 25415966 PMCID: PMC4298179 DOI: 10.1093/molbev/msu320] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.7] [Indexed: 01/05/2023] Open
Abstract
Whole-genome resequencing of experimental populations evolving under a specific selection regime has become a popular approach to determine genotype-phenotype maps and understand adaptation to new environments. Despite its conceptual appeal and success in identifying some causative genes, it has become apparent that many studies suffer from an excess of candidate loci. Several explanations have been proposed for this phenomenon, but it is clear that information about the linkage structure during such experiments is needed. Until now only Pool-Seq (whole-genome sequencing of pools of individuals) data were available, which do not provide sufficient information about the correlation between linked sites. We address this problem in two complementary analyses of three replicate Drosophila melanogaster populations evolving to a new hot temperature environment for almost 70 generations. In the first analysis, we sequenced 58 haploid genomes from the founder population and evolved flies at generation 67. We show that during the experiment linkage disequilibrium (LD) increased almost uniformly over much greater distances than typically seen in Drosophila. In the second analysis, Pool-Seq time series data of the three replicates were combined with haplotype information from the founder population to follow blocks of initial haplotypes over time. We identified 17 selected haplotype-blocks that started at low frequencies in the base population and increased in frequency during the experiment. The size of these haplotype-blocks ranged from 0.082 to 4.01 Mb. Moreover, between 42% and 46% of the top candidate single nucleotide polymorphisms from the comparison of founder and evolved populations fell into the genomic region covered by the haplotype-blocks. We conclude that LD in such rising haplotype-blocks results in long range hitchhiking over multiple kilobase-sized regions. LD in such haplotype-blocks is therefore a major factor contributing to an excess of candidate loci. Although modifications of the experimental design may help to reduce the hitchhiking effect and allow for more precise mapping of causative variants, we also note that such haplotype-blocks might be well suited to study the dynamics of selected genomic regions during experimental evolution studies.
Affiliation(s)
- Viola Nolte
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Ray Tobler
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
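The Franssen et al. (2015) abstract above relies on linkage disequilibrium (LD) measured from sequenced haploid genomes. As a reminder of what such a measurement involves, the Python sketch below computes the standard pairwise statistic r² = D² / (p_A(1 − p_A) p_B(1 − p_B)), with D = p_AB − p_A p_B, from phased 0/1 haplotype data. The function name and the toy genotypes are hypothetical; the abstract does not state which LD statistic or software the authors used, so this should be read as a generic illustration.

```python
def ld_r_squared(site_a, site_b):
    """Pairwise LD (r^2) between two biallelic sites.

    site_a, site_b: lists of 0/1 alleles, one entry per haploid genome,
    in the same genome order. Returns None if either site is monomorphic,
    because r^2 is undefined in that case.
    """
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    # Frequency of genomes carrying allele 1 at both sites.
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


# Toy data (invented): eight haploid genomes.
fully_linked = ld_r_squared([1, 1, 1, 0, 0, 0, 0, 0],
                            [1, 1, 1, 0, 0, 0, 0, 0])
unlinked = ld_r_squared([1, 0, 1, 0, 1, 0, 1, 0],
                        [1, 1, 0, 0, 1, 1, 0, 0])
```

Two sites that always co-occur give r² = 1, and sites whose alleles combine at random give r² near 0. Pool-Seq reads alone cannot supply `p_ab` for distant sites, which is the limitation the abstract points to.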
|
45
|
Cao CC, Sun X. Accurate estimation of haplotype frequency from pooled sequencing data and cost-effective identification of rare haplotype carriers by overlapping pool sequencing. Bioinformatics 2014; 31:515-22. [PMID: 25304780 DOI: 10.1093/bioinformatics/btu670] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION A variety of hypotheses have been proposed for finding the missing heritability of complex diseases in genome-wide association studies. Studies have focused on the value of haplotypes to improve the power of detecting associations with disease. To facilitate haplotype-based association analysis, it is necessary to accurately estimate haplotype frequencies of pooled samples. RESULTS Taking advantage of databases that contain prior haplotypes, we present Ehapp, which is based on an algorithm for solving a system of linear equations, to estimate the frequencies of haplotypes from pooled sequencing data. Effects of various factors in sequencing on the performance are evaluated using simulated data. Our method could estimate the frequencies of haplotypes with only about 3% average relative difference for pooled sequencing of the mixture of 10 haplotypes with total coverage of 50×. When unknown haplotypes exist, our method maintains excellent performance for haplotypes with actual frequencies >0.05. Comparisons with present methods on simulated data in conjunction with publicly available Illumina sequencing data indicate that our method is state of the art for many sequencing study designs. We also demonstrate the feasibility of applying overlapping pool sequencing to identify rare haplotype carriers cost-effectively. AVAILABILITY AND IMPLEMENTATION Ehapp (in Perl) for the Linux platforms is available online (http://bioinfo.seu.edu.cn/Ehapp/). CONTACT xsun@seu.edu.cn SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chang-Chang Cao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Xiao Sun
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
|
46
|
Sequencing pools of individuals — mining genome-wide polymorphism data without big funding. Nat Rev Genet 2014; 15:749-63. [DOI: 10.1038/nrg3803] [Citation(s) in RCA: 512] [Impact Index Per Article: 46.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
|
47
|
Nuzhdin SV, Turner TL. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev 2013; 23:694-9. [PMID: 24239053 PMCID: PMC3872824 DOI: 10.1016/j.gde.2013.10.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 04/24/2013] [Revised: 08/20/2013] [Accepted: 10/03/2013] [Indexed: 12/18/2022]
Abstract
Building the connection between genetic and phenotypic variation is an important 'work in progress', and one that will enable proactive diagnosis and treatment in medicine, promote development of environment-targeted varieties in agriculture, and clarify the limits of species adaptation to changing environments in conservation. Quantitative trait loci (QTL) mapping and genome wide association (GWA) studies have recently been allied to an additional focus on 'hitchhiking' (HH) mapping, which uses changes in allele frequency due to artificial or natural selection. This older technique has been popularized by the falling costs of high throughput sequencing. Initial HH-resequencing experiments seem to have found many thousands of polymorphisms responding to selection. We argue that this interpretation appears too optimistic, and that the data might in fact be more consistent with dozens, rather than thousands, of loci under selection. We propose several developments required for sensible data analyses that will fully realize the great power of the HH technique, and outline ways of moving forward.
Affiliation(s)
- Sergey V Nuzhdin
- Program in Molecular and Computational Biology, University of Southern California, Los Angeles 90089, United States
|
48
|
Rellstab C, Zoller S, Tedder A, Gugerli F, Fischer MC. Validation of SNP allele frequencies determined by pooled next-generation sequencing in natural populations of a non-model plant species. PLoS One 2013; 8:e80422. [PMID: 24244686 PMCID: PMC3820589 DOI: 10.1371/journal.pone.0080422] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Received: 05/16/2013] [Accepted: 10/02/2013] [Indexed: 11/28/2022] Open
Abstract
Sequencing of pooled samples (Pool-Seq) using next-generation sequencing technologies has become increasingly popular, because it represents a rapid and cost-effective method to determine allele frequencies for single nucleotide polymorphisms (SNPs) in population pools. Validation of allele frequencies determined by Pool-Seq has been attempted using an individual genotyping approach, but these studies tend to use samples from existing model organism databases or DNA stores, and do not validate a realistic setup for sampling natural populations. Here we used pyrosequencing to validate allele frequencies determined by Pool-Seq in three natural populations of Arabidopsis halleri (Brassicaceae). The allele frequency estimates of the pooled population samples (consisting of 20 individual plant DNA samples) were determined after mapping Illumina reads to (i) the publicly available, high-quality reference genome of a closely related species (Arabidopsis thaliana) and (ii) our own de novo draft genome assembly of A. halleri. We then pyrosequenced nine selected SNPs using the same individuals from each population, resulting in a total of 540 samples. Our results show a highly significant and accurate relationship between pooled and individually determined allele frequencies, irrespective of the reference genome used. Allele frequencies differed on average by less than 4%. There was no tendency that either the Pool-Seq or the individual-based approach resulted in higher or lower estimates of allele frequencies. Moreover, the rather high coverage in the mapping to the two reference genomes, ranging from 55 to 284x, had no significant effect on the accuracy of the Pool-Seq. A resampling analysis showed that only very low coverage values (below 10-20x) would substantially reduce the precision of the method. We therefore conclude that a pooled re-sequencing approach is well suited for analyses of genetic variation in natural populations.
Affiliation(s)
- Christian Rellstab
- Biodiversity and Conservation Biology, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
- Stefan Zoller
- Genetic Diversity Centre, ETH Zürich, Zürich, Switzerland
- Andrew Tedder
- Institute of Evolutionary Biology and Environmental Studies and Institute of Plant Biology, University of Zürich, Zürich, Switzerland
- Felix Gugerli
- Biodiversity and Conservation Biology, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
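The Rellstab et al. abstract above reports that only very low coverage (below 10-20x) would substantially reduce the precision of Pool-Seq allele frequency estimates. A rough way to see why is the binomial standard error of a frequency estimated from a fixed number of reads. The Python sketch below is a back-of-the-envelope illustration under a stated assumption, namely that reads sample alleles from the pool independently; it is not the resampling analysis the authors performed, and the function name is hypothetical. The coverage values are taken from the abstract.

```python
from math import sqrt


def read_sampling_se(freq, coverage):
    """Binomial standard error of an allele frequency estimated from reads.

    freq     : allele frequency in the DNA pool (0 to 1)
    coverage : number of reads covering the site
    Assumes independent sampling of reads; ignores pool-construction,
    mapping, and sequencing-error effects.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie in [0, 1]")
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return sqrt(freq * (1.0 - freq) / coverage)


# Standard error at an intermediate frequency for the coverage values
# mentioned in the abstract (10x, 20x, and the 55x to 284x range).
se_by_coverage = {c: read_sampling_se(0.5, c) for c in (10, 20, 55, 284)}
```

Under this simple model the standard error shrinks with the square root of coverage, so gains taper off as coverage grows, which is at least qualitatively consistent with the abstract's observation that coverage between 55x and 284x had no significant effect on accuracy.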
|
49
|
Kum CK, Thorburn D, Ghilagaber G, Gil P, Björkman A. On the effects of malaria treatment on parasite drug resistance--probability modelling of genotyped malaria infections. Int J Biostat 2013; 9:/j/ijb.2013.9.issue-1/ijb-2012-0016/ijb-2012-0016.xml. [PMID: 24127546 DOI: 10.1515/ijb-2012-0016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
We compare the frequency of resistant genes of malaria parasites before treatment and at first malaria incidence after treatment. The data come from a clinical trial at two health facilities in Tanzania and concern single nucleotide polymorphisms (SNPs) at three positions believed to be related to resistance to malaria treatment. A problem is that mixed infections are common, which obscures both the underlying frequency of alleles at each locus and the associations between loci in samples where alleles are mixed. We use combinatorics and quite involved probability methods to handle multiple infections and multiple haplotypes. Infections with the different haplotypes seemed to be independent of each other. We showed that at two of the three studied SNPs, the proportion of resistant genes had increased after treatment with sulfadoxine-pyrimethamine alone, but when it was given in combination with artesunate, no effect was noticed. First recurrences of malaria were associated more with sulfadoxine-pyrimethamine alone as treatment than with its combination with artesunate. We also found that the recruited children had two different ongoing malaria infections where the parasites had different gene types.
|
50
|
Jajamovich GH, Iliadis A, Anastassiou D, Wang X. Maximum-parsimony haplotype frequencies inference based on a joint constrained sparse representation of pooled DNA. BMC Bioinformatics 2013; 14:270. [PMID: 24010487 PMCID: PMC3847492 DOI: 10.1186/1471-2105-14-270] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/12/2013] [Accepted: 08/27/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND DNA pooling constitutes a cost-effective alternative in genome wide association studies. In DNA pooling, equimolar amounts of DNA from different individuals are mixed into one sample and the frequency of each allele in each position is observed in a single genotype experiment. The identification of haplotype frequencies from pooled data in addition to single locus analysis is of separate interest within these studies as haplotypes could increase statistical power and provide additional insight. RESULTS We developed a method for maximum-parsimony haplotype frequency estimation from pooled DNA data based on the sparse representation of the DNA pools in a dictionary of haplotypes. Extensions to scenarios where data are noisy or even missing are also presented. The resulting method is first applied to simulated data based on the haplotypes and their associated frequencies of the AGT gene. We further evaluate our methodology on datasets consisting of SNPs from the first 7 Mb of the HapMap CEU population. Noise and missing data were further introduced in the datasets in order to test the extensions of the proposed method. Both HIPPO and HAPLOPOOL were also applied to these datasets to compare performances. CONCLUSIONS We evaluate our methodology on scenarios where pooling is more efficient relative to individual genotyping; that is, in datasets that contain pools with a small number of individuals. We show that in such scenarios our methodology outperforms state-of-the-art methods such as HIPPO and HAPLOPOOL.
Affiliation(s)
- Guido H Jajamovich
- Electrical Engineering Department, Columbia University, New York NY 10027, USA
|