1. Perkins SA, Neafsey DE, Early AM. Heterogeneous constraint and adaptation across the malaria parasite life cycle. bioRxiv 2025:2025.02.11.636054. [PMID: 39990389 PMCID: PMC11844417 DOI: 10.1101/2025.02.11.636054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Evolutionary forces vary across genomes, creating disparities in how traits evolve. In organisms with complex life cycles, it is unclear how intrinsic differences among discrete life stages impact evolution. We looked for life history-driven changes in patterns of adaptation in Plasmodium falciparum, a malaria-causing parasite with a multi-stage life cycle. Categorizing genes based on their expression in different life stages, we compared patterns of between- and within-species polymorphism across stages by estimating nonsynonymous to synonymous substitution rate ratios (dN/dS) and mean pairwise nucleotide diversity (πNS/πS). Considering these alongside estimates of Tajima's D, fixation probability, adaptive divergence proportion and rate, and FST, we looked for changes in the drift-selection balance in life stages subject to transmission bottlenecks and changes in ploidy. We observed signals of reduced selection efficacy in genes exclusively expressed in sporozoites, the parasite form transmitted from mosquitoes to humans and often targeted by vaccines and monoclonal antibodies. We discuss implications for how parasites evolve to resist therapeutics and consider functional, molecular, and population genetic factors that could contribute to these patterns.
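As an illustration of the mean pairwise nucleotide diversity mentioned in this abstract, the following is a minimal sketch and not the authors' pipeline. It assumes equal-length, gap-free aligned sequences; the function name `pairwise_diversity` is hypothetical. A ratio such as πNS/πS would be obtained by applying it separately to nonsynonymous and synonymous sites, a step not shown here.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean number of pairwise differences per site (pi).

    Assumes `seqs` is a list of aligned sequences of equal length
    with no gaps or missing data (an illustrative simplification).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    total_diffs = 0
    n_pairs = 0
    # Compare every unordered pair of sequences site by site.
    for a, b in combinations(seqs, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "AATT"]  # made-up example data
    print(pairwise_diversity(toy_alignment))
```

Averaging over pairs rather than counting segregating sites makes the statistic sensitive to allele frequencies, which is why it is often contrasted with site-count estimators in statistics such as Tajima's D.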
Affiliation(s)
- Sarah A Perkins
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Daniel E Neafsey
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Angela M Early
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2. Árnason E, Koskela J, Halldórsdóttir K, Eldon B. Sweepstakes reproductive success via pervasive and recurrent selective sweeps. eLife 2023; 12:80781. [PMID: 36806325 PMCID: PMC9940914 DOI: 10.7554/elife.80781] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/03/2022] [Accepted: 12/28/2022] [Indexed: 02/22/2023]
Abstract
Highly fecund natural populations characterized by high early mortality abound, yet our knowledge about their recruitment dynamics is somewhat rudimentary. This knowledge gap has implications for our understanding of genetic variation, population connectivity, local adaptation, and the resilience of highly fecund populations. The concept of sweepstakes reproductive success, which posits a considerable variance and skew in individual reproductive output, is key to understanding the distribution of individual reproductive success. However, it still needs to be determined whether highly fecund organisms reproduce through sweepstakes and, if they do, the relative roles of neutral and selective sweepstakes. Here, we use coalescent-based statistical analysis of population genomic data to show that selective sweepstakes likely explain recruitment dynamics in the highly fecund Atlantic cod. We show that the Kingman coalescent (modelling no sweepstakes) and the Xi-Beta coalescent (modelling random sweepstakes), including complex demography and background selection, do not provide an adequate fit for the data. The Durrett-Schweinsberg coalescent, in which selective sweepstakes result from recurrent and pervasive selective sweeps of new mutations, offers greater explanatory power. Our results show that models of sweepstakes reproduction and multiple-merger coalescents are relevant and necessary for understanding genetic diversity in highly fecund natural populations. These findings have fundamental implications for understanding the recruitment variation of fish stocks and general evolutionary genomics of high-fecundity organisms.
Affiliation(s)
- Einar Árnason
  - Institute of Life and Environmental Sciences, University of Iceland, Reykjavik, Iceland
  - Department of Organismal and Evolutionary Biology, Harvard University, Cambridge, United States
- Jere Koskela
  - Department of Statistics, University of Warwick, Coventry, United Kingdom
- Katrín Halldórsdóttir
  - Institute of Life and Environmental Sciences, University of Iceland, Reykjavik, Iceland
- Bjarki Eldon
  - Leibniz Institute for Evolution and Biodiversity Science, Museum für Naturkunde, Berlin, Germany
3. Murga-Moreno J, Coronado-Zamora M, Casillas S, Barbadilla A. impMKT: the imputed McDonald and Kreitman test, a straightforward correction that significantly increases the evidence of positive selection of the McDonald and Kreitman test at the gene level. G3 Genes|Genomes|Genetics 2022; 12:6670623. [PMID: 35976111 PMCID: PMC9526038 DOI: 10.1093/g3journal/jkac206] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/21/2022] [Accepted: 07/28/2022] [Indexed: 11/14/2022]
Abstract
The McDonald and Kreitman test is one of the most powerful and widely used methods to detect and quantify recurrent natural selection in DNA sequence data. One of its main limitations is the underestimation of positive selection due to the presence of slightly deleterious variants segregating at low frequencies. Although several approaches have been developed to overcome this limitation, most of them work on gene pooled analyses. Here, we present the imputed McDonald and Kreitman test (impMKT), a new straightforward approach for the detection of positive selection and other selection components of the distribution of fitness effects at the gene level. We compare imputed McDonald and Kreitman test with other widely used McDonald and Kreitman test approaches considering both simulated and empirical data. By applying imputed McDonald and Kreitman test to humans and Drosophila data at the gene level, we substantially increase the statistical evidence of positive selection with respect to previous approaches (e.g. by 50% and 157% compared with the McDonald and Kreitman test in Drosophila and humans, respectively). Finally, we review the minimum number of genes required to obtain a reliable estimation of the proportion of adaptive substitution (α) in gene pooled analyses by using the imputed McDonald and Kreitman test compared with other McDonald and Kreitman test implementations. Because of its simplicity and increased power to detect recurrent positive selection on genes, we propose the imputed McDonald and Kreitman test as the first straightforward approach for testing specific evolutionary hypotheses at the gene level. The software implementation and population genomics data are available at the web-server imkt.uab.cat.
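For orientation, the classical McDonald and Kreitman estimate of the proportion of adaptive substitutions (α) that this abstract builds on can be sketched as below. This is the standard uncorrected estimator, α = 1 − (Ds·Pn)/(Dn·Ps), not the impMKT correction itself; the function names and the example counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Classical MK estimate of alpha, the proportion of adaptive substitutions.

    dn, ds: nonsynonymous / synonymous fixed differences (divergence)
    pn, ps: nonsynonymous / synonymous polymorphisms
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


def neutrality_index(dn, ds, pn, ps):
    """Neutrality index NI = (Pn/Ps) / (Dn/Ds); alpha equals 1 - NI."""
    if dn == 0 or ps == 0 or ds == 0:
        raise ValueError("NI is undefined when dn, ds, or ps is zero")
    return (pn / ps) / (dn / ds)


if __name__ == "__main__":
    # Made-up counts for a single gene, for illustration only.
    print(mk_alpha(dn=20, ds=40, pn=10, ps=40))
    print(neutrality_index(dn=20, ds=40, pn=10, ps=40))
```

Slightly deleterious nonsynonymous variants segregating at low frequency inflate Pn and therefore bias this α downward, which is the limitation the abstract says corrected variants of the test aim to address.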
Affiliation(s)
- Jesús Murga-Moreno
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Marta Coronado-Zamora
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Sònia Casillas
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Antonio Barbadilla
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
4. Ramstein GP, Buckler ES. Prediction of evolutionary constraint by genomic annotations improves functional prioritization of genomic variants in maize. Genome Biol 2022; 23:183. [PMID: 36050782 PMCID: PMC9438327 DOI: 10.1186/s13059-022-02747-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/18/2022] [Accepted: 08/15/2022] [Indexed: 11/10/2022]
Abstract
Background: Crop improvement through cross-population genomic prediction and genome editing requires identification of causal variants at high resolution, within fewer than hundreds of base pairs. Most genetic mapping studies have generally lacked such resolution. In contrast, evolutionary approaches can detect genetic effects at high resolution, but they are limited by shifting selection, missing data, and low depth of multiple-sequence alignments. Here we use genomic annotations to accurately predict nucleotide conservation across angiosperms, as a proxy for fitness effect of mutations. Results: Using only sequence analysis, we annotate nonsynonymous mutations in 25,824 maize gene models, with information from bioinformatics and deep learning. Our predictions are validated by experimental information: within-species conservation, chromatin accessibility, and gene expression. According to gene ontology and pathway enrichment analyses, predicted nucleotide conservation points to genes in central carbon metabolism. Importantly, it improves genomic prediction for fitness-related traits such as grain yield, in elite maize panels, by stringent prioritization of fewer than 1% of single-site variants. Conclusions: Our results suggest that predicting nucleotide conservation across angiosperms may effectively prioritize sites most likely to impact fitness-related traits in crops, without being limited by shifting selection, missing data, and low depth of multiple-sequence alignments. Our approach, Prediction of mutation Impact by Calibrated Nucleotide Conservation (PICNC), could be useful to select polymorphisms for accurate genomic prediction, and candidate mutations for efficient base editing. The trained PICNC models and predicted nucleotide conservation at protein-coding SNPs in maize are publicly available in CyVerse (10.25739/hybz-2957).
Affiliation(s)
- Guillaume P Ramstein
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark
  - Institute for Genomic Diversity, Cornell University, Ithaca, NY, 14853, USA
- Edward S Buckler
  - Institute for Genomic Diversity, Cornell University, Ithaca, NY, 14853, USA
  - USDA-ARS, Ithaca, NY, 14853, USA
5. Liang YY, Chen XY, Zhou BF, Mitchell-Olds T, Wang B. Globally Relaxed Selection and Local Adaptation in Boechera stricta. Genome Biol Evol 2022; 14:evac043. [PMID: 35349686 PMCID: PMC9011030 DOI: 10.1093/gbe/evac043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/23/2022] [Indexed: 11/25/2022]
Abstract
The strength of selection varies among populations and across the genome, but the determinants of efficacy of selection remain unclear. In this study, we used whole-genome sequencing data from 467 Boechera stricta accessions to quantify the strength of selection and characterize the pattern of local adaptation. We found low genetic diversity on 0-fold degenerate sites and conserved non-coding sites, indicating functional constraints on these regions. The estimated distribution of fitness effects and the proportion of fixed substitutions suggest relaxed negative and positive selection in B. stricta. Among the four population groups, the NOR and WES groups have smaller effective population size (Ne), higher proportions of effectively neutral sites, and lower rates of adaptive evolution compared with UTA and COL groups, reflecting the effect of Ne on the efficacy of natural selection. We also found weaker selection on GC-biased sites compared with GC-conservative (unbiased) sites, suggesting that GC-biased gene conversion has affected the strength of selection in B. stricta. We found mixed evidence for the role of the recombination rate on the efficacy of selection. Positive and negative selection were stronger in high-recombination regions compared with low-recombination regions in COL but not in other groups. By scanning the genome, we found different subsets of selected genes suggesting differential adaptation among B. stricta groups. These results show that differences in effective population size, nucleotide composition, and recombination rate are important determinants of the efficacy of selection. This study enriches our understanding of the roles of natural selection and local adaptation in shaping genomic variation.
Affiliation(s)
- Yi-Ye Liang
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Xue-Yan Chen
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Biao-Feng Zhou
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
  - University of the Chinese Academy of Sciences, Beijing, China
- Baosheng Wang
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou, China
6. Soni V, Eyre-Walker A. OUP accepted manuscript. Genome Biol Evol 2022; 14:6528851. [PMID: 35166775 PMCID: PMC8882387 DOI: 10.1093/gbe/evac028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/09/2022] [Indexed: 12/05/2022]
Abstract
The rate of amino acid substitution has been shown to be correlated to a number of factors including the rate of recombination, the age of the gene, the length of the protein, mean expression level, and gene function. However, the extent to which these correlations are due to adaptive and nonadaptive evolution has not been studied in detail, at least not in hominids. We find that the rate of adaptive evolution is significantly positively correlated to the rate of recombination, protein length and gene expression level, and negatively correlated to gene age. These correlations remain significant when each factor is controlled for in turn, except when controlling for expression in an analysis of protein length; and they also generally remain significant when biased gene conversion is taken into account. However, the positive correlations could be an artifact of population size contraction. We also find that the rate of nonadaptive evolution is negatively correlated to each factor, and all these correlations survive controlling for each other and biased gene conversion. Finally, we examine the effect of gene function on rates of adaptive and nonadaptive evolution; we confirm that virus-interacting proteins (VIPs) have higher rates of adaptive and lower rates of nonadaptive evolution, but we also demonstrate that there is significant variation in the rate of adaptive and nonadaptive evolution between GO categories when removing VIPs. We estimate that the VIP/non-VIP axis explains about 5–8 fold more of the variance in evolutionary rate than GO categories.
Affiliation(s)
- Vivak Soni
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
7. Wang F, Tekle YI. Variation of natural selection in the Amoebozoa reveals heterogeneity across the phylogeny and adaptive evolution in diverse lineages. Front Ecol Evol 2022; 10:851816. [PMID: 36874909 PMCID: PMC9980437 DOI: 10.3389/fevo.2022.851816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
The evolution and diversity of the supergroup Amoebozoa is complex and poorly understood. The supergroup encompasses predominantly amoeboid lineages characterized by extreme diversity in phenotype, behavior and genetics. The study of natural selection, a driving force of diversification, within and among species of Amoebozoa will play a crucial role in understanding the evolution of the supergroup. In this study, we searched for traces of natural selection based on a set of highly conserved protein-coding genes in a phylogenetic framework from a broad sampling of amoebozoans. Using these genes, we estimated substitution rates and inferred patterns of selective pressure in lineages and sites with various models. We also examined the effect of selective pressure on codon usage bias and potential correlations with observed biological traits and habitat. Results showed large heterogeneity of selection across lineages of Amoebozoa, indicating potential species-specific optimization of adaptation to their diverse ecological environment. Overall, lineages in Tubulinea had undergone stronger purifying selection with higher average substitution rates compared to Discosea and Evosea. Evidence of adaptive evolution was observed in some representative lineages and in a gene (Rpl7a) within Evosea, suggesting potential innovation and beneficial mutations in these lineages. Our results revealed that members of the fast-evolving lineages, Entamoeba and Cutosea, all underwent strong purifying selection but had distinct patterns of codon usage bias. For the first time, this study revealed an overall pattern of natural selection across the phylogeny of Amoebozoa and provided significant implications on their distinctive evolutionary processes.
Affiliation(s)
- Fang Wang
  - Department of Biology, Spelman College, Atlanta, GA, United States
- Yonas I Tekle
  - Department of Biology, Spelman College, Atlanta, GA, United States
8. Huang YF. Dissecting genomic determinants of positive selection with an evolution-guided regression model. Mol Biol Evol 2021; 39:6379733. [PMID: 34597406 PMCID: PMC8763110 DOI: 10.1093/molbev/msab291] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022]
Abstract
In evolutionary genomics, it is fundamentally important to understand how characteristics of genomic sequences, such as gene expression level, determine the rate of adaptive evolution. While numerous statistical methods, such as the McDonald–Kreitman (MK) test, are available to examine the association between genomic features and the rate of adaptation, we currently lack a statistical approach to disentangle the independent effect of a genomic feature from the effects of other correlated genomic features. To address this problem, I present a novel statistical model, the MK regression, which augments the MK test with a generalized linear model. Analogous to the classical multiple regression model, the MK regression can analyze multiple genomic features simultaneously to infer the independent effect of a genomic feature, holding constant all other genomic features. Using the MK regression, I identify numerous genomic features driving positive selection in chimpanzees. These features include well-known ones, such as local mutation rate, residue exposure level, tissue specificity, and immune genes, as well as new features not previously reported, such as gene expression level and metabolic genes. In particular, I show that highly expressed genes may have a higher adaptation rate than their weakly expressed counterparts, even though a higher expression level may impose stronger negative selection. Also, I show that metabolic genes may have a higher adaptation rate than their nonmetabolic counterparts, possibly due to recent changes in diet in primate evolution. Overall, the MK regression is a powerful approach to elucidate the genomic basis of adaptation.
Affiliation(s)
- Yi-Fei Huang
  - Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
9. Ando N, Sekizuka T, Yokoyama E, Aihara Y, Konishi N, Matsumoto Y, Ishida K, Nagasawa K, Jourdan-Da Silva N, Suzuki M, Kimura H, Le Hello S, Murakami K, Kuroda M, Hirai S, Fukaya S. Whole Genome Analysis Detects the Emergence of a Single Salmonella enterica Serovar Chester Clone in Japan's Kanto Region. Front Microbiol 2021; 12:705679. [PMID: 34385991 PMCID: PMC8354586 DOI: 10.3389/fmicb.2021.705679] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/06/2021] [Accepted: 07/05/2021] [Indexed: 11/18/2022]
Abstract
In Japan's Kanto region, the number of Salmonella enterica serovar Chester infections increased temporarily between 2014 and 2016. Concurrently with this temporal increase in the Kanto region, S. Chester isolates belonging to one clonal group were causing repetitive outbreaks in Europe. A recent study reported that the European outbreaks were associated with travelers who had been exposed to contaminated food in Morocco, possibly seafood. Because Japan imports a large amount of seafood from Morocco, we aimed to establish whether the temporal increase in S. Chester infections in the Kanto region was associated with imported Moroccan seafood. Short sequence reads from the whole-genome sequencing of 47 S. Chester isolates from people in the Kanto region (2014-2016), and the additional genome sequences from 58 isolates from the European outbreaks, were analyzed. The reads were compared with the complete genome sequence from a S. Chester reference strain, and 347 single nucleotide polymorphisms (SNPs) were identified. These SNPs were used in this study. Cluster and Bayesian cluster analyses showed that the Japanese and European isolates fell into two different clusters. Therefore, ΦPT and IAS values were calculated to evaluate genetic differences between these clusters. The results revealed that the Japanese and European isolates were genetically distinct populations. Our root-to-tip analysis showed that the Japanese isolates originating from one clone had accumulated mutations, suggesting that an emergence of this organism occurred. A minimum spanning tree analysis demonstrated no correlation between genetic and geographical distances in the Japanese isolates, suggesting that the emergence of the serovar in the Kanto region did not involve person-to-person contact; rather, it occurred through food consumption. The dN/dS ratio indicated that the Japanese strain has evolved under positive selection pressure. Generally, a population of bacterial clones in a reservoir faces negative selection pressure. Therefore, the Japanese strain must have existed outside of any reservoir during its emergence. In conclusion, S. Chester isolates originating from one clone probably emerged in the Kanto region via the consumption of contaminated foods other than imported Moroccan seafood. The emerging strain may not have established a reservoir for survival in the food supply chain, resulting in its disappearance after 2017.
Affiliation(s)
- Naoshi Ando
  - Division of Bacteriology, Chiba Prefectural Institute of Public Health, Chiba, Japan
- Tsuyoshi Sekizuka
  - Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Eiji Yokoyama
  - Division of Bacteriology, Chiba Prefectural Institute of Public Health, Chiba, Japan
- Yoshiyuki Aihara
  - Division of Bacteriology, Ibaraki Prefectural Institute of Public Health, Mito, Japan
- Noriko Konishi
  - Department of Microbiology, Tokyo Metropolitan Institute of Public Health, Tokyo, Japan
- Yuko Matsumoto
  - Microbiological Testing and Research Division, Yokohama City Institute of Public Health, Yokohama, Japan
- Koo Nagasawa
  - Laboratory of Cancer Genetics, Chiba Cancer Center Research Institute, Chiba, Japan
- Motoi Suzuki
  - Center for Surveillance, Immunization, and Epidemiologic Research, National Institute of Infectious Diseases, Tokyo, Japan
- Hirokazu Kimura
  - Faculty of Health Science, School of Medical Technology, Gunma Paz University, Takasaki, Japan
- Simon Le Hello
  - French National Reference Center for E. coli, Shigella and Salmonella, Institute Pasteur, Paris, France
  - Groupe de Recherche sur l’Adaptation Microbienne (GRAM 2.0, EA2656), Normandy University, UNICAEN, UNIROUEN, Caen, France
- Koichi Murakami
  - Center for Emergency Preparedness and Response, National Institute of Infectious Diseases, Musashi-Murayama, Japan
- Makoto Kuroda
  - Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Shinichiro Hirai
  - Division of Bacteriology, Chiba Prefectural Institute of Public Health, Chiba, Japan
- Setsuko Fukaya
  - Division of Bacteriology, Ibaraki Prefectural Institute of Public Health, Mito, Japan
10. Schroeder CM, Tomlin SA, Mejia Natividad I, Valenzuela JR, Young JM, Malik HS. An actin-related protein that is most highly expressed in Drosophila testes is critical for embryonic development. eLife 2021; 10:71279. [PMID: 34282725 PMCID: PMC8291977 DOI: 10.7554/elife.71279] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/16/2021] [Accepted: 06/20/2021] [Indexed: 12/25/2022]
Abstract
Most actin-related proteins (Arps) are highly conserved and carry out well-defined cellular functions in eukaryotes. However, many lineages like Drosophila and mammals encode divergent non-canonical Arps whose roles remain unknown. To elucidate the function of non-canonical Arps, we focus on Arp53D, which is highly expressed in testes and retained throughout Drosophila evolution. We show that Arp53D localizes to fusomes and actin cones, two germline-specific actin structures critical for sperm maturation, via a unique N-terminal tail. Surprisingly, we find that male fertility is not impaired upon Arp53D loss, yet population cage experiments reveal that Arp53D is required for optimal fitness in Drosophila melanogaster. To reconcile these findings, we focus on Arp53D function in ovaries and embryos where it is only weakly expressed. We find that under heat stress Arp53D-knockout (KO) females lay embryos with reduced nuclear integrity and lower viability; these defects are further exacerbated in Arp53D-KO embryos. Thus, despite its relatively recent evolution and primarily testis-specific expression, non-canonical Arp53D is required for optimal embryonic development in Drosophila.
Affiliation(s)
- Courtney M Schroeder
  - Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States
- Sarah A Tomlin
  - Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, United States
- Isabel Mejia Natividad
  - Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, United States
- John R Valenzuela
  - Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States
- Janet M Young
  - Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States
- Harmit S Malik
  - Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, United States
11. Herrera-Álvarez S, Karlsson E, Ryder OA, Lindblad-Toh K, Crawford AJ. How to Make a Rodent Giant: Genomic Basis and Tradeoffs of Gigantism in the Capybara, the World's Largest Rodent. Mol Biol Evol 2021; 38:1715-1730. [PMID: 33169792 PMCID: PMC8097284 DOI: 10.1093/molbev/msaa285] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/18/2022]
Abstract
Gigantism results when one lineage within a clade evolves extremely large body size relative to its small-bodied ancestors, a common phenomenon in animals. Theory predicts that the evolution of giants should be constrained by two tradeoffs. First, because body size is negatively correlated with population size, purifying selection is expected to be less efficient in species of large body size, leading to increased mutational load. Second, gigantism is achieved through generating a higher number of cells along with higher rates of cell proliferation, thus increasing the likelihood of cancer. To explore the genetic basis of gigantism in rodents and uncover genomic signatures of gigantism-related tradeoffs, we assembled a draft genome of the capybara (Hydrochoerus hydrochaeris), the world's largest living rodent. We found that the genome-wide ratio of nonsynonymous to synonymous mutations (ω) is elevated in the capybara relative to other rodents, likely caused by a generation-time effect and consistent with a nearly neutral model of molecular evolution. A genome-wide scan for adaptive protein evolution in the capybara highlighted several genes controlling postnatal bone growth regulation and musculoskeletal development, which are relevant to anatomical and developmental modifications for an increase in overall body size. Capybara-specific gene-family expansions included a putative novel anticancer adaptation that involves T-cell-mediated tumor suppression, offering a potential resolution to the increased cancer risk in this lineage. Our comparative genomic results uncovered the signature of an intragenomic conflict where the evolution of gigantism in the capybara involved selection on genes and pathways that are directly linked to cancer.
Collapse
Affiliation(s)
| | - Elinor Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA, USA
| | - Oliver A Ryder
- San Diego Zoo Institute for Conservation Research, San Diego Zoo Global, Escondido, CA, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Andrew J Crawford
- Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia
|
12
|
Garud NR, Messer PW, Petrov DA. Detection of hard and soft selective sweeps from Drosophila melanogaster population genomic data. PLoS Genet 2021; 17:e1009373. [PMID: 33635910 PMCID: PMC7946363 DOI: 10.1371/journal.pgen.1009373] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 05/01/2020] [Revised: 03/10/2021] [Accepted: 01/17/2021] [Indexed: 12/12/2022] Open
Abstract
Whether hard sweeps or soft sweeps dominate adaptation has been a matter of much debate. Recently, we developed haplotype homozygosity statistics that (i) can detect both hard and soft sweeps with similar power and (ii) can classify the detected sweeps as hard or soft. The application of our method to population genomic data from a natural population of Drosophila melanogaster (DGRP) allowed us to rediscover three known cases of adaptation at the loci Ace, Cyp6g1, and CHKov1 known to be driven by soft sweeps, and detected additional candidate loci for recent and strong sweeps. Surprisingly, all of the top 50 candidates showed patterns much more consistent with soft rather than hard sweeps. Recently, Harris et al. 2018 criticized this work, suggesting that all the candidate loci detected by our haplotype statistics, including the positive controls, are unlikely to be sweeps at all and that instead these haplotype patterns can be more easily explained by complex neutral demographic models. They also claim that these neutral non-sweeps are likely to be hard instead of soft sweeps. Here, we reanalyze the DGRP data using a range of complex admixture demographic models and reconfirm our original published results suggesting that the majority of recent and strong sweeps in D. melanogaster are first likely to be true sweeps, and second, that they do appear to be soft. Furthermore, we discuss ways to take this work forward given that most demographic models employed in such analyses are necessarily too simple to capture the full demographic complexity, while more realistic models are unlikely to be inferred correctly because they require a large number of free parameters.
Affiliation(s)
- Nandita R. Garud
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, University of California, Los Angeles, California, United States of America
| | - Philipp W. Messer
- Department of Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Dmitri A. Petrov
- Department of Biology, Stanford University, Stanford, California, United States of America
|
13
|
Mugal CF, Kutschera VE, Botero-Castro F, Wolf JBW, Kaj I. Polymorphism Data Assist Estimation of the Nonsynonymous over Synonymous Fixation Rate Ratio ω for Closely Related Species. Mol Biol Evol 2020; 37:260-279. [PMID: 31504782 PMCID: PMC6984366 DOI: 10.1093/molbev/msz203] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/24/2022] Open
Abstract
The ratio of nonsynonymous over synonymous sequence divergence, dN/dS, is a widely used estimate of the nonsynonymous over synonymous fixation rate ratio ω, which measures the extent to which natural selection modulates protein sequence evolution. Its computation is based on a phylogenetic approach and computes sequence divergence of protein-coding DNA between species, traditionally using a single representative DNA sequence per species. This approach ignores the presence of polymorphisms and relies on the indirect assumption that new mutations fix instantaneously, an assumption which is generally violated and reasonable only for distantly related species. The violation of the underlying assumption leads to a time-dependence of sequence divergence, and biased estimates of ω in particular for closely related species, where the contribution of ancestral and lineage-specific polymorphisms to sequence divergence is substantial. We here use a time-dependent Poisson random field model to derive an analytical expression of dN/dS as a function of divergence time and sample size. We then extend our framework to the estimation of the proportion of adaptive protein evolution α. This mathematical treatment enables us to show that the joint usage of polymorphism and divergence data can assist the inference of selection for closely related species. Moreover, our analytical results provide the basis for a protocol for the estimation of ω and α for closely related species. We illustrate the performance of this protocol by studying a population data set of four corvid species, which involves the estimation of ω and α at different time-scales and for several choices of sample sizes.
Affiliation(s)
- Carina F Mugal
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
| | - Verena E Kutschera
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden; Science for Life Laboratory, Stockholm University, Stockholm, Sweden; Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
| | - Fidel Botero-Castro
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
| | - Jochen B W Wolf
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden; Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
| | - Ingemar Kaj
- Department of Mathematics, Uppsala University, Uppsala, Sweden
|
14
|
Abstract
Adaptive mutations play an important role in molecular evolution. However, the frequency and nature of these mutations at the intramolecular level are poorly understood. To address this, we analyzed the impact of protein architecture on the rate of adaptive substitutions, aiming to understand how protein biophysics influences fitness and adaptation. Using Drosophila melanogaster and Arabidopsis thaliana population genomics data, we fitted models of distribution of fitness effects and estimated the rate of adaptive amino-acid substitutions both at the protein and amino-acid residue level. We performed a comprehensive analysis covering genome, gene, and protein structure, by exploring a multitude of factors with a plausible impact on the rate of adaptive evolution, such as intron number, protein length, secondary structure, relative solvent accessibility, intrinsic protein disorder, chaperone affinity, gene expression, protein function, and protein-protein interactions. We found that the relative solvent accessibility is a major determinant of adaptive evolution, with most adaptive mutations occurring at the surface of proteins. Moreover, we observe that the rate of adaptive substitutions differs between protein functional classes, with genes encoding for protein biosynthesis and degradation signaling exhibiting the fastest rates of protein adaptation. Overall, our results suggest that adaptive evolution in proteins is mainly driven by intermolecular interactions, with host-pathogen coevolution likely playing a major role.
Affiliation(s)
- Ana Filipa Moutinho
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Fernanda Fontes Trancoso
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Julien Yann Dutheil
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany; Unité Mixte de Recherche 5554 Institut des Sciences de l'Evolution, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
|
15
|
Popovic I, Riginos C. Comparative genomics reveals divergent thermal selection in warm‐ and cold‐tolerant marine mussels. Mol Ecol 2020; 29:519-535. [DOI: 10.1111/mec.15339] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/12/2018] [Revised: 12/10/2019] [Accepted: 12/13/2019] [Indexed: 12/25/2022]
Affiliation(s)
- Iva Popovic
- School of Biological Sciences, University of Queensland, St Lucia, Qld, Australia
| | - Cynthia Riginos
- School of Biological Sciences, University of Queensland, St Lucia, Qld, Australia
|
16
|
Moutinho AF, Bataillon T, Dutheil JY. Variation of the adaptive substitution rate between species and within genomes. Evol Ecol 2019. [DOI: 10.1007/s10682-019-10026-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 02/06/2023]
Abstract
The importance of adaptive mutations in molecular evolution is extensively debated. Recent developments in population genomics allow inferring rates of adaptive mutations by fitting a distribution of fitness effects to the observed patterns of polymorphism and divergence at sites under selection and sites assumed to evolve neutrally. Here, we summarize the current state-of-the-art of these methods and review the factors that affect the molecular rate of adaptation. Several studies have reported extensive cross-species variation in the proportion of adaptive amino-acid substitutions (α) and predicted that species with larger effective population sizes undergo less genetic drift and higher rates of adaptation. Disentangling the rates of positive and negative selection, however, revealed that mutations with deleterious effects are the main driver of this population size effect and that adaptive substitution rates vary comparatively little across species. Conversely, rates of adaptive substitution have been documented to vary substantially within genomes. On a genome-wide scale, gene density, recombination and mutation rate were observed to play a role in shaping molecular rates of adaptation, as predicted under models of linked selection. At the gene level, it has been reported that the gene functional category and the macromolecular structure substantially impact the rate of adaptive mutations. Here, we deliver a comprehensive review of methods used to infer the molecular adaptive rate, the potential drivers of adaptive evolution and how positive selection shapes molecular evolution within genes, across genes within species and between species.
|
17
|
Coronado-Zamora M, Salvador-Martínez I, Castellano D, Barbadilla A, Salazar-Ciudad I. Adaptation and Conservation throughout the Drosophila melanogaster Life-Cycle. Genome Biol Evol 2019; 11:1463-1482. [PMID: 31028390 PMCID: PMC6535812 DOI: 10.1093/gbe/evz086] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Accepted: 04/16/2019] [Indexed: 01/09/2023] Open
Abstract
Previous studies of the evolution of genes expressed at different life-cycle stages of Drosophila melanogaster have not been able to disentangle adaptive from nonadaptive substitutions when using nonsynonymous sites. Here, we overcome this limitation by combining whole-genome polymorphism data from D. melanogaster and divergence data between D. melanogaster and Drosophila yakuba. For the set of genes expressed at different life-cycle stages of D. melanogaster, as reported in modENCODE, we estimate the ratio of substitutions relative to polymorphism between nonsynonymous and synonymous sites (α) and then α is decomposed into the ratio of adaptive (ωa) and nonadaptive (ωna) substitutions to synonymous substitutions. We find that the genes expressed in mid- and late-embryonic development are the most conserved, whereas those expressed in early development and postembryonic stages are the least conserved. Importantly, we found that low conservation in early development is due to high rates of nonadaptive substitutions (high ωna), whereas in postembryonic stages it is due, instead, to high rates of adaptive substitutions (high ωa). By using estimates of different genomic features (codon bias, average intron length, exon number, recombination rate, among others), we also find that genes expressed in mid- and late-embryonic development show the most complex architecture: they are larger, have more exons, more transcripts, and longer introns. In addition, these genes are broadly expressed among all stages. We suggest that all these genomic features are related to the conservation of mid- and late-embryonic development. Globally, our study supports the hourglass pattern of conservation and adaptation over the life-cycle.
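As a reading aid only (this is the standard textbook relationship, not a formula quoted from the paper), the decomposition named in the abstract is usually written as below. Here ω = dN/dS is the overall nonsynonymous to synonymous substitution rate ratio and α is the proportion of nonsynonymous substitutions that are adaptive; the symbols d_N and d_S are notation assumed for this sketch.

```latex
\omega = \frac{d_N}{d_S} = \omega_a + \omega_{na}, \qquad
\omega_a = \alpha\,\omega, \qquad
\omega_{na} = (1 - \alpha)\,\omega
```

Under this reading, a high ωna with low ωa points to relaxed constraint, while a high ωa points to adaptive substitution, which is the contrast the abstract draws between early and postembryonic stages.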
Affiliation(s)
- Marta Coronado-Zamora
- Genomics, Bioinformatics and Evolution, Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
| | - Irepan Salvador-Martínez
- Evo-Devo Helsinki Community, Centre of Excellence in Experimental and Computational Developmental Biology, Institute of Biotechnology, University of Helsinki, Finland; Department of Genetics, Evolution and Environment, University College London, United Kingdom
| | | | - Antonio Barbadilla
- Genomics, Bioinformatics and Evolution, Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
| | - Isaac Salazar-Ciudad
- Genomics, Bioinformatics and Evolution, Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain; Evo-Devo Helsinki Community, Centre of Excellence in Experimental and Computational Developmental Biology, Institute of Biotechnology, University of Helsinki, Finland; Centre de Recerca Matemàtica, Cerdanyola del Vallès, Spain
|
18
|
Davydov II, Salamin N, Robinson-Rechavi M. Large-Scale Comparative Analysis of Codon Models Accounting for Protein and Nucleotide Selection. Mol Biol Evol 2019; 36:1316-1332. [PMID: 30847475 PMCID: PMC6526913 DOI: 10.1093/molbev/msz048] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/17/2022] Open
Abstract
There are numerous sources of variation in the rate of synonymous substitutions inside genes, such as direct selection on the nucleotide sequence, or mutation rate variation. Yet scans for positive selection rely on codon models which incorporate an assumption of effectively neutral synonymous substitution rate, constant between sites of each gene. Here we perform a large-scale comparison of approaches which incorporate codon substitution rate variation and propose our own simple yet effective modification of existing models. We find strong effects of substitution rate variation on positive selection inference. More than 70% of the genes detected by the classical branch-site model are presumably false positives caused by the incorrect assumption of uniform synonymous substitution rate. We propose a new model which is strongly favored by the data while remaining computationally tractable. With the new model we can capture signatures of nucleotide level selection acting on translation initiation and on splicing sites within the coding region. Finally, we show that rate variation is highest in the highly recombining regions, and we propose that recombination and mutation rate variation, such as high CpG mutation rate, are the two main sources of nucleotide rate variation. Although we detect fewer genes under positive selection in Drosophila than without rate variation, the genes which we detect contain a stronger signal of adaptation of dynein, which could be associated with Wolbachia infection. We provide software to perform positive selection analysis using the new model.
Affiliation(s)
- Iakov I Davydov
- Department of Computational Biology, Biophore, University of Lausanne, Lausanne, Switzerland; Department of Ecology and Evolution, Biophore, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Nicolas Salamin
- Department of Computational Biology, Biophore, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Marc Robinson-Rechavi
- Department of Ecology and Evolution, Biophore, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
19
|
Haploid selection drives new gene male germline expression. Genome Res 2019; 29:1115-1122. [PMID: 31221725 PMCID: PMC6633266 DOI: 10.1101/gr.238824.118] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/08/2018] [Accepted: 05/31/2019] [Indexed: 11/25/2022]
Abstract
New genes are a major source of novelties, and a disproportionate amount of them are known to show testis expression in later phases of male gametogenesis in different groups such as mammals and plants. Here, we propose that this enhanced expression is a consequence of haploid selection during the latter stages of male gametogenesis. Because emerging adaptive mutations will be fixed faster if their phenotypes are expressed by haploid rather than diploid genotypes, new genes with advantageous functions arising during this unique stage of development have a better chance to become fixed. To test this hypothesis, expression levels of genes of differing evolutionary age were examined at various stages of Drosophila spermatogenesis. We found, consistent with a model based on haploid selection, that new Drosophila genes are both expressed in later haploid phases of spermatogenesis and harbor a significant enrichment of adaptive mutations. Additionally, the observed overexpression of new genes in the latter phases of spermatogenesis was limited to the autosomes. Because all male cells exhibit hemizygous expression for X-linked genes (and therefore effectively haploid), there is no expectation that selection acting on late spermatogenesis will have a different effect on X-linked genes in comparison to initial diploid phases. Together, our proposed hypothesis and the analyzed data suggest that natural selection in haploid cells elucidates several aspects of the origin of new genes by explaining the general prevalence of their testis expression, and a parsimonious solution for new alleles to avoid being lost by genetic drift or pseudogenization.
|
20
|
Abstract
In this perspective, we evaluate the explanatory power of the neutral theory of molecular evolution, 50 years after its introduction by Kimura. We argue that the neutral theory was supported by unreliable theoretical and empirical evidence from the beginning, and that in light of modern, genome-scale data, we can firmly reject its universality. The ubiquity of adaptive variation both within and between species means that a more comprehensive theory of molecular evolution must be sought.
Affiliation(s)
- Andrew D Kern
- Department of Genetics, Rutgers University, Piscataway, NJ
| | - Matthew W Hahn
- Department of Biology and Department of Computer Science, Indiana University Bloomington, IN
|
21
|
Fraïsse C, Puixeu Sala G, Vicoso B. Pleiotropy Modulates the Efficacy of Selection in Drosophila melanogaster. Mol Biol Evol 2019; 36:500-515. [PMID: 30590559 PMCID: PMC6389323 DOI: 10.1093/molbev/msy246] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022] Open
Abstract
Pleiotropy is the well-established idea that a single mutation affects multiple phenotypes. If a mutation has opposite effects on fitness when expressed in different contexts, then genetic conflict arises. Pleiotropic conflict is expected to reduce the efficacy of selection by limiting the fixation of beneficial mutations through adaptation, and the removal of deleterious mutations through purifying selection. Although this has been widely discussed, in particular in the context of a putative "gender load," it has yet to be systematically quantified. In this work, we empirically estimate to which extent different pleiotropic regimes impede the efficacy of selection in Drosophila melanogaster. We use whole-genome polymorphism data from a single African population and divergence data from D. simulans to estimate the fraction of adaptive fixations (α), the rate of adaptation (ωA), and the direction of selection (DoS). After controlling for confounding covariates, we find that the different pleiotropic regimes have a relatively small, but significant, effect on selection efficacy. Specifically, our results suggest that pleiotropic sexual antagonism may restrict the efficacy of selection, but that this conflict can be resolved by limiting the expression of genes to the sex where they are beneficial. Intermediate levels of pleiotropy across tissues and life stages can also lead to maladaptation in D. melanogaster, due to inefficient purifying selection combined with low frequency of mutations that confer a selective advantage. Thus, our study highlights the need to consider the efficacy of selection in the context of antagonistic pleiotropy, and of genetic conflict in general.
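For readers unfamiliar with the three statistics named in this abstract, the sketch below shows the simple count-based forms from a McDonald-Kreitman table. It is an illustration under stated assumptions, not the authors' pipeline: the paper may well use model-based estimators that correct for slightly deleterious polymorphism, and the function names and the example counts here are hypothetical.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Count-based fraction of adaptive fixations: 1 - (Ds * Pn) / (Dn * Ps).

    dn, ds: nonsynonymous / synonymous fixed differences (divergence).
    pn, ps: nonsynonymous / synonymous polymorphisms.
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


def direction_of_selection(dn: int, ds: int, pn: int, ps: int) -> float:
    """DoS = Dn / (Dn + Ds) - Pn / (Pn + Ps); positive values suggest adaptation."""
    if dn + ds == 0 or pn + ps == 0:
        raise ValueError("DoS needs at least one substitution and one polymorphism")
    return dn / (dn + ds) - pn / (pn + ps)


def omega_a(alpha: float, dn_ds: float) -> float:
    """Rate of adaptive substitution relative to the synonymous rate: alpha * dN/dS."""
    return alpha * dn_ds


if __name__ == "__main__":
    # Hypothetical counts for a single gene class, chosen only for illustration.
    dn, ds, pn, ps = 10, 20, 5, 20
    alpha = mk_alpha(dn, ds, pn, ps)
    print(f"alpha   = {alpha:.3f}")                                      # 0.500
    print(f"DoS     = {direction_of_selection(dn, ds, pn, ps):.3f}")     # 0.133
    print(f"omega_a = {omega_a(alpha, dn_ds=0.2):.3f}")                  # 0.100
```

The count-based α is known to be biased downward when slightly deleterious variants segregate at low frequency, which is one reason DoS and model-based estimates are often reported alongside it.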
Affiliation(s)
- Christelle Fraïsse
- Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg 3400, Austria
| | - Gemma Puixeu Sala
- Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg 3400, Austria
| | - Beatriz Vicoso
- Institute of Science and Technology Austria, Am Campus 1, Klosterneuburg 3400, Austria
|
22
|
Amei A, Zhou S. Inferring the distribution of selective effects from a time inhomogeneous model. PLoS One 2019; 14:e0194709. [PMID: 30657757 PMCID: PMC6338356 DOI: 10.1371/journal.pone.0194709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2017] [Accepted: 03/08/2018] [Indexed: 11/18/2022] Open
Abstract
We have developed a Poisson random field model for estimating the distribution of selective effects of newly arisen nonsynonymous mutations that could be observed as polymorphism or divergence in samples of two related species under the assumption that the two species populations are not at mutation-selection-drift equilibrium. The model is applied to 91 Drosophila genes by comparing levels of polymorphism in an African population of D. melanogaster with divergence to a reference strain of D. simulans. Based on the difference of gene expression level between testes and ovaries, the 91 genes were classified as 33 male-biased, 28 female-biased, and 30 sex-unbiased genes. Under a Bayesian framework, Markov chain Monte Carlo simulations are implemented to the model in which the distribution of selective effects is assumed to be Gaussian with a mean that may differ from one gene to the other to sample key parameters. Based on our estimates, the majority of newly-arisen nonsynonymous mutations that could contribute to polymorphism or divergence in Drosophila species are mildly deleterious with a mean scaled selection coefficient of -2.81, while almost 86% of the fixed differences between species are driven by positive selection. There are only 16.6% of the nonsynonymous mutations observed in sex-unbiased genes that are under positive selection in comparison to 30% of male-biased and 46% of female-biased genes that are beneficial. We also estimated that D. melanogaster and D. simulans may have diverged 1.72 million years ago.
Affiliation(s)
- Amei Amei
- Department of Mathematical Sciences, University of Nevada, Las Vegas, Nevada, United States of America
| | - Shilei Zhou
- 54 Crescent Ave, Apt G, Dorchester, Massachusetts, United States of America
|
23
|
Signor SA, New FN, Nuzhdin S. A Large Panel of Drosophila simulans Reveals an Abundance of Common Variants. Genome Biol Evol 2018; 10:189-206. [PMID: 29228179 PMCID: PMC5767965 DOI: 10.1093/gbe/evx262] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Accepted: 12/07/2017] [Indexed: 01/03/2023] Open
Abstract
The rapidly expanding availability of large NGS data sets provides an opportunity to investigate population genetics at an unprecedented scale. Drosophila simulans is the sister species of the model organism Drosophila melanogaster, and is often presumed to share similar demographic history. However, previous population genetic and ecological work suggests very different signatures of selection and demography. Here, we sequence a new panel of 170 inbred genotypes of a North American population of D. simulans, a valuable complement to the DGRP and other D. melanogaster panels. We find some unexpected signatures of demography, in the form of excess intermediate frequency polymorphisms. Simulations suggest that this is possibly due to a recent population contraction and selection. We examine the outliers in the D. simulans genome determined by a haplotype test to attempt to parse the contribution of demography and selection to the patterns observed in this population. Untangling the relative contribution of demography and selection to genomic patterns of variation is challenging, however, it is clear that although D. melanogaster was thought to share demographic history with D. simulans different forces are at work in shaping genomic variation in this population of D. simulans.
Affiliation(s)
- Sarah A Signor
- Department of Molecular and Computational Biology, University of Southern California
| | - Felicia N New
- Department of Molecular Genetics and Microbiology, University of Florida College of Medicine
| | - Sergey Nuzhdin
- Department of Molecular and Computational Biology, University of Southern California
|
24
|
Llopart A. Faster‐X evolution of gene expression is driven by recessive adaptive cis‐regulatory variation in Drosophila. Mol Ecol 2018; 27:3811-3821. [DOI: 10.1111/mec.14708] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 12/10/2017] [Revised: 03/28/2018] [Accepted: 04/05/2018] [Indexed: 12/30/2022]
Affiliation(s)
- Ana Llopart
- Department of Biology, The University of Iowa, Iowa City, Iowa
- Interdisciplinary Graduate Program in Genetics, The University of Iowa, Iowa City, Iowa
|
25
|
Schirrmann MK, Zoller S, Croll D, Stukenbrock EH, Leuchtmann A, Fior S. Genomewide signatures of selection in Epichloë reveal candidate genes for host specialization. Mol Ecol 2018; 27:3070-3086. [PMID: 29633410 DOI: 10.1111/mec.14585] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 06/19/2017] [Revised: 02/21/2018] [Accepted: 02/23/2018] [Indexed: 12/31/2022]
Abstract
Host specialization is a key process in ecological divergence and speciation of plant-associated fungi. The underlying determinants of host specialization are generally poorly understood, especially in endophytes, which constitute one of the most abundant components of the plant microbiome. We addressed the genetic basis of host specialization in two sympatric subspecies of grass-endophytic fungi from the Epichloë typhina complex: subsp. typhina and clarkii. The life cycle of these fungi entails unrestricted dispersal of gametes and sexual reproduction before infection of a new host, implying that the host imposes a selective barrier on viability of the progeny. We aimed to detect genes under divergent selection between subspecies, experiencing restricted gene flow due to adaptation to different hosts. Using pooled whole-genome sequencing data, we combined FST and DXY population statistics in genome scans and detected 57 outlier genes showing strong differentiation between the two subspecies. Genomewide analyses of nucleotide diversity (π), Tajima's D and dN/dS ratios indicated that these genes have evolved under positive selection. Genes encoding secreted proteins were enriched among the genes showing evidence of positive selection, suggesting that molecular plant-fungus interactions are strong drivers of endophyte divergence. We focused on five genes encoding secreted proteins, which were further sequenced in 28 additional isolates collected across Europe to assess genetic variation in a larger sample size. Signature of positive selection in these isolates and putative identification of pathogenic function supports our findings that these genes represent strong candidates for host specialization determinants in Epichloë endophytes. Our results highlight the role of secreted proteins as key determinants of host specialization.
Affiliation(s)
- Melanie K Schirrmann
- Institute of Integrative Biology (IBZ), ETH Zürich, Zürich, Switzerland; Research Group Molecular Diagnostics, Genomics and Bioinformatics, Agroscope, Wädenswil, Switzerland
| | - Stefan Zoller
- Genetic Diversity Centre (GDC), ETH Zürich, Zürich, Switzerland
| | - Daniel Croll
- Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
| | - Eva H Stukenbrock
- Environmental Genomics, Christian-Albrechts University of Kiel, Kiel, Germany; Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Adrian Leuchtmann
- Institute of Integrative Biology (IBZ), ETH Zürich, Zürich, Switzerland
| | - Simone Fior
- Institute of Integrative Biology (IBZ), ETH Zürich, Zürich, Switzerland
|
26
|
RNA-Interference Pathways Display High Rates of Adaptive Protein Evolution in Multiple Invertebrates. Genetics 2018; 208:1585-1599. [PMID: 29437826 PMCID: PMC5887150 DOI: 10.1534/genetics.117.300567] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 11/28/2017] [Accepted: 01/31/2018] [Indexed: 12/30/2022] Open
Abstract
Conflict between organisms can lead to a reciprocal adaptation that manifests as an increased evolutionary rate in genes mediating the conflict. This adaptive signature has been observed in RNA-interference (RNAi) pathway genes involved in the suppression of viruses and transposable elements in Drosophila melanogaster, suggesting that a subset of Drosophila RNAi genes may be locked in an arms race with these parasites. However, it is not known whether rapid evolution of RNAi genes is a general phenomenon across invertebrates, or which RNAi genes generally evolve adaptively. Here we use population genomic data from eight invertebrate species to infer rates of adaptive sequence evolution, and to test for past and ongoing selective sweeps in RNAi genes. We assess rates of adaptive protein evolution across species using a formal meta-analytic framework to combine data across species and by implementing a multispecies generalized linear mixed model of mutation counts. Across species, we find that RNAi genes display a greater rate of adaptive protein substitution than other genes, and that this is primarily mediated by positive selection acting on the genes most likely to defend against viruses and transposable elements. In contrast, evidence for recent selective sweeps is broadly spread across functional classes of RNAi genes and differs substantially among species. Finally, we identify genes that exhibit elevated adaptive evolution across the analyzed insect species, perhaps due to concurrent parasite-mediated arms races.
27
Grivet D, Avia K, Vaattovaara A, Eckert AJ, Neale DB, Savolainen O, González-Martínez SC. High rate of adaptive evolution in two widespread European pines. Mol Ecol 2017; 26:6857-6870. [PMID: 29110402 DOI: 10.1111/mec.14402] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 04/28/2015] [Revised: 09/14/2017] [Accepted: 09/25/2017] [Indexed: 12/18/2022]
Abstract
Comparing related organisms with differing ecological requirements and evolutionary histories can shed light on the mechanisms and drivers underlying genetic adaptation. Here, by examining a common set of hundreds of loci, we compare patterns of nucleotide diversity and molecular adaptation of two European conifers (Scots pine and maritime pine) living in contrasted environments and characterized by distinct population genetic structure (low and clinal in Scots pine, high and ecotypic in maritime pine) and demographic histories. We found higher nucleotide diversity in Scots pine than in maritime pine, whereas rates of new adaptive substitutions (ωa), as estimated from the distribution of fitness effects, were similar across species and among the highest found in plants. Sample size and population genetic structure did not appear to have resulted in significant bias in estimates of ωa. Moreover, population contraction-expansion dynamics for each species did not affect differentially the rate of adaptive substitution in these two pines. Several methodological and biological factors may underlie the unusually high rate of adaptive evolution of Scots pine and maritime pine. By providing two new case studies with contrasting evolutionary histories, we contribute to disentangling the multiple factors potentially affecting adaptive evolution in natural plant populations.
Affiliation(s)
- Delphine Grivet
- Department of Forest Ecology and Genetics, Forest Research Centre, INIA-CIFOR, Madrid, Spain; Sustainable Forest Management Research Institute, INIA - University of Valladolid, Palencia, Spain
- Komlan Avia
- Department of Ecology and Genetics and Biocenter Oulu, University of Oulu, Oulu, Finland; Algal Genetics Group, UMR 8227, CNRS, Sorbonne Universités, UPMC, Station Biologique Roscoff, Roscoff, France; UMI 3614 Evolutionary Biology and Ecology of Algae, CNRS, Sorbonne Universités, UPMC, Pontificia Universidad Católica de Chile, Universidad Austral de Chile, Station Biologique Roscoff, Roscoff, France
- Aleksia Vaattovaara
- Department of Ecology and Genetics and Biocenter Oulu, University of Oulu, Oulu, Finland; Division of Plant Biology, Department of Biosciences, Viikki Plant Science Centre (ViPS), University of Helsinki, Helsinki, Finland
- Andrew J Eckert
- Department of Biology, Virginia Commonwealth University, Richmond, VA, USA
- David B Neale
- Department of Plant Sciences, University of California at Davis, Davis, CA, USA
- Outi Savolainen
- Department of Ecology and Genetics and Biocenter Oulu, University of Oulu, Oulu, Finland
- Santiago C González-Martínez
- Department of Forest Ecology and Genetics, Forest Research Centre, INIA-CIFOR, Madrid, Spain; Sustainable Forest Management Research Institute, INIA - University of Valladolid, Palencia, Spain; BIOGECO, INRA, Univ. Bordeaux, Cestas, France
28
Warner MR, Mikheyev AS, Linksvayer TA. Genomic Signature of Kin Selection in an Ant with Obligately Sterile Workers. Mol Biol Evol 2017; 34:1780-1787. [PMID: 28419349 PMCID: PMC5455959 DOI: 10.1093/molbev/msx123] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Indexed: 01/17/2023]
Abstract
Kin selection is thought to drive the evolution of cooperation and conflict, but the specific genes and genome-wide patterns shaped by kin selection are unknown. We identified thousands of genes associated with the sterile ant worker caste, the archetype of an altruistic phenotype shaped by kin selection, and then used population and comparative genomic approaches to study patterns of molecular evolution at these genes. Consistent with population genetic theoretical predictions, worker-upregulated genes experienced reduced selection compared with genes upregulated in reproductive castes. Worker-upregulated genes included more taxonomically restricted genes, indicating that the worker caste has recruited more novel genes, yet these genes also experienced reduced selection. Our study identifies a putative genomic signature of kin selection and helps to integrate emerging sociogenomic data with longstanding social evolution theory.
Affiliation(s)
- Michael R Warner
- Department of Biology, University of Pennsylvania, Philadelphia, PA
- Alexander S Mikheyev
- Ecology and Evolution Unit, Okinawa Institute of Science and Technology, Onna-son, Okinawa, Japan
29
De La Torre AR, Li Z, Van de Peer Y, Ingvarsson PK. Contrasting Rates of Molecular Evolution and Patterns of Selection among Gymnosperms and Flowering Plants. Mol Biol Evol 2017; 34:1363-1377. [PMID: 28333233 PMCID: PMC5435085 DOI: 10.1093/molbev/msx069] [Citation(s) in RCA: 126] [Impact Index Per Article: 15.8] [Indexed: 12/20/2022]
Abstract
The majority of variation in rates of molecular evolution among seed plants remains both unexplored and unexplained. Although some attention has been given to flowering plants, reports of molecular evolutionary rates for their sister plant clade (gymnosperms) are scarce, and to our knowledge differences in molecular evolution among seed plant clades have never been tested in a phylogenetic framework. Angiosperms and gymnosperms differ in a number of features, of which contrasting reproductive biology, life spans, and population sizes are the most prominent. The highly conserved morphology of gymnosperms evidenced by similarity of extant species to fossil records and the high levels of macrosynteny at the genomic level have led scientists to believe that gymnosperms are slow-evolving plants, although some studies have offered contradictory results. Here, we used 31,968 nucleotide sites obtained from orthologous genes across a wide taxonomic sampling that includes representatives of most conifers, cycads, ginkgo, and many angiosperms with a sequenced genome. Our results suggest that angiosperms and gymnosperms differ considerably in their rates of molecular evolution per unit time, with gymnosperm rates being, on average, seven times lower than angiosperm species. Longer generation times and larger genome sizes are some of the factors explaining the slow rates of molecular evolution found in gymnosperms. In contrast to their slow rates of molecular evolution, gymnosperms possess higher substitution rate ratios than angiosperm taxa. Finally, our study suggests stronger and more efficient purifying and diversifying selection in gymnosperm than in angiosperm species, probably in relation to larger effective population sizes.
Affiliation(s)
- Amanda R De La Torre
- Department of Plant Sciences, University of California-Davis, Davis, CA; Department of Ecology and Environmental Science, Umeå University, Umeå, Sweden
- Zhen Li
- Department of Plant Systems Biology, VIB, Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Yves Van de Peer
- Department of Plant Systems Biology, VIB, Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; Genomics Research Institute, University of Pretoria, Hatfield Campus, Pretoria, South Africa
- Pär K Ingvarsson
- Department of Ecology and Environmental Science, Umeå University, Umeå, Sweden; Department of Plant Biology, Uppsala Biocenter, Swedish University of Agricultural Sciences, Uppsala, Sweden
30
Schrider DR, Shanku AG, Kern AD. Effects of Linked Selective Sweeps on Demographic Inference and Model Selection. Genetics 2016; 204:1207-1223. [PMID: 27605051 PMCID: PMC5105852 DOI: 10.1534/genetics.116.190223] [Citation(s) in RCA: 103] [Impact Index Per Article: 11.4] [Received: 04/07/2016] [Accepted: 09/02/2016] [Indexed: 01/06/2023]
Abstract
The availability of large-scale population genomic sequence data has resulted in an explosion in efforts to infer the demographic histories of natural populations across a broad range of organisms. As demographic events alter coalescent genealogies, they leave detectable signatures in patterns of genetic variation within and between populations. Accordingly, a variety of approaches have been designed to leverage population genetic data to uncover the footprints of demographic change in the genome. The vast majority of these methods make the simplifying assumption that the measures of genetic variation used as their input are unaffected by natural selection. However, natural selection can dramatically skew patterns of variation not only at selected sites, but at linked, neutral loci as well. Here we assess the impact of recent positive selection on demographic inference by characterizing the performance of three popular methods through extensive simulation of data sets with varying numbers of linked selective sweeps. In particular, we examined three different demographic models relevant to a number of species, finding that positive selection can bias parameter estimates of each of these models, often severely. We find that selection can lead to incorrect inferences of population size changes when none have occurred. Moreover, we show that linked selection can lead to incorrect demographic model selection, when multiple demographic scenarios are compared. We argue that natural populations may experience the amount of recent positive selection required to skew inferences. These results suggest that demographic studies conducted in many species to date may have exaggerated the extent and frequency of population size changes.
Affiliation(s)
- Daniel R Schrider
- Department of Genetics, Rutgers University, Piscataway, New Jersey 08854
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey 08554
- Alexander G Shanku
- Department of Genetics, Rutgers University, Piscataway, New Jersey 08854
- Institute for Quantitative Biomedicine, Rutgers University, Piscataway, New Jersey 08554
- Andrew D Kern
- Department of Genetics, Rutgers University, Piscataway, New Jersey 08854
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey 08554
31
Arguello JR, Cardoso-Moreira M, Grenier JK, Gottipati S, Clark AG, Benton R. Extensive local adaptation within the chemosensory system following Drosophila melanogaster's global expansion. Nat Commun 2016; 7:ncomms11855. [PMID: 27292132 PMCID: PMC4910016 DOI: 10.1038/ncomms11855] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 11/24/2015] [Accepted: 05/06/2016] [Indexed: 01/05/2023]
Abstract
How organisms adapt to new environments is of fundamental biological interest, but poorly understood at the genetic level. Chemosensory systems provide attractive models to address this problem, because they lie between external environmental signals and internal physiological responses. To investigate how selection has shaped the well-characterized chemosensory system of Drosophila melanogaster, we have analysed genome-wide data from five diverse populations. By couching population genomic analyses of chemosensory protein families within parallel analyses of other large families, we demonstrate that chemosensory proteins are not outliers for adaptive divergence between species. However, chemosensory families often display the strongest genome-wide signals of recent selection within D. melanogaster. We show that recent adaptation has operated almost exclusively on standing variation, and that patterns of adaptive mutations predict diverse effects on protein function. Finally, we provide evidence that chemosensory proteins have experienced relaxed constraint, and argue that this has been important for their rapid adaptation over short timescales. Fruit flies gain valuable information about their environment by sensing chemicals. Here, Arguello et al. show strong signals of recent selection on the chemosensory system of the fruit fly Drosophila melanogaster, consistent with the adaptation of populations to their local chemical environment.
Affiliation(s)
- J Roman Arguello
- Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, CH-1015 Lausanne, Switzerland; Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Margarida Cardoso-Moreira
- Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, CH-1015 Lausanne, Switzerland; Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Jennifer K Grenier
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Srikanth Gottipati
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA; Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Richard Benton
- Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, CH-1015 Lausanne, Switzerland
32
Enard D, Cai L, Gwennap C, Petrov DA. Viruses are a dominant driver of protein adaptation in mammals. eLife 2016; 5. [PMID: 27187613 PMCID: PMC4869911 DOI: 10.7554/elife.12469] [Citation(s) in RCA: 192] [Impact Index Per Article: 21.3] [Received: 11/09/2015] [Accepted: 04/04/2016] [Indexed: 12/12/2022]
Abstract
Viruses interact with hundreds to thousands of proteins in mammals, yet adaptation against viruses has only been studied in a few proteins specialized in antiviral defense. Whether adaptation to viruses typically involves only specialized antiviral proteins or affects a broad array of virus-interacting proteins is unknown. Here, we analyze adaptation in ~1300 virus-interacting proteins manually curated from a set of 9900 proteins conserved in all sequenced mammalian genomes. We show that viruses (i) use the more evolutionarily constrained proteins within the cellular functions they interact with and that (ii) despite this high constraint, virus-interacting proteins account for a high proportion of all protein adaptation in humans and other mammals. Adaptation is elevated in virus-interacting proteins across all functional categories, including both immune and non-immune functions. We conservatively estimate that viruses have driven close to 30% of all adaptive amino acid changes in the part of the human proteome conserved within mammals. Our results suggest that viruses are one of the most dominant drivers of evolutionary change across mammalian and human proteomes.
Affiliation(s)
- David Enard
- Department of Biology, Stanford University, Stanford, United States
- Le Cai
- Department of Biology, Stanford University, Stanford, United States
- Carina Gwennap
- Department of Biology, Stanford University, Stanford, United States
- Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, United States
33
Elevated Linkage Disequilibrium and Signatures of Soft Sweeps Are Common in Drosophila melanogaster. Genetics 2016; 203:863-80. [PMID: 27098909 DOI: 10.1534/genetics.115.184002] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 10/25/2015] [Accepted: 03/25/2016] [Indexed: 12/20/2022]
Abstract
The extent to which selection and demography impact patterns of genetic diversity in natural populations of Drosophila melanogaster is yet to be fully understood. We previously observed that linkage disequilibrium (LD) at scales of ∼10 kb in the Drosophila Genetic Reference Panel (DGRP), consisting of 145 inbred strains from Raleigh, North Carolina, measured both between pairs of sites and as haplotype homozygosity, is elevated above neutral demographic expectations. We also demonstrated that signatures of strong and recent soft sweeps are abundant. However, the extent to which these patterns are specific to this derived and admixed population is unknown. It is also unclear whether these patterns are a consequence of the extensive inbreeding performed to generate the DGRP data. Here we analyze LD statistics in a sample of >100 fully-sequenced strains from Zambia; an ancestral population to the Raleigh population that has experienced little to no admixture and was generated by sequencing haploid embryos rather than inbred strains. We find an elevation in long-range LD and haplotype homozygosity compared to neutral expectations in the Zambian sample, thus showing the elevation in LD is not specific to the DGRP data set. This elevation in LD and haplotype structure remains even after controlling for possible confounders including genomic inversions, admixture, population substructure, close relatedness of individual strains, and recombination rate variation. Furthermore, signatures of partial soft sweeps similar to those found in the DGRP as well as partial hard sweeps are common in Zambia. These results suggest that while the selective forces and sources of adaptive mutations may differ in Zambia and Raleigh, elevated long-range LD and signatures of soft sweeps are generic in D. melanogaster.
34
Matsumoto T, John A, Baeza-Centurion P, Li B, Akashi H. Codon Usage Selection Can Bias Estimation of the Fraction of Adaptive Amino Acid Fixations. Mol Biol Evol 2016; 33:1580-9. [PMID: 26873577 DOI: 10.1093/molbev/msw027] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
A growing number of molecular evolutionary studies are estimating the proportion of adaptive amino acid substitutions (α) from comparisons of ratios of polymorphic and fixed DNA mutations. Here, we examine how violations of two of the model assumptions, neutral evolution of synonymous mutations and stationary base composition, affect α estimation. We simulated the evolution of coding sequences assuming weak selection on synonymous codon usage bias and neutral protein evolution, α = 0. We show that weak selection on synonymous mutations can give polymorphism/divergence ratios that yield α-hat (estimated α) considerably larger than its true value. Nonstationary evolution (changes in population size, selection, or mutation) can exacerbate such biases or, in some scenarios, give biases in the opposite direction, α-hat < α. These results demonstrate that two factors that appear to be prevalent among taxa, weak selection on synonymous mutations and non-steady-state nucleotide composition, should be considered when estimating α. Estimates of the proportion of adaptive amino acid fixations from large-scale analyses of Drosophila melanogaster polymorphism and divergence data are positively correlated with codon usage bias. Such patterns are consistent with α-hat inflation from weak selection on synonymous mutations and/or mutational changes within the examined gene trees.
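The abstract above refers to estimating α from ratios of polymorphic and fixed mutations but does not spell out an estimator. As a rough illustration only, the sketch below uses the classic McDonald-Kreitman-style point estimate, α = 1 − (Ds·Pn)/(Dn·Ps); this is an assumption about which estimator is meant, not the simulation procedure used in the paper. The function name `mk_alpha` and all counts are hypothetical.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Point estimate of the proportion of adaptive substitutions.

    dn, ds: nonsynonymous / synonymous fixed differences (divergence)
    pn, ps: nonsynonymous / synonymous polymorphisms
    Assumes synonymous sites are neutral, which is exactly the
    assumption the abstract reports can be violated.
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for the ratio to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, for illustration only.
neutral_like = mk_alpha(dn=20, ds=100, pn=10, ps=50)   # Dn/Ds equals Pn/Ps
excess_fixed = mk_alpha(dn=40, ds=100, pn=10, ps=50)   # excess nonsynonymous divergence
print(neutral_like, excess_fixed)
```

With these made-up counts, the first call has equal divergence and polymorphism ratios and so gives α = 0, while doubling the nonsynonymous divergence gives α = 0.5. Any process that changes Ps or Ds for reasons other than neutrality shifts the estimate in the same way, which is the kind of bias the abstract describes.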
Affiliation(s)
- Tomotaka Matsumoto
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Anoop John
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Pablo Baeza-Centurion
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Boyang Li
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Hiroshi Akashi
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan; Department of Genetics, The Graduate University for Advanced Studies (SOKENDAI), Yata, Mishima, Shizuoka, Japan
35
Abstract
Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.
Affiliation(s)
- Jessica Hedge
- Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
- Daniel J. Wilson
- Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
- Wellcome Trust Centre for Human Genetics, Oxford, United Kingdom
36
Galtier N. Adaptive Protein Evolution in Animals and the Effective Population Size Hypothesis. PLoS Genet 2016; 12:e1005774. [PMID: 26752180 PMCID: PMC4709115 DOI: 10.1371/journal.pgen.1005774] [Citation(s) in RCA: 122] [Impact Index Per Article: 13.6] [Received: 07/06/2015] [Accepted: 12/05/2015] [Indexed: 01/09/2023]
Abstract
The rate at which genomes adapt to environmental changes and the prevalence of adaptive processes in molecular evolution are two controversial issues in current evolutionary genetics. Previous attempts to quantify the genome-wide rate of adaptation through amino-acid substitution have revealed a surprising diversity of patterns, with some species (e.g. Drosophila) experiencing a very high adaptive rate, while other (e.g. humans) are dominated by nearly-neutral processes. It has been suggested that this discrepancy reflects between-species differences in effective population size. Published studies, however, were mainly focused on model organisms, and relied on disparate data sets and methodologies, so that an overview of the prevalence of adaptive protein evolution in nature is currently lacking. Here we extend existing estimators of the amino-acid adaptive rate by explicitly modelling the effect of favourable mutations on non-synonymous polymorphism patterns, and we apply these methods to a newly-built, homogeneous data set of 44 non-model animal species pairs. Data analysis uncovers a major contribution of adaptive evolution to the amino-acid substitution process across all major metazoan phyla—with the notable exception of humans and primates. The proportion of adaptive amino-acid substitution is found to be positively correlated to species effective population size. This relationship, however, appears to be primarily driven by a decreased rate of nearly-neutral amino-acid substitution because of more efficient purifying selection in large populations. Our results reveal that adaptive processes dominate the evolution of proteins in most animal species, but do not corroborate the hypothesis that adaptive substitutions accumulate at a faster rate in large populations. Implications regarding the factors influencing the rate of adaptive evolution and positive selection detection in humans vs. other organisms are discussed. 
The rate at which species adapt to environmental changes is a controversial topic. The theory predicts that adaptation is easier in large than in small populations, and the genomic studies of model organisms have revealed a much higher adaptive rate in large population-sized flies than in small population-sized humans and apes. Here we build and analyse a large data set of protein-coding sequences made of thousands of genes in 44 pairs of species from various groups of animals including insects, molluscs, annelids, echinoderms, reptiles, birds, and mammals. Extending and improving existing data analysis methods, we show that adaptation is a major process in protein evolution across all phyla of animals: the proportion of amino-acid substitutions that occurred adaptively is above 50% in a majority of species, and reaches up to 90%. Our analysis does not confirm that population size, here approached through species genetic diversity and ecological traits, does influence the rate of adaptive molecular evolution, but points to human and apes as a special case, compared to other animals, in terms of adaptive genomic processes.
Affiliation(s)
- Nicolas Galtier
- Institut des Sciences de l'Evolution UMR5554, Université Montpellier–CNRS–IRD–EPHE, Montpellier, France
37
Böndel KB, Lainer H, Nosenko T, Mboup M, Tellier A, Stephan W. North–South Colonization Associated with Local Adaptation of the Wild Tomato Species Solanum chilense. Mol Biol Evol 2015; 32:2932-43. [DOI: 10.1093/molbev/msv166] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 12/21/2022]
38
Santpere G, Carnero-Montoro E, Petit N, Serra F, Hvilsom C, Rambla J, Heredia-Genestar JM, Halligan DL, Dopazo H, Navarro A, Bosch E. Analysis of Five Gene Sets in Chimpanzees Suggests Decoupling between the Action of Selection on Protein-Coding and on Noncoding Elements. Genome Biol Evol 2015; 7:1490-505. [PMID: 25977458 PMCID: PMC4494068 DOI: 10.1093/gbe/evv082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/24/2022]
Abstract
We set out to investigate potential differences and similarities between the selective forces acting upon the coding and noncoding regions of five different sets of genes defined according to functional and evolutionary criteria: 1) two reference gene sets presenting accelerated and slow rates of protein evolution (the Complement and Actin pathways); 2) a set of genes with evidence of accelerated evolution in at least one of their introns; and 3) two gene sets related to neurological function (Parkinson’s and Alzheimer’s diseases). To that effect, we combine human–chimpanzee divergence patterns with polymorphism data obtained from target resequencing 20 central chimpanzees, our closest relatives with largest long-term effective population size. By using the distribution of fitness effect-alpha extension of the McDonald–Kreitman test, we reproduce inferences of rates of evolution previously based only on divergence data on both coding and intronic sequences and also obtain inferences for other classes of genomic elements (untranslated regions, promoters, and conserved noncoding sequences). Our results suggest that 1) the distribution of fitness effect-alpha method successfully helps distinguishing different scenarios of accelerated divergence (adaptation or relaxed selective constraints) and 2) the adaptive history of coding and noncoding sequences within the gene sets analyzed is decoupled.
Affiliation(s)
- Gabriel Santpere
- Departament de Ciències Experimentals i la Salut, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Barcelona, Spain
- Elena Carnero-Montoro
- Departament de Ciències Experimentals i la Salut, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Barcelona, Spain
- Natalia Petit
- Departament de Ciències Experimentals i la Salut, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Barcelona, Spain
- François Serra
- Structural Genomics Team, Genome Biology Group, Centre Nacional d'Anàlisi Genòmica (CNAG), Barcelona, Spain
- Jordi Rambla
- Departament de Ciències Experimentals i la Salut, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Barcelona, Spain
- Jose Maria Heredia-Genestar
- Departament de Ciències Experimentals i la Salut, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Barcelona, Spain
- Daniel L Halligan
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Hernan Dopazo
- Biomedical Genomics & Evolution Laboratory, Departamento de Ecología, Genética y Evolución, IEGEBA (CONICET-UBA), Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Argentina
- Arcadi Navarro
- Departament de Ciències Experimentals i la Salut, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Barcelona, Spain; National Institute for Bioinformatics (INB), PRBB, Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Barcelona, Spain; Center for Genomic Regulation (CRG), PRBB, Barcelona, Spain
- Elena Bosch
- Departament de Ciències Experimentals i la Salut, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Barcelona, Spain
39
Garud NR, Messer PW, Buzbas EO, Petrov DA. Recent selective sweeps in North American Drosophila melanogaster show signatures of soft sweeps. PLoS Genet 2015; 11:e1005004. [PMID: 25706129 PMCID: PMC4338236 DOI: 10.1371/journal.pgen.1005004] [Citation(s) in RCA: 305] [Impact Index Per Article: 30.5] [Received: 09/15/2014] [Accepted: 01/14/2015] [Indexed: 11/18/2022]
Abstract
Adaptation from standing genetic variation or recurrent de novo mutation in large populations should commonly generate soft rather than hard selective sweeps. In contrast to a hard selective sweep, in which a single adaptive haplotype rises to high population frequency, in a soft selective sweep multiple adaptive haplotypes sweep through the population simultaneously, producing distinct patterns of genetic variation in the vicinity of the adaptive site. Current statistical methods were expressly designed to detect hard sweeps and most lack power to detect soft sweeps. This is particularly unfortunate for the study of adaptation in species such as Drosophila melanogaster, where all three confirmed cases of recent adaptation resulted in soft selective sweeps and where there is evidence that the effective population size relevant for recent and strong adaptation is large enough to generate soft sweeps even when adaptation requires mutation at a specific single site at a locus. Here, we develop a statistical test based on a measure of haplotype homozygosity (H12) that is capable of detecting both hard and soft sweeps with similar power. We use H12 to identify multiple genomic regions that have undergone recent and strong adaptation in a large population sample of fully sequenced Drosophila melanogaster strains from the Drosophila Genetic Reference Panel (DGRP). Visual inspection of the top 50 candidates reveals that in all cases multiple haplotypes are present at high frequencies, consistent with signatures of soft sweeps. We further develop a second haplotype homozygosity statistic (H2/H1) that, in combination with H12, is capable of differentiating hard from soft sweeps. Surprisingly, we find that the H12 and H2/H1 values for all top 50 peaks are much more easily generated by soft rather than hard sweeps. We discuss the implications of these results for the study of adaptation in Drosophila and in species with large census population sizes. 
Evolutionary adaptation is a process in which beneficial mutations increase in frequency in response to selective pressures. If these mutations were previously rare or absent from the population, adaptation should generate a characteristic signature in the genetic diversity around the adaptive locus, known as a selective sweep. Such selective sweeps can be distinguished into hard selective sweeps, where only a single adaptive mutation rises in frequency, or soft selective sweeps, where multiple adaptive mutations at the same locus sweep through the population simultaneously. Here we design a new statistical method that can identify both hard and soft sweeps in population genomic data and apply this method to a Drosophila melanogaster population genomic dataset consisting of 145 sequenced strains collected in North Carolina. We find that selective sweeps were abundant in the recent history of this population. Interestingly, we also find that practically all of the strongest and most recent sweeps show patterns that are more consistent with soft rather than hard sweeps. We discuss the implications of these findings for the discovery and quantification of adaptation from population genomic data in Drosophila and other species with large population sizes.
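The haplotype homozygosity statistics named in this abstract can be illustrated with a minimal sketch. It assumes the usual definitions: with haplotype frequencies sorted as p1 ≥ p2 ≥ ..., H1 is the sum of squared frequencies, H12 pools the two most common haplotypes before squaring, and H2 is H1 without the most common haplotype. The function name and the toy windows are hypothetical; this is not the authors' implementation.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return H1, H12 and H2/H1 for the haplotypes observed in one window.

    `haplotypes` is a list of hashable haplotype labels (for example,
    strings of alleles), one per sampled chromosome. Hypothetical helper.
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype")
    n = len(haplotypes)
    # Haplotype frequencies, most common first.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)                       # plain homozygosity
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])  # top two pooled
    h2 = h1 - p1 * p1                                    # drop the top haplotype
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}


# Toy windows: one dominant haplotype versus two common haplotypes.
hard_like = haplotype_homozygosity(["A"] * 8 + ["B", "C"])
soft_like = haplotype_homozygosity(["A"] * 4 + ["B"] * 4 + ["C", "D"])
```

In these toy windows both cases give an elevated H12, while H2/H1 is small when a single haplotype dominates and larger when two haplotypes are common, which is the contrast the abstract describes.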
Affiliation(s)
- Nandita R. Garud
- Department of Genetics, Stanford University, Stanford, California, United States of America
- Department of Biology, Stanford University, Stanford, California, United States of America
- * E-mail: (NRG); (DAP)
| | - Philipp W. Messer
- Department of Biology, Stanford University, Stanford, California, United States of America
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Erkan O. Buzbas
- Department of Biology, Stanford University, Stanford, California, United States of America
- Department of Statistical Science, University of Idaho, Moscow, Idaho, United States of America
| | - Dmitri A. Petrov
- Department of Biology, Stanford University, Stanford, California, United States of America
- * E-mail: (NRG); (DAP)
|
40
|
Guillén Y, Rius N, Delprat A, Williford A, Muyas F, Puig M, Casillas S, Ràmia M, Egea R, Negre B, Mir G, Camps J, Moncunill V, Ruiz-Ruano FJ, Cabrero J, de Lima LG, Dias GB, Ruiz JC, Kapusta A, Garcia-Mas J, Gut M, Gut IG, Torrents D, Camacho JP, Kuhn GCS, Feschotte C, Clark AG, Betrán E, Barbadilla A, Ruiz A. Genomics of ecological adaptation in cactophilic Drosophila. Genome Biol Evol 2014; 7:349-66. [PMID: 25552534 PMCID: PMC4316639 DOI: 10.1093/gbe/evu291] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Indexed: 01/06/2023]
Abstract
Cactophilic Drosophila species provide a valuable model to study gene–environment interactions and ecological adaptation. Drosophila buzzatii and Drosophila mojavensis are two cactophilic species that belong to the repleta group, but have very different geographical distributions and primary host plants. To investigate the genomic basis of ecological adaptation, we sequenced the genome and developmental transcriptome of D. buzzatii and compared its gene content with that of D. mojavensis and two other noncactophilic Drosophila species in the same subgenus. The newly sequenced D. buzzatii genome (161.5 Mb) comprises 826 scaffolds (>3 kb) and contains 13,657 annotated protein-coding genes. Using RNA sequencing data of five life-stages we found expression of 15,026 genes, 80% protein-coding genes, and 20% noncoding RNA genes. In total, we detected 1,294 genes putatively under positive selection. Interestingly, among genes under positive selection in the D. mojavensis lineage, there is an excess of genes involved in metabolism of heterocyclic compounds that are abundant in Stenocereus cacti and toxic to nonresident Drosophila species. We found 117 orphan genes in the shared D. buzzatii–D. mojavensis lineage. In addition, gene duplication analysis identified lineage-specific expanded families with functional annotations associated with proteolysis, zinc ion binding, chitin binding, sensory perception, ethanol tolerance, immunity, physiology, and reproduction. In summary, we identified genetic signatures of adaptation in the shared D. buzzatii–D. mojavensis lineage, and in the two separate D. buzzatii and D. mojavensis lineages. Many of the novel lineage-specific genomic features are promising candidates for explaining the adaptation of these species to their distinct ecological niches.
Affiliation(s)
- Yolanda Guillén
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
| | - Núria Rius
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
| | - Alejandra Delprat
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
| | - Francesc Muyas
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
| | - Marta Puig
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
| | - Sònia Casillas
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
| | - Miquel Ràmia
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
| | - Raquel Egea
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
| | - Barbara Negre
- EMBL/CRG Research Unit in Systems Biology, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Gisela Mir
- IRTA, Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Barcelona, Spain
- The Peter MacCallum Cancer Centre, East Melbourne, Victoria, Australia
| | - Jordi Camps
- Centro Nacional de Análisis Genómico (CNAG), Parc Científic de Barcelona, Torre I, Barcelona, Spain
| | - Valentí Moncunill
- Barcelona Supercomputing Center (BSC), Edifici TG (Torre Girona), Barcelona, Spain and Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Josefa Cabrero
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Spain
| | - Leonardo G de Lima
- Instituto de Ciências Biológicas, Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Guilherme B Dias
- Instituto de Ciências Biológicas, Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Jeronimo C Ruiz
- Informática de Biossistemas, Centro de Pesquisas René Rachou-Fiocruz Minas, Belo Horizonte, MG, Brazil
| | - Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine
| | - Jordi Garcia-Mas
- IRTA, Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Barcelona, Spain
| | - Marta Gut
- Centro Nacional de Análisis Genómico (CNAG), Parc Científic de Barcelona, Torre I, Barcelona, Spain
| | - Ivo G Gut
- Centro Nacional de Análisis Genómico (CNAG), Parc Científic de Barcelona, Torre I, Barcelona, Spain
| | - David Torrents
- Barcelona Supercomputing Center (BSC), Edifici TG (Torre Girona), Barcelona, Spain and Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Juan P Camacho
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Spain
| | - Gustavo C S Kuhn
- Instituto de Ciências Biológicas, Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine
| | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University
| | - Esther Betrán
- Department of Biology, University of Texas at Arlington
| | - Antonio Barbadilla
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Spain
| | - Alfredo Ruiz
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Spain
|
41
|
Cornejo OE, Fisher D, Escalante AA. Genome-wide patterns of genetic polymorphism and signatures of selection in Plasmodium vivax. Genome Biol Evol 2014; 7:106-19. [PMID: 25523904 PMCID: PMC4316620 DOI: 10.1093/gbe/evu267] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 11/14/2022]
Abstract
Plasmodium vivax is the most prevalent human malaria parasite outside of Africa. Yet, studies aimed to identify genes with signatures consistent with natural selection are rare. Here, we present a comparative analysis of the pattern of genetic variation of five sequenced isolates of P. vivax and its divergence with two closely related species, Plasmodium cynomolgi and Plasmodium knowlesi, using a set of orthologous genes. In contrast to Plasmodium falciparum, the parasite that causes the most lethal form of human malaria, we did not find significant constraints on the evolution of synonymous sites genome wide in P. vivax. The comparative analysis of polymorphism and divergence across loci allowed us to identify 87 genes with patterns consistent with positive selection, including genes involved in the “exportome” of P. vivax, which are potentially involved in evasion of the host immune system. Nevertheless, we have found a pattern of polymorphism genome wide that is consistent with a significant amount of constraint on the replacement changes and prevalent negative selection. Our analyses also show that silent polymorphism tends to be larger toward the ends of the chromosomes, where many genes involved in antigenicity are located, suggesting that natural selection acts not only by shaping the patterns of variation within the genes but it also affects genome organization.
Affiliation(s)
- Omar E Cornejo
- School of Biological Sciences, Washington State University
| | - David Fisher
- Center for Evolutionary Medicine and Informatics, the Biodesign Institute, Arizona State University
| | - Ananias A Escalante
- Center for Evolutionary Medicine and Informatics, the Biodesign Institute, Arizona State University
- School of Life Sciences, Arizona State University
- Present address: Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA.
|
42
|
Gossmann TI, Waxman D, Eyre-Walker A. Fluctuating selection models and McDonald-Kreitman type analyses. PLoS One 2014; 9:e84540. [PMID: 24409303 PMCID: PMC3883665 DOI: 10.1371/journal.pone.0084540] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 08/17/2013] [Accepted: 11/15/2013] [Indexed: 12/02/2022]
Abstract
It is likely that the strength of selection acting upon a mutation varies through time due to changes in the environment. However, most population genetic theory assumes that the strength of selection remains constant. Here we investigate the consequences of fluctuating selection pressures on the quantification of adaptive evolution using McDonald-Kreitman (MK) style approaches. In agreement with previous work, we show that fluctuating selection can generate evidence of adaptive evolution even when the expected strength of selection on a mutation is zero. However, we also find that the mutations that contribute to both polymorphism and divergence tend, on average, to be positively selected during their lifetime under fluctuating selection models. This is because mutations that fluctuate, by chance, to positively selected values tend to reach higher frequencies in the population than those that fluctuate towards negative values. Hence the evidence of positive adaptive evolution detected under a fluctuating selection model by MK type approaches is genuine, since fixed mutations tend to be advantageous on average during their lifetime. Nevertheless, we show that methods tend to underestimate the rate of adaptive evolution when selection fluctuates.
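As a point of reference for the MK-style quantities discussed in this abstract, the sketch below computes the basic, uncorrected MK estimate of the proportion of adaptive nonsynonymous substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). This is the standard textbook form and is only assumed to be the kind of estimator the abstract refers to; it does not model fluctuating selection. The function name and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman estimate of the adaptive fraction (alpha).

    dn, ds: nonsynonymous and synonymous fixed differences (divergence).
    pn, ps: nonsynonymous and synonymous polymorphisms.
    Hypothetical helper; no correction for slightly deleterious variants.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)


# Illustrative counts only, not taken from any study.
alpha_example = mk_alpha(dn=40, ds=100, pn=20, ps=100)  # 1 - 2000/4000
neutral_example = mk_alpha(dn=20, ds=100, pn=20, ps=100)  # Dn/Ds equals Pn/Ps
```

When Dn/Ds equals Pn/Ps the estimate is zero, and an excess of nonsynonymous divergence relative to polymorphism pushes it towards one.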
Affiliation(s)
- Toni I. Gossmann
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
| | - David Waxman
- Centre for Computational Systems Biology, Fudan University, Shanghai, China
| | - Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
|
43
|
Loire E, Chiari Y, Bernard A, Cahais V, Romiguier J, Nabholz B, Lourenço JM, Galtier N. Population genomics of the endangered giant Galápagos tortoise. Genome Biol 2013; 14:R136. [PMID: 24342523 PMCID: PMC4053747 DOI: 10.1186/gb-2013-14-12-r136] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 05/28/2013] [Accepted: 12/16/2013] [Indexed: 11/16/2022]
Abstract
BACKGROUND: The giant Galápagos tortoise, Chelonoidis nigra, is a large-sized terrestrial chelonian of high patrimonial interest. The species recently colonized a small continental archipelago, the Galápagos Islands, where it has been facing novel environmental conditions and limited resource availability. To explore the genomic consequences of this ecological shift, we analyze the transcriptomic variability of five individuals of C. nigra, and compare it to similar data obtained from several continental species of turtles. RESULTS: Having clarified the timing of divergence in the Chelonoidis genus, we report in C. nigra a very low level of genetic polymorphism, signatures of a weakened efficacy of purifying selection, and an elevated mutation load in coding and regulatory sequences. These results are consistent with the hypothesis of an extremely low long-term effective population size in this insular species. Functional evolutionary analyses reveal a reduced diversity of immunity genes in C. nigra, in line with the hypothesis of attenuated pathogen diversity in islands, and an increased selective pressure on genes involved in response to stress, potentially related to the climatic instability of its environment and its elongated lifespan. Finally, we detect no population structure or homozygosity excess in our five-individual sample. CONCLUSIONS: These results enlighten the molecular evolution of an endangered taxon in a stressful environment and point to island endemic species as a promising model for the study of the deleterious effects on genome evolution of a reduced long-term population size.
Affiliation(s)
- Etienne Loire
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
| | - Ylenia Chiari
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, 4485-661 Vairão, Portugal
| | - Aurélien Bernard
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
| | - Vincent Cahais
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
| | - Jonathan Romiguier
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
| | - Benoît Nabholz
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
| | - Joao Miguel Lourenço
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
| | - Nicolas Galtier
- Université Montpellier 2, CNRS UMR 5554, Institut des Sciences de l’Evolution de Montpellier, Place E. Bataillon, 34095 Montpellier, France
|
44
|
Eckert AJ, Bower AD, Jermstad KD, Wegrzyn JL, Knaus BJ, Syring JV, Neale DB. Multilocus analyses reveal little evidence for lineage-wide adaptive evolution within major clades of soft pines (Pinus subgenus Strobus). Mol Ecol 2013; 22:5635-50. [PMID: 24134614 DOI: 10.1111/mec.12514] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 04/09/2012] [Revised: 08/27/2013] [Accepted: 08/29/2013] [Indexed: 12/26/2022]
Abstract
Estimates from molecular data for the fraction of new nonsynonymous mutations that are adaptive vary strongly across plant species. Much of this variation is due to differences in life history strategies as they influence the effective population size (Ne). Ample variation for these estimates, however, remains even when comparisons are made across species with similar values of Ne. An open question thus remains as to why the large disparity for estimates of adaptive evolution exists among plant species. Here, we have estimated the distribution of deleterious fitness effects (DFE) and the fraction of adaptive nonsynonymous substitutions (α) for 11 species of soft pines (subgenus Strobus) using DNA sequence data from 167 orthologous nuclear gene fragments. Most newly arising nonsynonymous mutations were inferred to be so strongly deleterious that they would rarely become fixed. Little evidence for long-term adaptive evolution was detected, as all 11 estimates for α were not significantly different from zero. Nucleotide diversity at synonymous sites, moreover, was strongly correlated with attributes of the DFE across species, thus illustrating a strong consistency with the expectations from the Nearly Neutral Theory of molecular evolution. Application of these patterns to genome-wide expectations for these species, however, was difficult as the loci chosen for the analysis were a biased set of conserved loci, which greatly influenced the estimates of the DFE and α. This implies that genome-wide parameter estimates will need truly genome-wide data, so that many of the existing patterns documented previously for forest trees (e.g. little evidence for signature of selection) may need revision.
Affiliation(s)
- Andrew J Eckert
- Department of Biology, Virginia Commonwealth University, Richmond, VA, 23284, USA
|
45
|
Chong Z, Zhai W, Li C, Gao M, Gong Q, Ruan J, Li J, Jiang L, Lv X, Hungate E, Wu CI. The evolution of small insertions and deletions in the coding genes of Drosophila melanogaster. Mol Biol Evol 2013; 30:2699-708. [PMID: 24077769 DOI: 10.1093/molbev/mst167] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/24/2023]
Abstract
Studies of protein evolution have focused on amino acid substitutions with much less systematic analysis on insertions and deletions (indels) in protein coding genes. We hence surveyed 7,500 genes between Drosophila melanogaster and D. simulans, using D. yakuba as an outgroup for this purpose. The evolutionary rate of coding indels is indeed low, at only 3% of that of nonsynonymous substitutions. As coding indels follow a geometric distribution in size and tend to fall in low-complexity regions of proteins, it is unclear whether selection or mutation underlies this low rate. To resolve the issue, we collected genomic sequences from an isogenic African line of D. melanogaster (ZS30) at a high coverage of 70× and analyzed indel polymorphism between ZS30 and the reference genome. In comparing polymorphism and divergence, we found that the divergence to polymorphism ratio (i.e., fixation index) for smaller indels (size ≤ 10 bp) is very similar to that for synonymous changes, suggesting that most of the within-species polymorphism and between-species divergence for indels are selectively neutral. Interestingly, deletions of larger sizes (size ≥ 11 bp and ≤ 30 bp) have a much higher fixation index than synonymous mutations and 44.4% of fixed middle-sized deletions are estimated to be adaptive. To our surprise, this pattern is not found for insertions. Protein indel evolution appears to be in a dynamic flux of neutrally driven expansion (insertions) together with adaptive-driven contraction (deletions), and these observations provide important insights for understanding the fitness of new mutations as well as the evolutionary driving forces for genomic evolution in Drosophila species.
Affiliation(s)
- Zechen Chong
- Center for Computational Biology and Laboratory of Disease Genomics and Individualized Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
|
46
|
Arunkumar R, Josephs EB, Williamson RJ, Wright SI. Pollen-specific, but not sperm-specific, genes show stronger purifying selection and higher rates of positive selection than sporophytic genes in Capsella grandiflora. Mol Biol Evol 2013; 30:2475-86. [PMID: 23997108 DOI: 10.1093/molbev/mst149] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 12/13/2022]
Abstract
Selection on the gametophyte can be a major force shaping plant genomes as 7-11% of genes are expressed only in that phase and 60% of genes are expressed in both the gametophytic and sporophytic phases. The efficacy of selection on gametophytic tissues is likely to be influenced by sexual selection acting on male and female functions of hermaphroditic plants. Moreover, the haploid nature of the gametophytic phase allows selection to be efficient in removing recessive deleterious mutations and fixing recessive beneficial mutations. To assess the importance of gametophytic selection, we compared the strength of purifying selection and extent of positive selection on gametophyte- and sporophyte-specific genes in the highly outcrossing plant Capsella grandiflora. We found that pollen-exclusive genes had a larger fraction of sites under strong purifying selection, a greater proportion of adaptive substitutions, and faster protein evolution compared with seedling-exclusive genes. In contrast, sperm cell-exclusive genes had a smaller fraction of sites under strong purifying selection, a lower proportion of adaptive substitutions, and slower protein evolution compared with seedling-exclusive genes. Observations of strong selection acting on pollen-expressed genes are likely explained by sexual selection resulting from pollen competition aided by the haploid nature of that tissue. The relaxation of selection in sperm might be due to the reduced influence of intrasexual competition, but reduced gene expression may also be playing an important role.
Affiliation(s)
- Ramesh Arunkumar
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
|
47
|
Leushkin EV, Bazykin GA, Kondrashov AS. Strong mutational bias toward deletions in the Drosophila melanogaster genome is compensated by selection. Genome Biol Evol 2013; 5:514-24. [PMID: 23395983 PMCID: PMC3622295 DOI: 10.1093/gbe/evt021] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 12/22/2022]
Abstract
Insertions and deletions (collectively indels) obviously have a major impact on genome evolution. However, before large-scale data on indel polymorphism became available, it was difficult to estimate the strength of selection acting on indel mutations. Here, we analyze indel polymorphism and divergence in different compartments of the Drosophila melanogaster genome: exons, introns of different lengths, and intergenic regions. Data on low-frequency polymorphisms indicate that 0.036–0.039 short (1–30 nt) insertion mutations and 0.085–0.092 short deletion mutations, with mean lengths 3.23 and 4.78, respectively, occur per single-nucleotide substitution. The excess of short deletion over short insertion mutations implies that indel mutations of these lengths should lead to a loss of approximately 0.30 nt per single-nucleotide replacement. However, polymorphism and divergence data show that this deletion bias is almost completely compensated by selection: Negative selection is stronger against deletions, whereas insertions are more likely to be favored by positive selection. Among the inframe low-frequency polymorphic mutations in exons, long introns, and intergenic regions, selection prevents a larger fraction of deletions (80–87%, depending on the type of the compartment) than of insertions (70–82%) or single-nucleotide substitutions (49–73%), from reaching high frequencies. The corresponding fractions were the lowest in short introns: 66%, 47%, and 15%, respectively, consistent with the weakest selective constraint in them. The McDonald–Kreitman test shows that 32–46% of the deletions and 60–73% of the insertions that were fixed in the recent evolution of D. melanogaster are adaptive, whereas this fraction is only 0–29% for single-nucleotide substitutions.
Affiliation(s)
- Evgeny V Leushkin
- Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia.
|
48
|
Arbiza L, Gronau I, Aksoy BA, Hubisz MJ, Gulko B, Keinan A, Siepel A. Genome-wide inference of natural selection on human transcription factor binding sites. Nat Genet 2013; 45:723-9. [PMID: 23749186 DOI: 10.1038/ng.2658] [Citation(s) in RCA: 88] [Impact Index Per Article: 7.3] [Received: 01/18/2013] [Accepted: 05/08/2013] [Indexed: 11/09/2022]
Abstract
For decades, it has been hypothesized that gene regulation has had a central role in human evolution, yet much remains unknown about the genome-wide impact of regulatory mutations. Here we use whole-genome sequences and genome-wide chromatin immunoprecipitation and sequencing data to demonstrate that natural selection has profoundly influenced human transcription factor binding sites since the divergence of humans from chimpanzees 4-6 million years ago. Our analysis uses a new probabilistic method, called INSIGHT, for measuring the influence of selection on collections of short, interspersed noncoding elements. We find that, on average, transcription factor binding sites have experienced somewhat weaker selection than protein-coding genes. However, the binding sites of several transcription factors show clear evidence of adaptation. Several measures of selection are strongly correlated with predicted binding affinity. Overall, regulatory elements seem to contribute substantially to both adaptive substitutions and deleterious polymorphisms with key implications for human evolution and disease.
Affiliation(s)
- Leonardo Arbiza
- Department of Biological Statistics & Computational Biology, Cornell University, Ithaca, NY, USA
|
49
|
Reference-free population genomics from next-generation transcriptome data and the vertebrate-invertebrate gap. PLoS Genet 2013; 9:e1003457. [PMID: 23593039 PMCID: PMC3623758 DOI: 10.1371/journal.pgen.1003457] [Citation(s) in RCA: 122] [Impact Index Per Article: 10.2] [Received: 09/18/2012] [Accepted: 03/04/2013] [Indexed: 01/19/2023]
Abstract
In animals, the population genomic literature is dominated by two taxa, namely mammals and drosophilids, in which fully sequenced, well-annotated genomes have been available for years. Data from other metazoan phyla are scarce, probably because the vast majority of living species still lack a closely related reference genome. Here we achieve de novo, reference-free population genomic analysis from wild samples in five non-model animal species, based on next-generation sequencing transcriptome data. We introduce a pipeline for cDNA assembly, read mapping, SNP/genotype calling, and data cleaning, with specific focus on the issue of hidden paralogy detection. In two species for which a reference genome is available, similar results were obtained whether the reference was used or not, demonstrating the robustness of our de novo inferences. The population genomic profiles of a hare, a turtle, an oyster, a tunicate, and a termite were found to be intermediate between those of human and Drosophila, indicating that the discordant genomic diversity patterns that have been reported between these two species do not reflect a generalized vertebrate versus invertebrate gap. The genomic average diversity was generally higher in invertebrates than in vertebrates (with the notable exception of termite), in agreement with the notion that population size tends to be larger in the former than in the latter. The non-synonymous to synonymous ratio, however, did not differ significantly between vertebrates and invertebrates, even though it was negatively correlated with genetic diversity within each of the two groups. This study opens promising perspectives regarding genome-wide population analyses of non-model organisms and the influence of population size on non-synonymous versus synonymous diversity.
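The diversity measures mentioned in this abstract can be made concrete with a minimal sketch of nucleotide diversity (the mean number of pairwise differences per site) and of a non-synonymous to synonymous diversity ratio. It assumes pre-aligned, equal-length sequences and that per-class diversities have already been computed per site; the function names and toy sequences are hypothetical and do not reproduce the pipeline described in the paper.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site for aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions over every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def diversity_ratio(pi_n, pi_s):
    """Ratio of non-synonymous to synonymous diversity (both per site)."""
    if pi_s <= 0:
        raise ValueError("synonymous diversity must be positive")
    return pi_n / pi_s


# Toy alignment: pairwise differences are 1, 2 and 1 over 4 sites.
pi_example = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
ratio_example = diversity_ratio(pi_n=0.001, pi_s=0.01)
```

A real analysis would also need to count non-synonymous and synonymous sites separately and handle missing data, which this sketch leaves out.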
|
50
|
ZRT1 Harbors an Excess of Nonsynonymous Polymorphism and Shows Evidence of Balancing Selection in Saccharomyces cerevisiae. G3-GENES GENOMES GENETICS 2013; 3:665-673. [PMID: 23550117 PMCID: PMC3618353 DOI: 10.1534/g3.112.005082] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/02/2022]
Abstract
Estimates of the fraction of nucleotide substitutions driven by positive selection vary widely across different species. Accounting for different estimates of positive selection has been difficult, in part because selection on polymorphism within a species is known to obscure a signal of positive selection among species. While methods have been developed to control for the confounding effects of negative selection against deleterious polymorphism, the impact of balancing selection on estimates of positive selection has not been assessed. In Saccharomyces cerevisiae, there is no signal of positive selection within protein coding sequences as the ratio of nonsynonymous to synonymous polymorphism is higher than that of divergence. To investigate the impact of balancing selection on estimates of positive selection, we examined five genes with high rates of nonsynonymous polymorphism in S. cerevisiae relative to divergence from S. paradoxus. One of the genes, the high-affinity zinc transporter ZRT1 showed an elevated rate of synonymous polymorphism indicative of balancing selection. The high rate of synonymous polymorphism coincided with nonsynonymous divergence among three haplotype groups, among which we found no detectable differences in ZRT1 function. Our results implicate balancing selection in one of five genes exhibiting a large excess of nonsynonymous polymorphism in yeast. We conclude that balancing selection is a potentially important factor in estimating the frequency of positive selection across the yeast genome.
|