1
Yamashita H, Matsumoto T, Kawashima K, Abdulla Daanaa HS, Yang Z, Akashi H. Dinucleotide preferences underlie apparent codon preference reversals in the Drosophila melanogaster lineage. Proc Natl Acad Sci U S A 2025; 122:e2419696122. [PMID: 40402244 DOI: 10.1073/pnas.2419696122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Accepted: 04/21/2025] [Indexed: 05/23/2025]
Abstract
We employ fine-scale population genetic analyses to reveal dynamics among interacting forces that act at synonymous sites and introns among closely related Drosophila species. Synonymous codon usage bias has proven to be well suited for population genetic inference. Under major codon preference (MCP), translationally superior "major" codons confer fitness benefits relative to their less efficiently and/or accurately decoded synonymous counterparts. Our codon family and lineage-specific analyses expand on previous findings in the Drosophila simulans lineage; patterns in naturally occurring polymorphism demonstrate fixation biases toward GC-ending codons that are consistent in direction, but heterogeneous in magnitude, among synonymous families. These forces are generally stronger than fixation biases in intron sequences. In contrast, population genetic analyses reveal unexpected evidence of codon preference reversals in the Drosophila melanogaster lineage. Codon family-specific polymorphism patterns support reduced efficacy of natural selection in most synonymous families but indicate reversals of favored states in the four codon families encoded by NAY. Accelerated synonymous fixations in favor of NAT and greater differences for both allele frequencies and fixation rates among X-linked, relative to autosomal, loci bolster support for fitness effect reversals. The specificity of preference reversals to codons whose cognate tRNAs undergo wobble position queuosine modification is intriguing. However, our analyses reveal prevalent dinucleotide preferences for ApT over ApC that act in opposition to GC-favoring forces in both coding and intron regions. We present evidence that changes in the relative efficacy of translational selection and dinucleotide preference underlie apparent codon preference reversals.
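The NAY families named in the abstract are the four two-fold families whose codons have A at the second position and T or C at the third: Asp (GAY), His (CAY), Asn (AAY), and Tyr (TAY). As a minimal sketch, assuming an in-frame coding sequence supplied as a plain string, the following tallies NAT versus NAC usage per family. The function name `count_nay_codons` and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The four NAY codon families (second position A, third position T or C).
NAY_FAMILIES = {
    "Asp": ("GAT", "GAC"),
    "His": ("CAT", "CAC"),
    "Asn": ("AAT", "AAC"),
    "Tyr": ("TAT", "TAC"),
}


def count_nay_codons(cds):
    """Return {family: (NAT count, NAC count)} for an in-frame coding sequence."""
    cds = cds.upper()
    # Split into complete codons; a trailing partial codon is ignored.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    tally = Counter(codons)
    return {fam: (tally[nat], tally[nac]) for fam, (nat, nac) in NAY_FAMILIES.items()}


# Hypothetical toy sequence, for illustration only.
example = "ATGGATGACTATCACAATTAA"
counts = count_nay_codons(example)
nat_total = sum(nat for nat, _ in counts.values())
nac_total = sum(nac for _, nac in counts.values())
print(counts)
print("NAT:", nat_total, "NAC:", nac_total)
```

Raw counts like these describe composition only. Inferring which state is favored requires the polymorphism and fixation comparisons the abstract describes.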
Affiliation(s)
- Haruka Yamashita
- Laboratory of Evolutionary Genetics, Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Department of Genetics, The Graduate University for Advanced Studies, SOKENDAI, Mishima, Shizuoka 411-8540, Japan
- Tomotaka Matsumoto
- Laboratory of Evolutionary Genetics, Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Department of Genetics, The Graduate University for Advanced Studies, SOKENDAI, Mishima, Shizuoka 411-8540, Japan
- Kent Kawashima
- Laboratory of Evolutionary Genetics, Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Hassan Sibroe Abdulla Daanaa
- Laboratory of Evolutionary Genetics, Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Department of Genetics, The Graduate University for Advanced Studies, SOKENDAI, Mishima, Shizuoka 411-8540, Japan
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, United Kingdom
- Hiroshi Akashi
- Laboratory of Evolutionary Genetics, Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Department of Genetics, The Graduate University for Advanced Studies, SOKENDAI, Mishima, Shizuoka 411-8540, Japan
2
Gupta MK, Vadde R. Next-generation development and application of codon model in evolution. Front Genet 2023; 14:1091575. [PMID: 36777719 PMCID: PMC9911445 DOI: 10.3389/fgene.2023.1091575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 01/17/2023] [Indexed: 01/28/2023]
Abstract
To date, numerous nucleotide, amino acid, and codon substitution models have been developed to estimate the evolutionary history of any sequence/organism in a more comprehensive way. Of these three, the codon substitution model is the most powerful. These models have been used extensively to detect selective pressure on a protein and codon usage bias, and for ancestral and phylogenetic reconstruction. However, because codon substitution models are more computationally demanding than nucleotide and amino acid substitution models, only a few studies have employed them to understand the heterogeneity of the evolutionary process in a genome-scale analysis. Hence, there is always a question of how to develop more robust but less computationally demanding codon substitution models to get more accurate results. In this review article, the authors attempted to understand the basis of the development of different types of codon substitution models and how this information can be utilized to develop more robust but less computationally demanding codon substitution models. The codon substitution model makes it possible to detect the selection regime under which any gene or gene region is evolving, codon usage bias in any organism or tissue-specific region, and phylogenetic relationships between different lineages more accurately than nucleotide and amino acid substitution models. Thus, in the near future, these codon models can be utilized in the fields of conservation, breeding and medicine.
3
Murga-Moreno J, Coronado-Zamora M, Casillas S, Barbadilla A. impMKT: the imputed McDonald and Kreitman test, a straightforward correction that significantly increases the evidence of positive selection of the McDonald and Kreitman test at the gene level. G3 GENES|GENOMES|GENETICS 2022; 12:6670623. [PMID: 35976111 PMCID: PMC9526038 DOI: 10.1093/g3journal/jkac206] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/21/2022] [Accepted: 07/28/2022] [Indexed: 11/14/2022]
Abstract
The McDonald and Kreitman test is one of the most powerful and widely used methods to detect and quantify recurrent natural selection in DNA sequence data. One of its main limitations is the underestimation of positive selection due to the presence of slightly deleterious variants segregating at low frequencies. Although several approaches have been developed to overcome this limitation, most of them work on gene pooled analyses. Here, we present the imputed McDonald and Kreitman test (impMKT), a new straightforward approach for the detection of positive selection and other selection components of the distribution of fitness effects at the gene level. We compare imputed McDonald and Kreitman test with other widely used McDonald and Kreitman test approaches considering both simulated and empirical data. By applying imputed McDonald and Kreitman test to humans and Drosophila data at the gene level, we substantially increase the statistical evidence of positive selection with respect to previous approaches (e.g. by 50% and 157% compared with the McDonald and Kreitman test in Drosophila and humans, respectively). Finally, we review the minimum number of genes required to obtain a reliable estimation of the proportion of adaptive substitution (α) in gene pooled analyses by using the imputed McDonald and Kreitman test compared with other McDonald and Kreitman test implementations. Because of its simplicity and increased power to detect recurrent positive selection on genes, we propose the imputed McDonald and Kreitman test as the first straightforward approach for testing specific evolutionary hypotheses at the gene level. The software implementation and population genomics data are available at the web-server imkt.uab.cat.
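For orientation, the quantity α discussed in this abstract is conventionally estimated from the four counts of a McDonald and Kreitman table as α = 1 − (Ds·Pn)/(Dn·Ps). The sketch below shows only this standard estimator, not the imputation step that impMKT adds. The function name `mk_alpha` and the counts are hypothetical.

```python
def mk_alpha(pn, ps, dn, ds):
    """Standard MK estimate of the proportion of adaptive substitutions.

    pn, ps: nonsynonymous and synonymous polymorphism counts
    dn, ds: nonsynonymous and synonymous divergence (fixed difference) counts
    """
    if ps == 0 or dn == 0:
        raise ValueError("alpha is undefined when Ps or Dn is zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, for illustration only.
alpha = mk_alpha(pn=10, ps=20, dn=30, ds=20)
print(round(alpha, 3))
```

Slightly deleterious nonsynonymous variants segregating at low frequency inflate Pn and pull this estimate downward, which is the underestimation the abstract refers to.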
Affiliation(s)
- Jesús Murga-Moreno
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Marta Coronado-Zamora
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Sònia Casillas
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Antonio Barbadilla
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
4
Bubnell JE, Ulbing CKS, Fernandez Begne P, Aquadro CF. Functional Divergence of the bag-of-marbles Gene in the Drosophila melanogaster Species Group. Mol Biol Evol 2022; 39:6609986. [PMID: 35714266 PMCID: PMC9250105 DOI: 10.1093/molbev/msac137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
In Drosophila melanogaster, a key germline stem cell (GSC) differentiation factor, bag of marbles (bam) shows rapid bursts of amino acid fixations between sibling species D. melanogaster and Drosophila simulans, but not in the outgroup species Drosophila ananassae. Here, we test the null hypothesis that bam's differentiation function is conserved between D. melanogaster and four additional Drosophila species in the melanogaster species group spanning approximately 30 million years of divergence. Surprisingly, we demonstrate that bam is not necessary for oogenesis or spermatogenesis in Drosophila teissieri nor is bam necessary for spermatogenesis in D. ananassae. Remarkably bam function may change on a relatively short time scale. We further report tests of neutral sequence evolution at bam in additional species of Drosophila and find a positive, but not perfect, correlation between evidence for positive selection at bam and its essential role in GSC regulation and fertility for both males and females. Further characterization of bam function in more divergent lineages will be necessary to distinguish between bam's critical gametogenesis role being newly derived in D. melanogaster, D. simulans, Drosophila yakuba, and D. ananassae females or it being basal to the genus and subsequently lost in numerous lineages.
Affiliation(s)
- Cynthia K S Ulbing
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
5
Lin KP, Chaw SM, Lo YH, Kinjo T, Tung CY, Cheng HC, Liu Q, Satta Y, Izawa M, Chen SF, Ko WY. Genetic Differentiation and Demographic Trajectory of the Insular Formosan and Orii's Flying Foxes. J Hered 2021; 112:192-203. [PMID: 33675222 PMCID: PMC8006818 DOI: 10.1093/jhered/esab007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2020] [Accepted: 02/24/2021] [Indexed: 12/04/2022]
Abstract
Insular flying foxes are keystone species in island ecosystems due to their critical roles in plant pollination and seed dispersal. These species are vulnerable to population decline because of their small populations and low reproductive rates. The Formosan flying fox (Pteropus dasymallus formosus) is one of the 5 subspecies of the Ryukyu flying fox. Pteropus dasymallus formosus has suffered from a severe decline and is currently recognized as a critically endangered population in Taiwan. On the contrary, the Orii's flying fox (Pteropus dasymallus inopinatus) is a relatively stable population inhabiting Okinawa Island. Here, we applied a genomic approach called double digest restriction-site associated DNA sequencing to study these 2 subspecies for a total of 7 individuals. We detected significant genetic structure between the 2 populations. Despite their contrasting contemporary population sizes, both populations harbor very low degrees of genetic diversity. We further inferred their demographic history based on the joint folded site frequency spectrum and revealed that both P. d. formosus and P. d. inopinatus had maintained small population sizes for a long period of time after their divergence. Recently, these populations experienced distinct trajectories of demographic changes. While P. d. formosus suffered from a drastic ~10-fold population decline not long ago, P. d. inopinatus underwent a ~4.5-fold population expansion. Our results suggest separate conservation management for the 2 populations: population recovery is urgently needed for P. d. formosus, while long-term monitoring for adverse genetic effects should be considered for P. d. inopinatus.
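The folded site frequency spectrum mentioned in this abstract counts sites by minor-allele count, which avoids having to know which allele is ancestral. The sketch below builds the one-dimensional folded spectrum for a single population. The joint spectrum used in the paper extends the same idea to two populations. The function name `folded_sfs` and the input counts are hypothetical.

```python
def folded_sfs(alt_counts, n_chromosomes):
    """Folded SFS: entry i is the number of sites whose minor allele count is i.

    alt_counts: per-site count of one (arbitrarily chosen) allele
    n_chromosomes: number of sampled chromosomes, assumed equal at every site
    """
    sfs = [0] * (n_chromosomes // 2 + 1)
    for k in alt_counts:
        if not 0 <= k <= n_chromosomes:
            raise ValueError("allele count outside 0..n_chromosomes")
        # Folding: the minor allele is whichever of the two is rarer.
        sfs[min(k, n_chromosomes - k)] += 1
    return sfs


# Hypothetical allele counts at four sites in a sample of six chromosomes.
spectrum = folded_sfs([1, 3, 2, 5], n_chromosomes=6)
print(spectrum)
```

Entry 0 holds monomorphic sites, and counts of 1 and 5 fall into the same class because the spectrum is folded.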
Affiliation(s)
- Kung-Ping Lin
- Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Shu-Miaw Chaw
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Yun-Hwa Lo
- Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chien-Yi Tung
- Cancer Progression Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Quintin Liu
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan
- Yoko Satta
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan
- Masako Izawa
- Kitakyushu Museum of Natural History and Human History, Fukuoka, Japan
- Shiang-Fan Chen
- Center for General Education, National Taipei University, New Taipei City, Taiwan
- Wen-Ya Ko
- Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan
6
Amei A, Xu J. Inference of genetic forces using a Poisson random field model with non-constant population size. J Stat Plan Inference 2019. [DOI: 10.1016/j.jspi.2019.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
7
Castellano D, James J, Eyre-Walker A. Nearly Neutral Evolution across the Drosophila melanogaster Genome. Mol Biol Evol 2019; 35:2685-2694. [PMID: 30418639 DOI: 10.1093/molbev/msy164] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/13/2022]
Abstract
Under the nearly neutral theory of molecular evolution, the proportion of effectively neutral mutations is expected to depend upon the effective population size (Ne). Here, we investigate whether this is the case across the genome of Drosophila melanogaster using polymorphism data from North American and African lines. We show that the ratio of the number of nonsynonymous and synonymous polymorphisms is negatively correlated to the number of synonymous polymorphisms, even when the nonindependence is accounted for. The relationship is such that the proportion of effectively neutral nonsynonymous mutations increases by ∼45% as Ne is halved. However, we also show that this relationship is steeper than expected from an independent estimate of the distribution of fitness effects from the site frequency spectrum. We investigate a number of potential explanations for this and show, using simulation, that this is consistent with a model of genetic hitchhiking: Genetic hitchhiking depresses diversity at neutral and weakly selected sites, but has little effect on the diversity of strongly selected sites.
Affiliation(s)
- David Castellano
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Jennifer James
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
8
Amei A, Zhou S. Inferring the distribution of selective effects from a time inhomogeneous model. PLoS One 2019; 14:e0194709. [PMID: 30657757 PMCID: PMC6338356 DOI: 10.1371/journal.pone.0194709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2017] [Accepted: 03/08/2018] [Indexed: 11/18/2022]
Abstract
We have developed a Poisson random field model for estimating the distribution of selective effects of newly arisen nonsynonymous mutations that could be observed as polymorphism or divergence in samples of two related species under the assumption that the two species populations are not at mutation-selection-drift equilibrium. The model is applied to 91 Drosophila genes by comparing levels of polymorphism in an African population of D. melanogaster with divergence to a reference strain of D. simulans. Based on the difference of gene expression level between testes and ovaries, the 91 genes were classified as 33 male-biased, 28 female-biased, and 30 sex-unbiased genes. Under a Bayesian framework, Markov chain Monte Carlo simulations are implemented to the model in which the distribution of selective effects is assumed to be Gaussian with a mean that may differ from one gene to the other to sample key parameters. Based on our estimates, the majority of newly-arisen nonsynonymous mutations that could contribute to polymorphism or divergence in Drosophila species are mildly deleterious with a mean scaled selection coefficient of -2.81, while almost 86% of the fixed differences between species are driven by positive selection. There are only 16.6% of the nonsynonymous mutations observed in sex-unbiased genes that are under positive selection in comparison to 30% of male-biased and 46% of female-biased genes that are beneficial. We also estimated that D. melanogaster and D. simulans may have diverged 1.72 million years ago.
Affiliation(s)
- Amei Amei
- Department of Mathematical Sciences, University of Nevada, Las Vegas, Nevada, United States of America
- Shilei Zhou
- 54 Crescent Ave, Apt G, Dorchester, Massachusetts, United States of America
9
Fry AJ. Mildly Deleterious Mutations in Avian Mitochondrial DNA: Evidence from Neutrality Tests. Evolution 2017; 53:1617-1620. [PMID: 28565547 DOI: 10.1111/j.1558-5646.1999.tb05426.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/07/1998] [Accepted: 04/20/1999] [Indexed: 11/29/2022]
Abstract
To determine whether mildly deleterious mutations (MDMs) are present in nonrecombining genomes such as avian mitochondrial DNA (mtDNA), I analyzed molecular data from 14 studies using the neutrality tests of Tajima (1989a) and McDonald and Kreitman (1991). The presence of MDMs in mtDNA is inferred from trends observed across species in estimates of heterozygosity (θ and π) and by comparisons of polymorphism and divergence using the neutrality index (NI). Assuming neutrality, θ equals π and NI equals one. In this study, however, θ is greater than π more often than expected by chance, which reflects an excess of low-frequency alleles, and NI values presented here and elsewhere are consistently greater than one, which suggests an excess of nonsynonymous mutations within species (polymorphism) relative to between species (divergence). These observations suggest that, within species, there is an excess of rare haplotypes and that these haplotypes are carrying MDMs. The excess rare haplotypes may need to be accounted for when estimating population genetic parameters that assume strict neutrality.
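The three statistics compared in this abstract can be sketched as follows, assuming a small alignment of equal-length sequences with no gaps. Here π is the mean number of pairwise differences, θ is Watterson's estimator S/a_n with a_n = Σ 1/i for i = 1..n−1, and NI = (Pn/Ps)/(Dn/Ds). The function names, the toy alignment, and the counts are hypothetical and are not data from the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences (per sequence, not per site)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs)


def watterson_theta(seqs):
    """theta_W: number of segregating sites divided by a_n (per sequence)."""
    n = len(seqs)
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n


def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn/Ps) / (Dn/Ds); values above one indicate excess nonsynonymous polymorphism."""
    return (pn / ps) / (dn / ds)


# Hypothetical alignment in which every variant is a singleton.
alignment = ["AAAA", "AAAT", "AATA", "ATAA"]
pi = nucleotide_diversity(alignment)
theta = watterson_theta(alignment)
ni = neutrality_index(pn=10, ps=5, dn=4, ds=8)
print(pi, round(theta, 3), ni)
```

In this toy alignment θ exceeds π because all variants are rare, which is the direction of departure the abstract attributes to an excess of low-frequency alleles.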
Affiliation(s)
- Adam J Fry
- Department of Ecology and Evolutionary Biology, Brown University, Box G-W, Providence, Rhode Island, 02912
10
Choi JY, Aquadro CF. Recent and Long-Term Selection Across Synonymous Sites in Drosophila ananassae. J Mol Evol 2016; 83:50-60. [PMID: 27481397 DOI: 10.1007/s00239-016-9753-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 09/20/2015] [Accepted: 07/23/2016] [Indexed: 11/28/2022]
Abstract
In Drosophila, many studies have examined the short- or long-term evolution occurring across synonymous sites. Few, however, have examined both the recent and long-term evolution to gain a complete view of this selection. Here we have analyzed Drosophila ananassae DNA polymorphism and divergence data using several different methods, and have identified evidence of positive selection favoring preferred codons on both recent and long-term evolutionary time scales. Further, in D. ananassae, selection for preferred codons was stronger on the X chromosome than on the autosomes. We show that this stronger selection is not due to higher gene expression of X-linked genes. Analysis of the selectively neutral introns indicated that the X chromosome also had a preference for GC over AT nucleotides, potentially from GC-biased gene conversions (gcBGCs) that can also affect the base composition of synonymous sites. Thus selection for preferred codons and gcBGC both seem to be partially responsible for shaping the D. ananassae synonymous site evolution.
Affiliation(s)
- Jae Young Choi
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
- Charles F Aquadro
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
11
Wakeley J, King L, Wilton PR. Effects of the population pedigree on genetic signatures of historical demographic events. Proc Natl Acad Sci U S A 2016; 113:7994-8001. [PMID: 27432946 PMCID: PMC4961129 DOI: 10.1073/pnas.1601080113] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
Genetic variation among loci in the genomes of diploid biparental organisms is the result of mutation and genetic transmission through the genealogy, or population pedigree, of the species. We explore the consequences of this for patterns of variation at unlinked loci for two kinds of demographic events: the occurrence of a very large family or a strong selective sweep that occurred in the recent past. The results indicate that only rather extreme versions of such events can be expected to structure population pedigrees in such a way that unlinked loci will show deviations from the standard predictions of population genetics, which average over population pedigrees. The results also suggest that large samples of individuals and loci increase the chance of picking up signatures of these events, and that very large families may have a unique signature in terms of sample distributions of mutant alleles.
Affiliation(s)
- John Wakeley
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
- Léandra King
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
- Peter R Wilton
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
12
Matsumoto T, John A, Baeza-Centurion P, Li B, Akashi H. Codon Usage Selection Can Bias Estimation of the Fraction of Adaptive Amino Acid Fixations. Mol Biol Evol 2016; 33:1580-9. [PMID: 26873577 DOI: 10.1093/molbev/msw027] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
A growing number of molecular evolutionary studies are estimating the proportion of adaptive amino acid substitutions (α) from comparisons of ratios of polymorphic and fixed DNA mutations. Here, we examine how violations of two of the model assumptions, neutral evolution of synonymous mutations and stationary base composition, affect α estimation. We simulated the evolution of coding sequences assuming weak selection on synonymous codon usage bias and neutral protein evolution, α = 0. We show that weak selection on synonymous mutations can give polymorphism/divergence ratios that yield α-hat (estimated α) considerably larger than its true value. Nonstationary evolution (changes in population size, selection, or mutation) can exacerbate such biases or, in some scenarios, give biases in the opposite direction, α-hat < α. These results demonstrate that two factors that appear to be prevalent among taxa, weak selection on synonymous mutations and non-steady-state nucleotide composition, should be considered when estimating α. Estimates of the proportion of adaptive amino acid fixations from large-scale analyses of Drosophila melanogaster polymorphism and divergence data are positively correlated with codon usage bias. Such patterns are consistent with α-hat inflation from weak selection on synonymous mutations and/or mutational changes within the examined gene trees.
Affiliation(s)
- Tomotaka Matsumoto
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Anoop John
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Pablo Baeza-Centurion
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Boyang Li
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Hiroshi Akashi
- Division of Evolutionary Genetics, National Institute of Genetics, Yata, Mishima, Shizuoka, Japan
- Department of Genetics, The Graduate University for Advanced Studies (SOKENDAI), Yata, Mishima, Shizuoka, Japan
13
Cullingham CI, Cooke JEK, Coltman DW. Cross-species outlier detection reveals different evolutionary pressures between sister species. The New Phytologist 2014; 204:215-229. [PMID: 24942459 PMCID: PMC4260136 DOI: 10.1111/nph.12896] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 11/27/2013] [Accepted: 05/14/2014] [Indexed: 05/15/2023]
Abstract
Lodgepole pine (Pinus contorta var. latifolia) and jack pine (Pinus banksiana) hybridize in western Canada, an area of recent mountain pine beetle range expansion. Given the heterogeneity of the environment, and indications of local adaptation, there are many unknowns regarding the response of these forests to future outbreaks. To better understand this, we aim to identify genetic regions that have adaptive potential. We used data collected on 472 single nucleotide polymorphism (SNP) loci from 576 tree samples collected across 13 lodgepole pine-dominated sites and four jack pine-dominated sites. We looked at the relationship of genetic diversity with the environment, and we identified candidate loci using both frequency-based (Arlequin and BayeScan) and correlation-based (MatSAM and Bayenv) methods. We found contrasting relationships between environmental variation and genetic diversity for the species. While we identified a number of candidate outliers (34 in lodgepole pine, 25 in jack pine, and 43 interspecific loci), we did not find any loci in common between lodgepole and jack pine. Many of the outlier loci identified were correlated with environmental variation. Using rigorous criteria, we have been able to identify potential outlier SNPs. We have also found evidence of contrasting environmental adaptations between lodgepole and jack pine, which could have implications for beetle spread risk.
Affiliation(s)
- Catherine I Cullingham
- Department of Biological Sciences, University of Alberta, Biological Sciences Building, Edmonton, AB, T6G 2E9, Canada
- Janice E K Cooke
- Department of Biological Sciences, University of Alberta, Biological Sciences Building, Edmonton, AB, T6G 2E9, Canada
- David W Coltman
- Department of Biological Sciences, University of Alberta, Biological Sciences Building, Edmonton, AB, T6G 2E9, Canada
14
Charlesworth B. Stabilizing selection, purifying selection, and mutational bias in finite populations. Genetics 2013; 194:955-71. [PMID: 23709636 PMCID: PMC3730922 DOI: 10.1534/genetics.113.151555] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 03/22/2013] [Accepted: 05/18/2013] [Indexed: 12/16/2022]
Abstract
Genomic traits such as codon usage and the lengths of noncoding sequences may be subject to stabilizing selection rather than purifying selection. Mutations affecting these traits are often biased in one direction. To investigate the potential role of stabilizing selection on genomic traits, the effects of mutational bias on the equilibrium value of a trait under stabilizing selection in a finite population were investigated, using two different mutational models. Numerical results were generated using a matrix method for calculating the probability distribution of variant frequencies at sites affecting the trait, as well as by Monte Carlo simulations. Analytical approximations were also derived, which provided useful insights into the numerical results. A novel conclusion is that the scaled intensity of selection acting on individual variants is nearly independent of the effective population size over a wide range of parameter space and is strongly determined by the logarithm of the mutational bias parameter. This is true even when there is a very small departure of the mean from the optimum, as is usually the case. This implies that studies of the frequency spectra of DNA sequence variants may be unable to distinguish between stabilizing and purifying selection. A similar investigation of purifying selection against deleterious mutations was also carried out. Contrary to previous suggestions, the scaled intensity of purifying selection with synergistic fitness effects is sensitive to population size, which is inconsistent with the general lack of sensitivity of codon usage to effective population size.
Affiliation(s)
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
15
Ko WY, Rajan P, Gomez F, Scheinfeldt L, An P, Winkler CA, Froment A, Nyambo T, Omar S, Wambebe C, Ranciaro A, Hirbo J, Tishkoff S. Identifying Darwinian selection acting on different human APOL1 variants among diverse African populations. Am J Hum Genet 2013; 93:54-66. [PMID: 23768513 PMCID: PMC3710747 DOI: 10.1016/j.ajhg.2013.05.014] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Received: 01/07/2013] [Revised: 04/10/2013] [Accepted: 05/20/2013] [Indexed: 12/24/2022]
Abstract
Disease susceptibility can arise as a consequence of adaptation to infectious disease. Recent findings have suggested that higher rates of chronic kidney disease (CKD) in individuals with recent African ancestry might be attributed to two risk alleles (G1 and G2) at the serum-resistance-associated (SRA)-interacting-domain-encoding region of APOL1. These two alleles appear to have arisen adaptively, possibly as a result of their protective effects against human African trypanosomiasis (HAT), or African sleeping sickness. In order to explore the distribution of potential functional variation at APOL1, we studied nucleotide variation in 187 individuals across ten geographically and genetically diverse African ethnic groups with exposure to two Trypanosoma brucei subspecies that cause HAT. We observed unusually high levels of nonsynonymous polymorphism in the regions encoding the functional domains that are required for lysing parasites. Whereas allele frequencies of G2 were similar across all populations (3%-8%), the G1 allele was only common in the Yoruba (39%). Additionally, we identified a haplotype (termed G3) that contains a nonsynonymous change at the membrane-addressing-domain-encoding region of APOL1 and is present in all populations except for the Yoruba. Analyses of long-range patterns of linkage disequilibrium indicate evidence of recent selection acting on the G3 haplotype in Fulani from Cameroon. Our results indicate that the G1 and G2 variants in APOL1 are geographically restricted and that there might be other functional variants that could play a role in HAT resistance and CKD risk in African populations.
MESH Headings
- Adaptation, Biological
- Africa
- Alleles
- Apolipoprotein L1
- Apolipoproteins/genetics
- Black People/genetics
- Disease Resistance/genetics
- Evolution, Molecular
- Exons
- Gene Frequency
- Genetic Predisposition to Disease
- Genetics, Population/methods
- Haplotypes
- Humans
- Linkage Disequilibrium
- Lipoproteins, HDL/genetics
- Molecular Sequence Data
- Polymorphism, Single Nucleotide
- Renal Insufficiency, Chronic/ethnology
- Renal Insufficiency, Chronic/genetics
- Risk Factors
- Selection, Genetic
- Trypanosomiasis, African/ethnology
- Trypanosomiasis, African/genetics
Affiliation(s)
- Wen-Ya Ko
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, 4485-661 Vairão, Portugal
- Prianka Rajan
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Felicia Gomez
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Anthropology, Center for the Advanced Study of Hominid Paleobiology, The George Washington University, Washington, DC 20052, USA
- Laura Scheinfeldt
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Ping An
- Basic Research Laboratory, Center for Cancer Research, National Cancer Institute, Frederick National Laboratory, Science Applications International Corporation-Frederick, Frederick, MD 21702, USA
- Cheryl A. Winkler
- Basic Research Laboratory, Center for Cancer Research, National Cancer Institute, Frederick National Laboratory, Science Applications International Corporation-Frederick, Frederick, MD 21702, USA
- Alain Froment
- Unité Mixte de Recherche 208, Muséum National d’Histoire Naturelle, Institut de Recherche pour le Développement, Musée de l’Homme, 75116 Paris, France
- Thomas B. Nyambo
- Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Sabah A. Omar
- Kenya Medical Research Institute, Center for Biotechnology Research and Development, 54840-00200 Nairobi, Kenya
- Charles Wambebe
- International Biomedical Research in Africa, Kampala, Uganda
- Alessia Ranciaro
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jibril B. Hirbo
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sarah A. Tishkoff
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
|
16
|
Akashi H, Osada N, Ohta T. Weak selection and protein evolution. Genetics 2012; 192:15-31. [PMID: 22964835 PMCID: PMC3430532 DOI: 10.1534/genetics.112.140178] [Citation(s) in RCA: 92] [Impact Index Per Article: 7.1] [Received: 03/01/2012] [Accepted: 06/11/2012] [Indexed: 01/23/2023]
Abstract
The "nearly neutral" theory of molecular evolution proposes that many features of genomes arise from the interaction of three weak evolutionary forces: mutation, genetic drift, and natural selection acting at its limit of efficacy. Such forces generally have little impact on allele frequencies within populations from generation to generation but can have substantial effects on long-term evolution. The evolutionary dynamics of weakly selected mutations are highly sensitive to population size, and near neutrality was initially proposed as an adjustment to the neutral theory to account for general patterns in available protein and DNA variation data. Here, we review the motivation for the nearly neutral theory, discuss the structure of the model and its predictions, and evaluate current empirical support for interactions among weak evolutionary forces in protein evolution. Near neutrality may be a prevalent mode of evolution across a range of functional categories of mutations and taxa. However, multiple evolutionary mechanisms (including adaptive evolution, linked selection, changes in fitness-effect distributions, and weak selection) can often explain the same patterns of genome variation. Strong parameter sensitivity remains a limitation of the nearly neutral model, and we discuss concave fitness functions as a plausible underlying basis for weak selection.
Affiliation(s)
- Hiroshi Akashi
- Division of Evolutionary Genetics, Department of Population Genetics, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan.
|
17
|
Clemente F, Vogl C. Unconstrained evolution in short introns? - an analysis of genome-wide polymorphism and divergence data from Drosophila. J Evol Biol 2012; 25:1975-1990. [PMID: 22901008 DOI: 10.1111/j.1420-9101.2012.02580.x] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 04/20/2012] [Revised: 06/15/2012] [Accepted: 06/22/2012] [Indexed: 12/23/2022]
Abstract
An unconstrained reference sequence facilitates the detection of selection. In Drosophila, sequence variation in short introns seems to be least influenced by selection and dominated by mutation and drift. Here, we test this with genome-wide sequences using an African population (Malawi) of D. melanogaster and data from the related outgroup species D. simulans, D. sechellia, D. erecta and D. yakuba. The distribution of mutations deviates from equilibrium, and the content of A and T (AT) nucleotides shows an excess of variance among introns. We explain this by a complex mutational pattern: a shift in mutational bias towards AT, leading to a slight nonequilibrium in base composition and context-dependent mutation rates, with G or C (GC) sites mutating most frequently in AT-rich introns. By comparing the corresponding allele frequency spectra of AT-rich vs. GC-rich introns, we can rule out the influence of directional selection or biased gene conversion on the mutational pattern. Compared with neutral equilibrium expectations, polymorphism spectra show an excess of low frequency and a paucity of intermediate frequency variants, irrespective of the direction of mutation. Combining the information from different outgroups with the polymorphism data and using a generalized linear model, we find evidence for shared ancestral polymorphism between D. melanogaster and D. simulans, D. sechellia, arguing against a bottleneck in D. melanogaster. Generally, we find that short introns can be used as a neutral reference on a genome-wide level, if the spatially and temporally varying mutational pattern is accounted for.
Affiliation(s)
- F Clemente
- Institute of Population Genetics, Veterinärmedizinische Universität Wien, Vienna, Austria
- C Vogl
- Institute of Animal Breeding and Genetics, Veterinärmedizinische Universität Wien, Vienna, Austria
|
18
|
Bazykin GA, Kondrashov AS. Major role of positive selection in the evolution of conservative segments of Drosophila proteins. Proc Biol Sci 2012; 279:3409-17. [PMID: 22673359 PMCID: PMC3396909 DOI: 10.1098/rspb.2012.0776] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 01/14/2023]
Abstract
Slow evolution of conservative segments of coding and non-coding DNA is caused by the action of negative selection, which removes new mutations. However, the mode of selection that affects the few substitutions that do occur within such segments remains unclear. Here, we show that the fraction of allele replacements that were driven by positive selection, and the strength of this selection, is the highest within the conservative segments of Drosophila protein-coding genes. The McDonald–Kreitman test, applied to the data on variation in Drosophila melanogaster and in Drosophila simulans, indicates that within the most conservative protein segments, approximately 72 per cent (approx. 80%) of allele replacements were driven by positive selection, as opposed to only approximately 44 per cent (approx. 53%) at rapidly evolving segments. Data on multiple non-synonymous substitutions at a codon lead to the same conclusion and additionally indicate that positive selection driving allele replacements at conservative sites is the strongest, as it accelerates evolution by a factor of approximately 40, as opposed to a factor of approximately 5 at rapidly evolving sites. Thus, random drift plays only a minor role in the evolution of conservative DNA segments, and those relatively rare allele replacements that occur within such segments are mostly driven by substantial positive selection.
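The McDonald–Kreitman comparison referred to in the abstract above can be sketched with the widely used point estimator of the fraction of adaptive substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). This is a minimal illustration and not the authors' pipeline; the function name `mk_alpha` and the counts in the usage line are hypothetical.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Estimate the fraction of nonsynonymous substitutions driven by positive selection.

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for alpha to be defined")
    return 1.0 - (ds * pn) / (dn * ps)

# Hypothetical counts, chosen only to show the arithmetic.
print(mk_alpha(dn=40, ds=100, pn=10, ps=100))
```

With these made-up counts the ratio of nonsynonymous to synonymous changes is four times higher in divergence than in polymorphism, so the estimator returns 0.75. Slightly deleterious polymorphisms bias this simple estimator downward, which is why extensions that model the distribution of fitness effects exist.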
Affiliation(s)
- Georgii A Bazykin
- Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Vorbyevy Gory 1-73, Moscow 119992, Russia
|
19
|
Burgarella C, Navascués M, Zabal-Aguirre M, Berganzo E, Riba M, Mayol M, Vendramin GG, González-Martínez SC. Recent population decline and selection shape diversity of taxol-related genes. Mol Ecol 2012; 21:3006-21. [DOI: 10.1111/j.1365-294x.2012.05532.x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 12/29/2022]
|
20
|
Whittle CA, Sun Y, Johannesson H. Genome-wide selection on codon usage at the population level in the fungal model organism Neurospora crassa. Mol Biol Evol 2012; 29:1975-86. [PMID: 22334579 DOI: 10.1093/molbev/mss065] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 12/26/2022]
Abstract
Many organisms exhibit biased codon usage in their genome, including the fungal model organism Neurospora crassa. The preferential use of subset of synonymous codons (optimal codons) at the macroevolutionary level is believed to result from a history of selection to promote translational efficiency. At present, few data are available about selection on optimal codons at the microevolutionary scale, that is, at the population level. Herein, we conducted a large-scale assessment of codon mutations at biallelic sites, spanning more than 5,100 genes, in 2 distinct populations of N. crassa: the Caribbean and Louisiana populations. Based on analysis of the frequency spectra of synonymous codon mutations at biallelic sites, we found that derived (nonancestral) optimal codon mutations segregate at a higher frequency than derived nonoptimal codon mutations in each population; this is consistent with natural selection favoring optimal codons. We also report that optimal codon variants were less frequent in longer genes and that the fixation of optimal codons was reduced in rapidly evolving long genes/proteins, trends suggestive of genetic hitchhiking (Hill-Robertson) altering codon usage variation. Notably, nonsynonymous codon mutations segregated at a lower frequency than synonymous nonoptimal codon mutations (which impair translational efficiency) in each N. crassa population, suggesting that changes in protein composition are more detrimental to fitness than mutations altering translation. Overall, the present data demonstrate that selection, and partly genetic interference, shapes codon variation across the genome in N. crassa populations.
Affiliation(s)
- C A Whittle
- Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden
|
21
|
Abstract
Populations evolve as mutations arise in individual organisms and, through hereditary transmission, may become "fixed" (shared by all individuals) in the population. Most mutations are lethal or have negative fitness consequences for the organism. Others have essentially no effect on organismal fitness and can become fixed through the neutral stochastic process known as random drift. However, mutations may also produce a selective advantage that boosts their chances of reaching fixation. Regions of genes where new mutations are beneficial, rather than neutral or deleterious, tend to evolve more rapidly due to positive selection. Genes involved in immunity and defense are a well-known example; rapid evolution in these genes presumably occurs because new mutations help organisms to prevail in evolutionary "arms races" with pathogens. In recent years, genome-wide scans for selection have enlarged our understanding of the evolution of the protein-coding regions of the various species. In this chapter, we focus on the methods to detect selection in protein-coding genes. In particular, we discuss probabilistic models and how they have changed with the advent of new genome-wide data now available.
|
22
|
Keays MC, Barker D, Wicker-Thomas C, Ritchie MG. Signatures of selection and sex-specific expression variation of a novel duplicate during the evolution of the Drosophila desaturase gene family. Mol Ecol 2011; 20:3617-30. [PMID: 21801259 DOI: 10.1111/j.1365-294x.2011.05208.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/29/2022]
Abstract
The tempo and mode of evolution of loci with a large effect on adaptation and reproductive isolation will influence the rate of evolutionary divergence and speciation. Desaturase loci are involved in key biochemical changes in long-chain fatty acids. In insects, these have been shown to influence adaptation to starvation or desiccation resistance and in some cases act as important pheromones. The desaturase gene family of Drosophila is known to have evolved by gene duplication and diversification, and at least one locus shows rapid evolution of sex-specific expression variation. Here, we examine the evolution of the gene family in species representing the Drosophila phylogeny. We find that the family includes more loci than have been previously described. Most are represented as single-copy loci, but we also find additional examples of duplications in loci which influence pheromone blends. Most loci show patterns of variation associated with purifying selection, but there are strong signatures of diversifying selection in new duplicates. In the case of a new duplicate of desat1 in the obscura group species, we show that strong selection on the coding sequence is associated with the evolution of sex-specific expression variation. It seems likely that both sexual selection and ecological adaptation have influenced the evolution of this gene family in Drosophila.
Affiliation(s)
- Maria C Keays
- Centre for Evolution, Genes and Genomics, School of Biology, University of St. Andrews, St. Andrews, Fife, UK
|
23
|
Waldman YY, Tuller T, Keinan A, Ruppin E. Selection for translation efficiency on synonymous polymorphisms in recent human evolution. Genome Biol Evol 2011; 3:749-61. [PMID: 21803767 PMCID: PMC3163469 DOI: 10.1093/gbe/evr076] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Indexed: 01/25/2023]
Abstract
Synonymous mutations are considered to be "silent" as they do not affect protein sequence. However, different silent codons have different translation efficiency (TE), which raises the question to what extent such mutations are really neutral. We perform the first genome-wide study of natural selection operating on TE in recent human evolution, surveying 13,798 synonymous single nucleotide polymorphisms (SNPs) in 1,198 unrelated individuals from 11 populations. We find evidence for both negative and positive selection on TE, as measured based on differentiation in allele frequencies between populations. Notably, the likelihood of an SNP to be targeted by positive or negative selection is correlated with the magnitude of its effect on the TE of the corresponding protein. Furthermore, negative selection acting against changes in TE is more marked in highly expressed genes, highly interacting proteins, complex members, and regulatory genes. It is also more common in functional regions and in the initial segments of highly expressed genes. Positive selection targeting sites with a large effect on TE is stronger in lowly interacting proteins and in regulatory genes. Similarly, essential genes are enriched for negative TE selection while underrepresented for positive TE selection. Taken together, these results point to the significant role of TE as a selective force operating in humans and hence underscore the importance of considering silent SNPs in interpreting associations with complex human diseases. Testifying to this potential, we describe two synonymous SNPs that may have clinical implications in phenylketonuria and in Best's macular dystrophy due to TE differences between alleles.
Affiliation(s)
- Yedael Y Waldman
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
|
24
|
Abstract
Summary: Population genomics is the study of the amount and causes of genome-wide variability in natural populations, a topic that has been under discussion since Darwin. This paper first briefly reviews the early development of molecular approaches to the subject: the pioneering unbiased surveys of genetic variability at multiple loci by means of gel electrophoresis and restriction enzyme mapping. The results of surveys of levels of genome-wide variability using DNA resequencing studies are then discussed. Studies of the extent to which variability for different classes of variants (non-synonymous, synonymous and non-coding) are affected by natural selection, or other directional forces such as biased gene conversion, are also described. Finally, the effects of deleterious mutations on population fitness and the possible role of Hill–Robertson interference in shaping patterns of sequence variability are discussed.
|
25
|
Levenstien MA, Klein RJ. Predicting functionally important SNP classes based on negative selection. BMC Bioinformatics 2011; 12:26. [PMID: 21247465 PMCID: PMC3033802 DOI: 10.1186/1471-2105-12-26] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/11/2010] [Accepted: 01/19/2011] [Indexed: 01/20/2023]
Abstract
Background: With the advent of cost-effective genotyping technologies, genome-wide association studies allow researchers to examine hundreds of thousands of single nucleotide polymorphisms (SNPs) for association with human disease. Recently, many researchers applying this strategy have detected strong associations to disease with SNP markers that are either not in linkage disequilibrium with any nonsynonymous SNP or large distances from any annotated gene. In such cases, no well-established standard practice for effective SNP selection for follow-up studies exists. We aim to identify and prioritize groups of SNPs that are more likely to affect phenotypes in order to facilitate efficient SNP selection for follow-up studies. Results: Based on the annotations available in the Ensembl database, we categorized SNPs in the human genome into classes related to regulatory attributes, such as epigenetic modifications and transcription factor binding sites, in addition to classes related to gene structure and cross-species conservation. Using the distribution of derived allele frequencies (DAF) within each class, we assessed the strength of natural selection for each class relative to the genome as a whole. We applied this DAF analysis to Perlegen resequenced SNPs genome-wide. Regulatory elements annotated by Ensembl such as specific histone methylation sites as well as classes defined by cross-species conservation showed negative selection in comparison to the genome as a whole. Conclusions: These results highlight which annotated classes are under purifying selection, have putative functional importance, and contain SNPs that are strong candidates for follow-up studies after genome-wide association. Such SNP annotation may also be useful in interpreting results of whole-genome sequencing studies.
Affiliation(s)
- Mark A Levenstien
- Program in Cancer Biology and Genetics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
|
26
|
Codoñer FM, Alfonso-Loeches S, Fares MA. Mutational dynamics of murine angiogenin duplicates. BMC Evol Biol 2010; 10:310. [PMID: 20950426 PMCID: PMC2964713 DOI: 10.1186/1471-2148-10-310] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 05/28/2010] [Accepted: 10/15/2010] [Indexed: 12/03/2022]
Abstract
Background: Angiogenin (Ang) is a protein involved in angiogenesis by inducing the formation of blood vessels. The biomedical importance of this protein has come from findings linking mutations in Ang to cancer progression and neurodegenerative diseases. These findings highlight the evolutionary constraint on the Ang amino acid sequence. However, previous studies comparing human Angiogenin with homologs from other phylogenetically related organisms have led to the conclusion that Ang presents a striking variability. Whether this variability has an adaptive value per se remains elusive. Understanding why many functional Ang paralogs have been preserved in mouse and rat and identifying functional divergence mutations at these copies may explain the relationship between mutations and function. In spite of the importance of testing this hypothesis from the evolutionary and biomedical perspectives, this has not yet been accomplished. Here we test the main mutational dynamics driving the evolution and function of Ang paralogs in mammals. Results: We analysed the phylogenetic asymmetries between the different Ang gene copies in mouse and rat in the context of vertebrate Ang phylogeny. This analysis shows strong evidence in support of accelerated evolution in some Ang murine copies (mAng). This acceleration is not due to non-functionalisation because constraints on amino acid replacements remain strong. We identify many of the amino acid sites involved in signal localization and nucleotide binding by Ang to have evolved under diversifying selection. Compensatory effects of many of the mutations at these paralogs and their key structural location in or nearby important functional regions support a possible functional shift (functional divergence) in many Ang copies. Similarities between 3D-structural models for mAng copies suggest that their divergence is mainly functional.
Conclusions: We identify the main evolutionary dynamics shaping the variability of Angiogenin in vertebrates and highlight the plasticity of this protein after gene duplication. Our results suggest functional divergence among mAng paralogs. This puts forward mAng as a good candidate system for testing the functional plasticity of such an important protein, while stressing caution when using mouse as a model to infer the consequences of mutations in the single Ang copy of humans.
Affiliation(s)
- Francisco M Codoñer
- Evolutionary Genetics and Bioinformatics Laboratory, Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland
|
27
|
Amei A, Sawyer S. A time-dependent Poisson random field model for polymorphism within and between two related biological species. Ann Appl Probab 2010. [DOI: 10.1214/09-aap668] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
|
28
|
Estimating the parameters of selection on nonsynonymous mutations in Drosophila pseudoobscura and D. miranda. Genetics 2010; 185:1381-96. [PMID: 20516497 DOI: 10.1534/genetics.110.117614] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Indexed: 11/18/2022]
Abstract
We present the results of surveys of diversity in sets of >40 X-linked and autosomal loci in samples from natural populations of Drosophila miranda and D. pseudoobscura, together with their sequence divergence from D. affinis. Mean silent site diversity in D. miranda is approximately one-quarter of that in D. pseudoobscura; mean X-linked silent diversity is about three-quarters of that for the autosomes in both species. Estimates of the distribution of selection coefficients against heterozygous, deleterious nonsynonymous mutations from two different methods suggest a wide distribution, with coefficients of variation greater than one, and with the average segregating amino acid mutation being subject to only very weak selection. Only a small fraction of new amino acid mutations behave as effectively neutral, however. A large fraction of amino acid differences between D. pseudoobscura and D. affinis appear to have been fixed by positive natural selection, using three different methods of estimation; estimates between D. miranda and D. affinis are more equivocal. Sources of bias in the estimates, especially those arising from selection on synonymous mutations and from the choice of genes, are discussed and corrections for these applied. Overall, the results show that both purifying selection and positive selection on nonsynonymous mutations are pervasive.
|
29
|
Katzman S, Kern AD, Pollard KS, Salama SR, Haussler D. GC-biased evolution near human accelerated regions. PLoS Genet 2010; 6:e1000960. [PMID: 20502635 PMCID: PMC2873926 DOI: 10.1371/journal.pgen.1000960] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 11/10/2009] [Accepted: 04/20/2010] [Indexed: 12/30/2022]
Abstract
Regions of the genome that have been the target of positive selection specifically along the human lineage are of special importance in human biology. We used high throughput sequencing combined with methods to enrich human genomic samples for particular targets to obtain the sequence of 22 chromosomal samples at high depth in 40 kb neighborhoods of 49 previously identified 100–400 bp elements that show evidence for human accelerated evolution. In addition to selection, the pattern of nucleotide substitutions in several of these elements suggested an historical bias favoring the conversion of weak (A or T) alleles into strong (G or C) alleles. Here we found strong evidence in the derived allele frequency spectra of many of these 40 kb regions for ongoing weak-to-strong fixation bias. Comparison of the nucleotide composition at polymorphic loci to the composition at sites of fixed substitutions additionally reveals the signature of historical weak-to-strong fixation bias in a subset of these regions. Most of the regions with evidence for historical bias do not also have signatures of ongoing bias, suggesting that the evolutionary forces generating weak-to-strong bias are not constant over time. To investigate the role of selection in shaping these regions, we analyzed the spatial pattern of polymorphism in our samples. We found no significant evidence for selective sweeps, possibly because the signal of such sweeps has decayed beyond the power of our tests to detect them. Together, these results do not rule out functional roles for the observed changes in these regions—indeed there is good evidence that the first two are functional elements in humans—but they suggest that a fixation process (such as biased gene conversion) that is biased at the nucleotide level, but is otherwise selectively neutral, could be an important evolutionary force at play in them, both historically and at present. 
The search for functional regions in the human genome, beyond the protein-coding portion, often relies on signals of conservation across species. The Human Accelerated Regions (HARs) are strongly conserved elements, ranging in size from 100–400 bp, that show an unexpected number of human-specific changes. This pattern suggests that HARs may be functional elements that have significantly changed during human evolution. To analyze the evolutionary forces that led these changes, we studied 40 kb neighborhoods of the top 49 HARs. We took advantage of recently developed DNA sequencing technology, coupled with methods to isolate genomic DNA for our target regions only, to determine the genotypes in 22 chromosomal samples. This polymorphism data showed no significant evidence for adaptive selective sweeps in HAR regions. By contrast, we found strong evidence for a nucleotide bias in the fixation of mutations from A or T to G or C basepairs. Our work reveals that this bias in the HAR neighborhoods is not just an historic phenomenon, but is ongoing in the present day human population. This finding adds credence to the possibility that non-selective forces, such as biased gene conversion, could have contributed to the evolution of several of these regions.
Affiliation(s)
- Sol Katzman
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - Andrew D. Kern
- Department of Biological Sciences, Dartmouth College, Hanover, New Hampshire, United States of America
| | - Katherine S. Pollard
- Gladstone Institutes, University of California San Francisco, San Francisco, California, United States of America
| | - Sofie R. Salama
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, California, United States of America
- David Haussler
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, California, United States of America
|
30
|
Gossmann TI, Song BH, Windsor AJ, Mitchell-Olds T, Dixon CJ, Kapralov MV, Filatov DA, Eyre-Walker A. Genome wide analyses reveal little evidence for adaptive evolution in many plant species. Mol Biol Evol 2010; 27:1822-32. [PMID: 20299543 DOI: 10.1093/molbev/msq079] [Citation(s) in RCA: 183] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open
Abstract
The relative contribution of advantageous and neutral mutations to the evolutionary process is a central problem in evolutionary biology. Current estimates suggest that whereas Drosophila, mice, and bacteria have undergone extensive adaptive evolution, hominids show little or no evidence of adaptive evolution in protein-coding sequences. This may be a consequence of differences in effective population size. To study the matter further, we have investigated whether plants show evidence of adaptive evolution using an extension of the McDonald-Kreitman test that explicitly models slightly deleterious mutations by estimating the distribution of fitness effects of new mutations. We apply this method to data from nine pairs of species. Altogether more than 2,400 loci with an average length of approximately 280 nucleotides were analyzed. We observe very similar results in all species; we find little evidence of adaptive amino acid substitution in any comparison except sunflowers. This may be because many plant species have modest effective population sizes.
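For orientation, the classical McDonald-Kreitman quantities that this kind of analysis builds on can be sketched as follows. This is only the basic test, not the extension used in the abstract above, which additionally models slightly deleterious mutations through a distribution of fitness effects. The function names and the counts in the usage note are hypothetical and chosen purely for illustration.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman estimate of the proportion of adaptive
    nonsynonymous substitutions: alpha = 1 - (Ds * Pn) / (Dn * Ps).

    dn, ds: nonsynonymous and synonymous fixed differences (divergence)
    pn, ps: nonsynonymous and synonymous polymorphisms
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


def neutrality_index(dn, ds, pn, ps):
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds); NI < 1 suggests an
    excess of nonsynonymous divergence, NI > 1 an excess of
    nonsynonymous polymorphism."""
    if dn == 0 or ds == 0 or ps == 0:
        raise ValueError("NI is undefined when Dn, Ds or Ps is zero")
    return (pn / ps) / (dn / ds)
```

With illustrative (made-up) counts Dn = 20, Ds = 40, Pn = 10, Ps = 40, `mk_alpha` returns 0.5 and `neutrality_index` returns 0.5. Slightly deleterious polymorphisms inflate Pn and bias this simple estimate downward, which is the motivation for the extended method.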
Affiliation(s)
- Toni I Gossmann
- Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Brighton, United Kingdom
|
31
|
Lee S, Costanzo S, Jia Y, Olsen KM, Caicedo AL. Evolutionary dynamics of the genomic region around the blast resistance gene Pi-ta in AA genome Oryza species. Genetics 2009; 183:1315-25. [PMID: 19822730 PMCID: PMC2787423 DOI: 10.1534/genetics.109.108266] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2009] [Accepted: 10/03/2009] [Indexed: 11/18/2022] Open
Abstract
The race-specific resistance gene Pi-ta has been effectively used to control blast disease, one of the most destructive plant diseases worldwide. A single amino acid change at the 918 position of the Pi-ta protein was known to determine resistance specificity. To understand the evolutionary dynamics present, we examined sequences of the Pi-ta locus and its flanking regions in 159 accessions composed of seven AA genome Oryza species: O. sativa, O. rufipogon, O. nivara, O. meridionalis, O. glaberrima, O. barthii, and O. glumaepatula. A 3364-bp fragment encoding a predicted transposon was found in the proximity of the Pi-ta promoter region associated with the resistance phenotype. Haplotype network analysis with 33 newly identified Pi-ta haplotypes and 18 newly identified Pi-ta protein variants demonstrated the evolutionary relationships of Pi-ta haplotypes between O. sativa and O. rufipogon. In O. rufipogon, the recent directional selection was found in the Pi-ta region, while significant deviation from neutral evolution was not found in all O. sativa groups. Results of sequence variation in flanking regions around Pi-ta in O. sativa suggest that the size of the resistant Pi-ta introgressed block was at least 5.4 Mb in all elite resistant cultivars but not in the cultivars without Pi-ta. These findings demonstrate that the Pi-ta region with transposon and additional plant modifiers has evolved under an extensive selection pressure during crop breeding.
Affiliation(s)
- Seonghee Lee
- Rice Research and Extension Center, University of Arkansas, Stuttgart, Arkansas 72160, U. S. Department of Agriculture–Agricultural Research Service, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas 72160, Department of Biology, Washington University, St. Louis, Missouri 63130 and Department of Biology, University of Massachusetts, Amherst, Massachusetts 01003
- Stefano Costanzo
- Rice Research and Extension Center, University of Arkansas, Stuttgart, Arkansas 72160, U. S. Department of Agriculture–Agricultural Research Service, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas 72160, Department of Biology, Washington University, St. Louis, Missouri 63130 and Department of Biology, University of Massachusetts, Amherst, Massachusetts 01003
- Yulin Jia
- Rice Research and Extension Center, University of Arkansas, Stuttgart, Arkansas 72160, U. S. Department of Agriculture–Agricultural Research Service, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas 72160, Department of Biology, Washington University, St. Louis, Missouri 63130 and Department of Biology, University of Massachusetts, Amherst, Massachusetts 01003
- Kenneth M. Olsen
- Rice Research and Extension Center, University of Arkansas, Stuttgart, Arkansas 72160, U. S. Department of Agriculture–Agricultural Research Service, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas 72160, Department of Biology, Washington University, St. Louis, Missouri 63130 and Department of Biology, University of Massachusetts, Amherst, Massachusetts 01003
- Ana L. Caicedo
- Rice Research and Extension Center, University of Arkansas, Stuttgart, Arkansas 72160, U. S. Department of Agriculture–Agricultural Research Service, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas 72160, Department of Biology, Washington University, St. Louis, Missouri 63130 and Department of Biology, University of Massachusetts, Amherst, Massachusetts 01003
|
32
|
Lu B, Wang N, Xiao J, Xu Y, Murphy RW, Huang D. Expression and evolutionary divergence of the non-conventional olfactory receptor in four species of fig wasp associated with one species of fig. BMC Evol Biol 2009; 9:43. [PMID: 19232102 PMCID: PMC2661049 DOI: 10.1186/1471-2148-9-43] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2008] [Accepted: 02/20/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The interactions of fig wasps and their host figs provide a model for investigating co-evolution. Fig wasps have specialized morphological characters and lifestyles thought to be adaptations to living in the fig's syconium. Although these aspects of natural history are well documented, the genetic mechanism(s) underlying these changes remain(s) unknown. Fig wasp olfaction is the key to host-specificity. The Or83b gene class, an unusual member of olfactory receptor family, plays a critical role in enabling the function of conventional olfactory receptors. Four Or83b orthologous genes from one pollinator (PFW) (Ceratosolen solmsi) and three non-pollinator fig wasps (NPFWs) (Apocrypta bakeri, Philotrypesis pilosa and Philotrypesis sp.) associated with one species of fig (Ficus hispida) can be used to better understand the molecular mechanism underlying the fig wasp's adaptation to its host. We made a comparison of spatial tissue-specific expression patterns and substitution rates of one orthologous gene in these fig wasps and sought evidence for selection pressures. RESULTS A newly identified Or83b orthologous gene was named Or2. Expressions of Or2 were restricted to the heads of all wingless male fig wasps, which usually live in the dark cavity of a fig throughout their life cycle. However, expressions were widely detected in the antennae, legs and abdomens of all female fig wasps that fly from one fig to another for oviposition, and secondarily pollination. Weak expression was also observed in the thorax of PFWs. Compared with NPFWs, the Or2 gene in C. solmsi had an elevated rate of substitutions and lower codon usage. Analyses using Tajima's D, Fu and Li's D* and F* tests indicated a non-neutral pattern of nucleotide variation in all fig wasps. Unlike in NPFWs, this non-neutral pattern was also observed for synonymous sites of Or2 within PFWs. 
CONCLUSION The sex- and species-specific expression patterns of Or2 genes detected beyond the known primary olfactory tissues indicates the location of cryptic olfactory inputs. The specialized ecological niche of these wasps explains the unique habits and adaptive evolution of Or2 genes. The Or2 gene in C. solmsi is evolving very rapidly. Negative deviation from the neutral model of evolution reflects possible selection pressures acting on Or2 sequences of fig wasp, particularly on PFWs who are more host-specific to figs.
Affiliation(s)
- Bin Lu
- College of Plant Protection, Shandong Agricultural University, Tai'an, Shandong 271018, PR China.
|
33
|
Morton BR, Dar VUN, Wright SI. Analysis of site frequency spectra from Arabidopsis with context-dependent corrections for ancestral misinference. PLANT PHYSIOLOGY 2009; 149:616-624. [PMID: 19019983 PMCID: PMC2633827 DOI: 10.1104/pp.108.127787] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/08/2008] [Accepted: 11/12/2008] [Indexed: 05/27/2023]
Abstract
Previous studies have shown that the pattern of single nucleotide polymorphism (SNP) in Arabidopsis (Arabidopsis thaliana) deviates from the distribution expected under a neutral model. Here, we test whether or not ancestral misinference could explain this deviation. We start by showing that there are significant and complex influences of context on mutation dynamics as inferred from SNP frequency, in Arabidopsis, and compare the results to observations about context dependency that have been made on a previous analysis of a maize (Zea mays) SNP dataset. The data concerning heterogeneity across sites are then used to make corrections for ancestral misinference in a context-dependent manner. Using Arabidopsis lyrata to infer the ancestral state for SNPs, we show that the resulting unfolded site frequency spectrum (SFS) in Arabidopsis is skewed toward sites with high frequency derived nucleotides. Sites are also partitioned into two general functional classes, second codon position and 4-fold degenerate sites. These two classes show different SFS; although both show an overrepresentation of high frequency derived sites, low frequency derived sites are vastly overrepresented at the second codon position, but significantly underrepresented at 4-fold degenerate sites. We find that these results are robust to corrections for ancestral misinference, even when context-dependent variation in mutation properties is taken into consideration. The data suggest that, in addition to purifying selection, complex demographic events and/or linked positive selection need to be invoked to explain the SFS, and they highlight the importance of sequence context in analyses of genome-wide variation.
Affiliation(s)
- Brian R Morton
- Department of Biological Science, Barnard College, Columbia University, New York, New York 10027, USA.
|
34
|
Molecular Coevolution and the Three-Dimensionality of Natural Selection. Evol Biol 2009. [DOI: 10.1007/978-3-642-00952-5_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
35
|
Abstract
Neutralism and selectionism are extremes of an explanatory spectrum for understanding patterns of molecular evolution and the emergence of evolutionary innovation. Although recent genome-scale data from protein-coding genes argue against neutralism, molecular engineering and protein evolution data argue that neutral mutations and mutational robustness are important for evolutionary innovation. Here I propose a reconciliation in which neutral mutations prepare the ground for later evolutionary adaptation. Key to this perspective is an explicit understanding of molecular phenotypes that has only become accessible in recent years.
|
36
|
Wright SI, Andolfatto P. The Impact of Natural Selection on the Genome: Emerging Patterns in Drosophila and Arabidopsis. ANNUAL REVIEW OF ECOLOGY EVOLUTION AND SYSTEMATICS 2008. [DOI: 10.1146/annurev.ecolsys.39.110707.173342] [Citation(s) in RCA: 87] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Affiliation(s)
- Stephen I. Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks St., Toronto, Ontario, M5S 3B2 Canada
- Peter Andolfatto
- Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544
|
37
|
Abstract
The distribution of genetic polymorphisms in a population contains information about evolutionary processes. The Poisson random field (PRF) model uses the polymorphism frequency spectrum to infer the mutation rate and the strength of directional selection. The PRF model relies on an infinite-sites approximation that is reasonable for most eukaryotic populations, but that becomes problematic when θ is large (θ ≳ 0.05). Here, we show that at large mutation rates characteristic of microbes and viruses the infinite-sites approximation of the PRF model induces systematic biases that lead it to underestimate negative selection pressures and mutation rates and erroneously infer positive selection. We introduce two new methods that extend our ability to infer selection pressures and mutation rates at large θ: a finite-site modification of the PRF model and a new technique based on diffusion theory. Our methods can be used to infer not only a "weighted average" of selection pressures acting on a gene sequence, but also the distribution of selection pressures across sites. We evaluate the accuracy of our methods, as well as that of the original PRF approach, by comparison with Wright-Fisher simulations.
|
38
|
Zhai W, Nielsen R, Slatkin M. An investigation of the statistical power of neutrality tests based on comparative and population genetic data. Mol Biol Evol 2008; 26:273-83. [PMID: 18922762 DOI: 10.1093/molbev/msn231] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022] Open
Abstract
In this report, we investigate the statistical power of several tests of selective neutrality based on patterns of genetic diversity within and between species. The goal is to compare tests based solely on population genetic data with tests using comparative data or a combination of comparative and population genetic data. We show that in the presence of repeated selective sweeps on relatively neutral background, tests based on the d(N)/d(S) ratios in comparative data almost always have more power to detect selection than tests based on population genetic data, even if the overall level of divergence is low. Tests based solely on the distribution of allele frequencies or the site frequency spectrum, such as the Ewens-Watterson test or Tajima's D, have less power in detecting both positive and negative selection because of the transient nature of positive selection and the weak signal left by negative selection. The Hudson-Kreitman-Aguadé test is the most powerful test for detecting positive selection among the population genetic tests investigated, whereas McDonald-Kreitman test typically has more power to detect negative selection. We discuss our findings in the light of the discordant results obtained in several recently published genomic scans.
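As a concrete illustration of one of the frequency-spectrum tests named above, the following is a minimal sketch of Tajima's D computed from a site frequency spectrum, using the standard constants from Tajima's definition. The function name and the input layout (a list of site counts indexed by allele count) are assumptions made for this example; it is not the power analysis performed in the study.

```python
import math


def tajimas_d(sfs, n):
    """Tajima's D from a site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which one allele
    is carried by i of the n sampled chromosomes (i = 1 .. n - 1).
    """
    if len(sfs) != n - 1:
        raise ValueError("sfs must have n - 1 entries")
    s = sum(sfs)  # number of segregating sites
    if s == 0:
        raise ValueError("no segregating sites")

    # Mean pairwise difference: a site with allele count i contributes
    # i * (n - i) differing pairs out of n * (n - 1) / 2 pairs.
    pairs = n * (n - 1) / 2
    pi = sum(c * i * (n - i) for i, c in enumerate(sfs, start=1)) / pairs

    # Constants of the variance term.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimators of theta, standardised.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

A spectrum proportional to 1/i (the neutral expectation) gives D close to zero, an excess of rare variants gives negative D, and an excess of intermediate-frequency variants gives positive D.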
Affiliation(s)
- Weiwei Zhai
- Department of Integrative Biology, University of California, Berkeley, USA.
|
39
|
Haddrill PR, Charlesworth B. Non-neutral processes drive the nucleotide composition of non-coding sequences in Drosophila. Biol Lett 2008; 4:438-41. [PMID: 18505714 PMCID: PMC2515589 DOI: 10.1098/rsbl.2008.0174] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open
Abstract
The nature of the forces affecting base composition is a key question in genome evolution. There is uncertainty as to whether differences in the GC contents of non-coding sequences reflect differences in mutational bias, or in the intensity of selection or biased gene conversion. We have used a polymorphism dataset for non-coding sequences on the X chromosome of Drosophila simulans to examine this question. The proportion of GC→AT versus AT→GC polymorphic mutations in a locus is correlated with its GC content. This implies the action of forces that favour GC over AT base pairs, which are apparently strongest in GC-rich sequences.
Affiliation(s)
- Penelope R Haddrill
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Ashworth Laboratories, King's Buildings, Edinburgh, UK.
|
40
|
Palmé AE, Wright M, Savolainen O. Patterns of divergence among conifer ESTs and polymorphism in Pinus sylvestris identify putative selective sweeps. Mol Biol Evol 2008; 25:2567-77. [PMID: 18775901 DOI: 10.1093/molbev/msn194] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
Finding genes that are under positive selection is a difficult task, especially in non-model organisms. Here, we have analyzed expressed sequence tag (EST) data from 4 species (Pinus pinaster, Pinus taeda, Picea glauca, and Pseudotsuga menziesii) to investigate selection patterns during their evolution and to identify genes likely to be under positive selection. To confirm selection, population samples of these genes have been sequenced in Pinus sylvestris, a species that was not included in the EST data set. The estimates of branch-specific Ka/Ks (nonsynonymous/synonymous substitution rates) across all genes in the EST data set were similar or smaller than estimates from other higher plant species. There was no evidence for the traditional indication of positive selection, Ka/Ks above 1. However, several lines of evidence based on polymorphism patterns suggest that genes with high Ka/Ks (0.20-0.52) in the EST data set are in fact more affected by positive selection in P. sylvestris than genes with low Ka/Ks (0.01-0.04). The high Ka/Ks genes have a lower level of polymorphism and more negative Tajima's D than the low Ka/Ks genes. Further, in the high Ka/Ks group, the Hudson-Kreitman-Aguade test is significant. This suggests that the EST data set is a good starting point for finding genes under positive selection in conifers and that even moderate Ka/Ks values could be indicative of selection. A group of 5 genes with high Ka/Ks collectively show evidence for positive selection within P. sylvestris.
Affiliation(s)
- Anna E Palmé
- Department of Evolutionary Functional Genomics, Uppsala University, S-75236 Uppsala, Sweden.
|
41
|
Abstract
Rapid and inexpensive sequencing technologies are making it possible to collect whole genome sequence data on multiple individuals from a population. This type of data can be used to quickly identify genes that control important ecological and evolutionary phenotypes by finding the targets of adaptive natural selection, and we therefore refer to such approaches as "reverse ecology." To quantify the power gained in detecting positive selection using population genomic data, we compare three statistical methods for identifying targets of selection: the McDonald-Kreitman test, the mkprf method, and a likelihood implementation for detecting d(N)/d(S) > 1. Because the first two methods use polymorphism data we expect them to have more power to detect selection. However, when applied to population genomic datasets from human, fly, and yeast, the tests using polymorphism data were actually weaker in two of the three datasets. We explore reasons why the simpler comparative method has identified more genes under selection, and suggest that the different methods may really be detecting different signals from the same sequence data. Finally, we find several statistical anomalies associated with the mkprf method, including an almost linear dependence between the number of positively selected genes identified and the prior distributions used. We conclude that interpreting the results produced by this method should be done with some caution.
Affiliation(s)
- Yong Fuga Li
- School of Informatics, Indiana University, Bloomington, IN, USA
|
42
|
Sethupathy P, Hannenhalli S. A tutorial of the poisson random field model in population genetics. Adv Bioinformatics 2008; 2008:257864. [PMID: 19920987 PMCID: PMC2775679 DOI: 10.1155/2008/257864] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2008] [Accepted: 05/15/2008] [Indexed: 11/18/2022] Open
Abstract
Population genetics is the study of allele frequency changes driven by various evolutionary forces such as mutation, natural selection, and random genetic drift. Although natural selection is widely recognized as a bona-fide phenomenon, the extent to which it drives evolution continues to remain unclear and controversial. Various qualitative techniques, or so-called "tests of neutrality", have been introduced to detect signatures of natural selection. A decade and a half ago, Stanley Sawyer and Daniel Hartl provided a mathematical framework, referred to as the Poisson random field (PRF), with which to determine quantitatively the intensity of selection on a particular gene or genomic region. The recent availability of large-scale genetic polymorphism data has sparked widespread interest in genome-wide investigations of natural selection. To that end, the original PRF model is of particular interest for geneticists and evolutionary genomicists. In this article, we will provide a tutorial of the mathematical derivation of the original Sawyer and Hartl PRF model.
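To make the PRF idea concrete, here is a minimal sketch of the expected site frequency spectrum under the infinite-sites PRF model with a single selection intensity. It assumes the common formulation in which the expected number of sites with derived-allele count i in a sample of n is θ times the integral over x of C(n,i) x^i (1-x)^(n-i) (1 - e^(-2γ(1-x))) / ((1 - e^(-2γ)) x (1-x)). Here θ is normalised so that the neutral expectation for class i is θ/i; scaling conventions for θ and γ differ between papers, and the function names and the midpoint integration rule are choices made for this example only.

```python
import math


def selection_weight(x, gamma):
    """(1 - exp(-2*gamma*(1-x))) / (1 - exp(-2*gamma)), with the
    neutral limit (1 - x) used when gamma is effectively zero."""
    if abs(gamma) < 1e-8:
        return 1.0 - x
    return math.expm1(-2.0 * gamma * (1.0 - x)) / math.expm1(-2.0 * gamma)


def expected_sfs(n, theta, gamma, steps=20000):
    """Expected counts of sites with derived-allele count i = 1 .. n-1.

    The factor 1 / (x * (1 - x)) is folded into the binomial term, so
    the integrand x^(i-1) * (1-x)^(n-i-1) * weight stays bounded on
    [0, 1] and a plain midpoint rule is adequate.
    """
    h = 1.0 / steps
    sfs = []
    for i in range(1, n):
        total = 0.0
        for k in range(steps):
            x = (k + 0.5) * h
            total += (x ** (i - 1)) * ((1.0 - x) ** (n - i - 1)) \
                * selection_weight(x, gamma)
        sfs.append(theta * math.comb(n, i) * total * h)
    return sfs
```

With γ = 0 the result reduces to θ/i for each class. Positive γ raises the expected counts in every class and shifts the spectrum toward high-frequency derived alleles, while negative γ does the opposite.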
Affiliation(s)
- Praveen Sethupathy
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sridhar Hannenhalli
- Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Computer and Information Sciences, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
|
43
|
Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLoS Biol 2008; 5:e310. [PMID: 17988176 PMCID: PMC2062478 DOI: 10.1371/journal.pbio.0050310] [Citation(s) in RCA: 495] [Impact Index Per Article: 29.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2007] [Accepted: 09/26/2007] [Indexed: 01/13/2023] Open
Abstract
The population genetic perspective is that the processes shaping genomic variation can be revealed only through simultaneous investigation of sequence polymorphism and divergence within and between closely related species. Here we present a population genetic analysis of Drosophila simulans based on whole-genome shotgun sequencing of multiple inbred lines and comparison of the resulting data to genome assemblies of the closely related species, D. melanogaster and D. yakuba. We discovered previously unknown, large-scale fluctuations of polymorphism and divergence along chromosome arms, and significantly less polymorphism and faster divergence on the X chromosome. We generated a comprehensive list of functional elements in the D. simulans genome influenced by adaptive evolution. Finally, we characterized genomic patterns of base composition for coding and noncoding sequence. These results suggest several new hypotheses regarding the genetic and biological mechanisms controlling polymorphism and divergence across the Drosophila genome, and provide a rich resource for the investigation of adaptive evolution and functional variation in D. simulans. Population genomics, the study of genome-wide patterns of sequence variation within and between closely related species, can provide a comprehensive view of the relative importance of mutation, recombination, natural selection, and genetic drift in evolution. It can also provide fundamental insights into the biological attributes of organisms that are specifically shaped by adaptive evolution. One approach for generating population genomic datasets is to align DNA sequences from whole-genome shotgun projects to a standard reference sequence. We used this approach to carry out whole-genome analysis of polymorphism and divergence in Drosophila simulans, a close relative of the model system, D. melanogaster. 
We find that polymorphism and divergence fluctuate on a large scale across the genome and that these fluctuations are probably explained by natural selection rather than by variation in mutation rates. Our analysis suggests that adaptive protein evolution is common and is often related to biological processes that may be associated with gene expression, chromosome biology, and reproduction. The approaches presented here will have broad applicability to future analysis of population genomic variation in other systems, including humans. Low-coverage genome sequences from multiple Drosophila simulans strains provide the first comprehensive view of polymorphism and divergence in the fruit fly.
|
44
|
Five Drosophila genomes reveal nonneutral evolution and the signature of host specialization in the chemoreceptor superfamily. Genetics 2008; 177:1395-416. [PMID: 18039874 DOI: 10.1534/genetics.107.078683] [Citation(s) in RCA: 160] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
The insect chemoreceptor superfamily comprises the olfactory receptor (Or) and gustatory receptor (Gr) multigene families. These families give insects the ability to smell and taste chemicals in the environment and are thus rich resources for linking molecular evolutionary and ecological processes. Although dramatic differences in family size among distant species and high divergence among paralogs have led to the belief that the two families evolve rapidly, a lack of evolutionary data over short time scales has frustrated efforts to identify the major forces shaping this evolution. Here, we investigate patterns of gene loss/gain, divergence, and polymorphism in the entire repertoire of approximately 130 chemoreceptor genes from five closely related species of Drosophila that share a common ancestor within the past 12 million years. We demonstrate that the overall evolution of the Or and Gr families is nonneutral. We also show that selection regimes differ both between the two families as wholes and within each family among groups of genes with varying functions, patterns of expression, and phylogenetic histories. Finally, we find that the independent evolution of host specialization in Drosophila sechellia and D. erecta is associated with a fivefold acceleration of gene loss and increased rates of amino acid evolution at receptors that remain intact. Gene loss appears to primarily affect Grs that respond to bitter compounds while elevated Ka/Ks is most pronounced in the subset of Ors that are expressed in larvae. Our results provide strong evidence that the observed phenomena result from the invasion of a novel ecological niche and present a unique synthesis of molecular evolutionary analyses with ecological data.
|
45
|
The molecular basis of host adaptation in cactophilic Drosophila: molecular evolution of a glutathione S-transferase gene (GstD1) in Drosophila mojavensis. Genetics 2008; 178:1073-83. [PMID: 18245335 DOI: 10.1534/genetics.107.083287] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Drosophila mojavensis is a cactophilic fly endemic to the northwestern deserts of North America. This species includes four genetically isolated cactus host races each individually specializing on the necrotic tissues of a different cactus species. The necrosis of each cactus species provides the resident D. mojavensis populations with a distinct chemical environment. A previous investigation of the role of transcriptional variation in the adaptation of D. mojavensis to its hosts produced a set of candidate loci that are differentially expressed in response to host shifts, and among them was glutathione S-transferase D1 (GstD1). In both D. melanogaster and Anopheles gambiae, GstD1 has been implicated in the resistance of these species to the insecticide dichloro-diphenyl-trichloroethane (DDT). The pattern of sequence variation of the GstD1 locus from all four D. mojavensis populations, D. arizonae (sister species), and D. navojoa (outgroup) has been examined. The data suggest that in two populations of D. mojavensis GstD1 has gone through a period of adaptive amino acid evolution. Further analyses indicate that of the seven amino acid fixations that occurred in the D. mojavensis lineage, two of them occur in the active site pocket, potentially having a significant effect on substrate specificity and in the adaptation to alternative cactus hosts.
|
46
|
|
47
|
Heger A, Ponting CP. Variable strength of translational selection among 12 Drosophila species. Genetics 2007; 177:1337-48. [PMID: 18039870 PMCID: PMC2147958 DOI: 10.1534/genetics.107.070466] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2007] [Accepted: 09/05/2007] [Indexed: 01/06/2023] Open
Abstract
Codon usage bias in Drosophila melanogaster genes has been attributed to negative selection of those codons whose cellular tRNA abundance restricts rates of mRNA translation. Previous studies, which involved limited numbers of genes, can now be compared against analyses of the entire gene complements of 12 Drosophila species whose genome sequences have become available. Using large numbers (6138) of orthologs represented in all 12 species, we establish that the codon preferences of more closely related species are better correlated. Differences between codon usage biases are attributed, in part, to changes in mutational biases. These biases are apparent from the strong correlation (r = 0.92, P < 0.001) among these genomes' intronic G + C contents and exonic G + C contents at degenerate third codon positions. To perform a cross-species comparison of selection on codon usage, while accounting for changes in mutational biases, we calibrated each genome in turn using the codon usage bias indices of highly expressed ribosomal protein genes. The strength of translational selection was predicted to have varied between species largely according to their phylogeny, with the D. melanogaster group species exhibiting the strongest degree of selection.
Affiliation(s)
- Andreas Heger
- MRC Functional Genetics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom.
|
48
|
Akashi H, Goel P, John A. Ancestral inference and the study of codon bias evolution: implications for molecular evolutionary analyses of the Drosophila melanogaster subgroup. PLoS One 2007; 2:e1065. [PMID: 17957249 PMCID: PMC2020436 DOI: 10.1371/journal.pone.0001065] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Received: 06/08/2007] [Accepted: 09/21/2007] [Indexed: 11/18/2022]
Abstract
Reliable inference of ancestral sequences can be critical to identifying both patterns and causes of molecular evolution. Robustness of ancestral inference is often assumed among closely related species, but tests of this assumption have been limited. Here, we examine the performance of inference methods for data simulated under scenarios of codon bias evolution within the Drosophila melanogaster subgroup. Genome sequence data for multiple, closely related species within this subgroup make it an important system for studying molecular evolutionary genetics. The effects of asymmetric and lineage-specific substitution rates (i.e., varying levels of codon usage bias and departures from equilibrium) on the reliability of ancestral codon usage inference were investigated. Maximum parsimony inference, which has been widely employed in analyses of Drosophila codon bias evolution, was compared to an approach that attempts to account for uncertainty in ancestral inference by weighting ancestral reconstructions by their posterior probabilities. The latter approach employs maximum likelihood estimation of rate and base composition parameters. For equilibrium and most non-equilibrium scenarios that were investigated, the probabilistic method appears to generate reliable ancestral codon bias inferences for molecular evolutionary studies within the D. melanogaster subgroup. These reconstructions are more reliable than parsimony inference, especially when codon usage is strongly skewed. However, inference biases are considerable for both methods under particular departures from stationarity (i.e., when adaptive evolution is prevalent). Reliability of inference can be sensitive to branch lengths, asymmetry in substitution rates, and the locations and nature of lineage-specific processes within a gene tree. Inference reliability, even among closely related species, can be strongly affected by (potentially unknown) patterns of molecular evolution in lineages ancestral to those of interest.
Affiliation(s)
- Hiroshi Akashi
- Institute of Molecular Evolutionary Genetics, Department of Biology, Pennsylvania State University, State College, Pennsylvania, United States of America.
|
49
|
Charlesworth J, Eyre-Walker A. The other side of the nearly neutral theory, evidence of slightly advantageous back-mutations. Proc Natl Acad Sci U S A 2007; 104:16992-7. [PMID: 17940029 PMCID: PMC2040392 DOI: 10.1073/pnas.0705456104] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.6] [Received: 06/13/2007] [Indexed: 11/18/2022]
Abstract
We argue that if there is a category of slightly deleterious mutations, then there should be a category of slightly advantageous back-mutations. We show that when there are both slightly deleterious and advantageous back-mutations, there is likely to be an increase in the rate of evolution after a population size expansion. This increase in the rate of evolution is short-lived. However, we show how its signature can be captured by comparing the rate of evolution in species that have undergone population size expansion versus contraction. We test our model by comparing the pattern of evolution in pairs of island and mainland species in which the colonization event was either island-to-mainland (population size expansion) or mainland-to-island (contraction). We show that the predicted pattern of evolution is observed.
Affiliation(s)
- Jane Charlesworth
- Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Brighton BN1 9QG, United Kingdom.
- Adam Eyre-Walker
- Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Brighton BN1 9QG, United Kingdom.
|
50
|
Anisimova M, Liberles DA. The quest for natural selection in the age of comparative genomics. Heredity (Edinb) 2007; 99:567-79. [PMID: 17848974 DOI: 10.1038/sj.hdy.6801052] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Indexed: 11/08/2022]
Abstract
Continued genome sequencing has fueled progress in statistical methods for understanding the action of natural selection at the molecular level. This article reviews various statistical techniques (and their applicability) for detecting adaptation events and the functional divergence of proteins. As large-scale automated studies become more frequent, they provide a useful resource for generating biological null hypotheses for further experimental and statistical testing. Furthermore, they shed light on typical patterns of lineage-specific evolution of organisms, on the functional and structural evolution of protein families and on the interplay between the two. More complex models are being developed to better reflect the underlying biological and chemical processes and to complement simpler statistical models. Linking molecular processes to their statistical signatures in genomes can be demanding, and the proper application of statistical models is discussed.
Affiliation(s)
- M Anisimova
- Department of Biology, University College London, London, UK.
|