1
Ho AT, Hurst LD. Unusual mammalian usage of TGA stop codons reveals that sequence conservation need not imply purifying selection. PLoS Biol 2022; 20:e3001588. [PMID: 35550630 PMCID: PMC9129041 DOI: 10.1371/journal.pbio.3001588] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/17/2022] [Revised: 05/24/2022] [Accepted: 04/20/2022] [Indexed: 11/18/2022]
Abstract
The assumption that conservation of sequence implies the action of purifying selection is central to diverse methodologies to infer functional importance. GC-biased gene conversion (gBGC), a meiotic mismatch repair bias strongly favouring GC over AT, can in principle mimic the action of selection, an effect thought to be especially important in mammals. As mutation is GC→AT biased, demonstrating that gBGC does indeed cause false signals requires evidence that an AT-rich residue is selectively optimal compared to its more GC-rich allele, while showing also that the GC-rich alternative is conserved. We propose that mammalian stop codon evolution provides a robust test case. Although in most taxa TAA is the optimal stop codon, TGA is both abundant and conserved in mammalian genomes. We show that this mammalian exceptionalism is well explained by gBGC mimicking purifying selection and that TAA is the selectively optimal codon. Supportive of gBGC, we observe (i) that TGA usage trends are consistent at the focal stop codon and elsewhere (in UTR sequences); (ii) that higher TGA usage and higher TAA→TGA substitution rates are predicted by a high recombination rate; and (iii) that across species the difference in TAA↔TGA substitution rates between GC-rich and GC-poor genes is largest in genomes that possess higher between-gene GC variation. TAA optimality is supported both by enrichment in highly expressed genes and trends associated with effective population size. High TGA usage and high TAA→TGA rates in mammals are thus consistent with gBGC’s predicted ability to “drive” deleterious mutations and support the hypothesis that sequence conservation need not be indicative of purifying selection. A general trend for GC-rich trinucleotides to reside at frequencies far above their mutational equilibrium in high recombining domains supports the generality of these results.
Affiliation(s)
- Alexander Thomas Ho
- Milner Centre for Evolution, University of Bath, Bath, United Kingdom
2
Rajaei M, Saxena AS, Johnson LM, Snyder MC, Crombie TA, Tanny RE, Andersen EC, Joyner-Matos J, Baer CF. Mutability of mononucleotide repeats, not oxidative stress, explains the discrepancy between laboratory-accumulated mutations and the natural allele-frequency spectrum in C. elegans. Genome Res 2021; 31:1602-1613. [PMID: 34404692 PMCID: PMC8415377 DOI: 10.1101/gr.275372.121] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 02/09/2021] [Accepted: 07/12/2021] [Indexed: 11/24/2022]
Abstract
Important clues about natural selection can be gleaned from discrepancies between the properties of segregating genetic variants and of mutations accumulated experimentally under minimal selection, provided the mutational process is the same in the laboratory as in nature. The base-substitution spectrum differs between C. elegans laboratory mutation accumulation (MA) experiments and the standing site-frequency spectrum, which has been argued to be in part owing to increased oxidative stress in the laboratory environment. Using genome sequence data from C. elegans MA lines carrying a mutation (mev-1) that increases the cellular titer of reactive oxygen species (ROS), leading to increased oxidative stress, we find the base-substitution spectrum is similar between mev-1, its wild-type progenitor (N2), and another set of MA lines derived from a different wild strain (PB306). Conversely, the rate of short insertions is greater in mev-1, consistent with studies in other organisms in which environmental stress increased the rate of insertion–deletion mutations. Further, the mutational properties of mononucleotide repeats in all strains are different from those of nonmononucleotide sequence, both for indels and base-substitutions, and whereas the nonmononucleotide spectra are fairly similar between MA lines and wild isolates, the mononucleotide spectra are very different, with a greater frequency of A:T → T:A transversions and an increased proportion of ±1-bp indels. The discrepancy in mutational spectra between laboratory MA experiments and natural variation is likely owing to a consistent (but unknown) effect of the laboratory environment that manifests itself via different modes of mutability and/or repair at mononucleotide loci.
Affiliation(s)
- Moein Rajaei
- Department of Biology, University of Florida, Gainesville, Florida 32611, USA
- Lindsay M Johnson
- Department of Biology, University of Florida, Gainesville, Florida 32611, USA
- Michael C Snyder
- Department of Biology, University of Florida, Gainesville, Florida 32611, USA
- Timothy A Crombie
- Department of Biology, University of Florida, Gainesville, Florida 32611, USA; Department of Molecular Biosciences, Northwestern University, Evanston, Illinois 60208, USA
- Robyn E Tanny
- Department of Molecular Biosciences, Northwestern University, Evanston, Illinois 60208, USA
- Erik C Andersen
- Department of Molecular Biosciences, Northwestern University, Evanston, Illinois 60208, USA
- Joanna Joyner-Matos
- Department of Biology, Eastern Washington University, Cheney, Washington 99004, USA
- Charles F Baer
- Department of Biology, University of Florida, Gainesville, Florida 32611, USA; University of Florida Genetics Institute, Gainesville, Florida 32608, USA
3
Saxena AS, Salomon MP, Matsuba C, Yeh SD, Baer CF. Evolution of the Mutational Process under Relaxed Selection in Caenorhabditis elegans. Mol Biol Evol 2019; 36:239-251. [PMID: 30445510 DOI: 10.1093/molbev/msy213] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022]
Abstract
The mutational process varies at many levels, from within genomes to among taxa. Many mechanisms have been linked to variation in mutation, but understanding of the evolution of the mutational process is rudimentary. Physiological condition is often implicated as a source of variation in microbial mutation rate and may contribute to mutation rate variation in multicellular organisms. Deleterious mutations are a ubiquitous source of variation in condition. We test the hypothesis that the mutational process depends on the underlying mutation load in two groups of Caenorhabditis elegans mutation accumulation (MA) lines that differ in their starting mutation loads. "First-order MA" (O1MA) lines maintained under minimal selection for ∼250 generations were divided into high-fitness and low-fitness groups and sets of "second-order MA" (O2MA) lines derived from each O1MA line were maintained for ∼150 additional generations. Genomes of 48 O2MA lines and their progenitors were sequenced. There is significant variation among O2MA lines in base-substitution rate (µbs), but no effect of initial fitness; the indel rate is greater in high-fitness O2MA lines. Overall, µbs is positively correlated with recombination and proximity to short tandem repeats and negatively correlated with 10 bp and 1 kb GC content. However, probability of mutation is sufficiently predicted by the three-nucleotide motif alone. Approximately 90% of the variance in standing nucleotide variation is explained by mutability. Total mutation rate increased in the O2MA lines, as predicted by the "drift barrier" model of mutation rate evolution. These data, combined with experimental estimates of fitness, suggest that epistasis is synergistic.
Affiliation(s)
- Matthew P Salomon
- Department of Biology, University of Florida, Gainesville, FL
- Department of Molecular Oncology, John Wayne Cancer Institute, Santa Monica, CA
- Chikako Matsuba
- Department of Biology, University of Florida, Gainesville, FL
- Department of Molecular Oncology, John Wayne Cancer Institute, Santa Monica, CA
- Shu-Dan Yeh
- Department of Biology, University of Florida, Gainesville, FL
- Department of Life Sciences, National Central University, Taoyuan, Taiwan
- Charles F Baer
- Department of Biology, University of Florida, Gainesville, FL
- University of Florida Genetics Institute
4
Abstract
Environmental dependence of mutation in microbes is well-known, but most experiments have investigated contexts in which growth rate is greatly reduced below optimum. A new experiment shows mutational variability extends to contexts in which growth is near optimum.
Affiliation(s)
- Charles F Baer
- Department of Biology, University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611-8525, USA.
5
Brunet FG, Audit B, Drillon G, Argoul F, Volff JN, Arneodo A. Evidence for DNA Sequence Encoding of an Accessible Nucleosomal Array across Vertebrates. Biophys J 2018; 114:2308-2316. [PMID: 29580552 PMCID: PMC6028776 DOI: 10.1016/j.bpj.2018.02.025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/18/2017] [Revised: 02/07/2018] [Accepted: 02/20/2018] [Indexed: 12/15/2022]
Abstract
Nucleosome-depleted regions around which nucleosomes order following the "statistical" positioning scenario were recently shown to be encoded in the DNA sequence in human. This intrinsic nucleosomal ordering strongly correlates with oscillations in the local GC content as well as with the interspecies and intraspecies mutation profiles, revealing the existence of both positive and negative selection. In this letter, we show that these predicted nucleosome inhibitory energy barriers (NIEBs) with compacted neighboring nucleosomes are indeed ubiquitous to all vertebrates tested. These 1 kb-sized chromatin patterns are widely distributed along vertebrate chromosomes, overall covering more than a third of the genome. We have previously observed, in human, deviations from neutral evolution at these genome-wide distributed regions, which we interpreted as a possible indication of the selection of an open, accessible, and dynamic nucleosomal array to constitutively facilitate the epigenetic regulation of nuclear functions in a cell-type-specific manner. As a first, very appealing observation supporting this hypothesis, we report evidence of a strong association between NIEB borders and the poly(A) tails of Alu sequences in human. These results suggest that NIEBs provide adequate chromatin patterns favorable to the integration of Alu retrotransposons and, more generally, of various transposable elements in the genomes of primates and other vertebrates.
Affiliation(s)
- Frédéric G Brunet
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
- Benjamin Audit
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
- Guénola Drillon
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
- Françoise Argoul
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France
- Jean-Nicolas Volff
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
- Alain Arneodo
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France
6
Drillon G, Audit B, Argoul F, Arneodo A. Evidence of selection for an accessible nucleosomal array in human. BMC Genomics 2016; 17:526. [PMID: 27472913 PMCID: PMC4966569 DOI: 10.1186/s12864-016-2880-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 12/14/2015] [Accepted: 07/04/2016] [Indexed: 11/13/2022]
Abstract
BACKGROUND Recently, a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix has been used to reveal some enrichment of nucleosome-inhibiting energy barriers (NIEBs) near ubiquitous human "master" replication origins. Here we use this model to predict the existence of about 1.6 million NIEBs over the 22 human autosomes. RESULTS We show that these high energy barriers of mean size 153 bp correspond to nucleosome-depleted regions (NDRs) in vitro, as expected, but also in vivo. On either side of these NIEBs, we observe, in vivo and in vitro, a similar compacted nucleosome ordering, suggesting an absence of chromatin remodeling. This nucleosomal ordering strongly correlates with oscillations of the GC content as well as with the interspecies and intraspecies mutation profiles along these regions. Comparison of these divergence rates reveals the existence of both positive and negative selection linked to nucleosome positioning around these intrinsic NDRs. Overall, these NIEBs and neighboring nucleosomes cover 37.5% of the human genome where nucleosome occupancy is stably encoded in the DNA sequence. These 1 kb-sized regions of intrinsic nucleosome positioning are equally found in GC-rich and GC-poor isochores, in early and late replicating regions, in intergenic and genic regions but not at gene promoters. CONCLUSION The source of selection pressure on the NIEBs has yet to be resolved in future work. One possible scenario is that these widely distributed chromatin patterns have been selected in human to impair the condensation of the nucleosomal array into the 30 nm chromatin fiber, so as to facilitate the epigenetic regulation of nuclear functions in a cell-type-specific manner.
Affiliation(s)
- Guénola Drillon
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- Benjamin Audit
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- Françoise Argoul
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de le Libération, Talence, F-33405 France
- Alain Arneodo
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de le Libération, Talence, F-33405 France
7
Babbitt GA, Coppola EE, Alawad MA, Hudson AO. Can all heritable biology really be reduced to a single dimension? Gene 2016; 578:162-8. [DOI: 10.1016/j.gene.2015.12.043] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/28/2015] [Revised: 12/16/2015] [Accepted: 12/17/2015] [Indexed: 12/23/2022]
8
Fortin CH, Schulze KV, Babbitt GA. TRX-LOGOS - a graphical tool to demonstrate DNA information content dependent upon backbone dynamics in addition to base sequence. Source Code for Biology and Medicine 2015; 10:10. [PMID: 26413153 PMCID: PMC4583169 DOI: 10.1186/s13029-015-0040-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/06/2015] [Accepted: 09/11/2015] [Indexed: 01/26/2023]
Abstract
BACKGROUND It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. SOFTWARE AND IMPLEMENTATION We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered both at SourceForge and as a download supplement at this journal. RESULTS To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly higher information content at phosphate linkages than can be observed at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information rich at the level of DNA backbone dynamics than nucleobase sequence.
We used TRX-LOGOS in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare the human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in coding regions, and perhaps indicative of the non-random organization of the genetic code. CONCLUSION TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides) where the dynamic properties of the DNA backbone function to facilitate DNA-protein interaction.
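As a point of reference for the "Shannon information content calculated at single nucleobases" that a traditional sequence logo displays, the following is a minimal Python sketch. It is not the TRX-LOGOS implementation (which the abstract describes as an R package with a Perl wrapper), it does not reproduce the dinucleotide-based backbone calculation, and it omits any small-sample correction. The function name `column_information` and the example sites are hypothetical.

```python
import math
from collections import Counter


def column_information(sites, alphabet="ACGT"):
    """Per-position information content (bits) of equal-length aligned sites.

    Illustrative only: information = log2(alphabet size) - Shannon entropy,
    with no small-sample correction and no backbone (dinucleotide) term.
    """
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to equal length")

    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    info = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        total = sum(counts[a] for a in alphabet)
        if total == 0:
            raise ValueError(f"no alphabet symbols at position {pos}")
        entropy = 0.0
        for a in alphabet:
            if counts[a]:
                p = counts[a] / total
                entropy -= p * math.log2(p)
        info.append(max_bits - entropy)
    return info


# Hypothetical aligned binding sites, for illustration only.
example_sites = ["ACGT", "ACGA", "ACCT", "ATGT"]
example_info = column_information(example_sites)
```

A fully conserved column scores the maximum of 2 bits, and a column with all four bases at equal frequency scores 0 bits.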
Affiliation(s)
- Connor H Fortin
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester, NY 14623 USA
- Katharina V Schulze
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030 USA
- Gregory A Babbitt
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester, NY 14623 USA
9
Abstract
Mutational heterogeneity must be taken into account when reconstructing evolutionary histories, calibrating molecular clocks, and predicting links between genes and disease. Selective pressures and various DNA transactions have been invoked to explain the heterogeneous distribution of genetic variation between species, within populations, and in tissue-specific tumors. To examine relationships between such heterogeneity and variations in leading- and lagging-strand replication fidelity and mismatch repair, we accumulated 40,000 spontaneous mutations in eight diploid yeast strains in the absence of selective pressure. We found that replicase error rates vary by fork direction, coding state, nucleosome proximity, and sequence context. Further, error rates and DNA mismatch repair efficiency both vary by mismatch type, responsible polymerase, replication time, and replication origin proximity. Mutation patterns implicate replication infidelity as one driver of variation in somatic and germline evolution, suggest mechanisms of mutual modulation of genome stability and composition, and predict future observations in specific cancers.
10
Babbitt GA, Alawad MA, Schulze KV, Hudson AO. Synonymous codon bias and functional constraint on GC3-related DNA backbone dynamics in the prokaryotic nucleoid. Nucleic Acids Res 2014; 42:10915-26. [PMID: 25200075 PMCID: PMC4176184 DOI: 10.1093/nar/gku811] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/01/2023]
Abstract
While mRNA stability has been demonstrated to control rates of translation, generating both global and local synonymous codon biases in many unicellular organisms, this cannot adequately explain why codon bias strongly tracks neighboring intergene GC content, suggesting that structural dynamics of DNA might also influence codon choice. Because minor groove width is highly governed by 3-base periodicity in GC, the existence of triplet-based codons might imply a functional role for the optimization of local DNA molecular dynamics via GC content at synonymous sites (≈GC3). We confirm a strong association between GC3-related intrinsic DNA flexibility and codon bias across 24 different prokaryotic multiple whole-genome alignments. We develop a novel test of natural selection targeting synonymous sites and demonstrate that GC3-related DNA backbone dynamics have been subject to moderate selective pressure, perhaps contributing to our observation that many genes possess extreme DNA backbone dynamics for their given protein space. This dual function of codons may impose universal functional constraints affecting the evolution of synonymous and non-synonymous sites. We propose that synonymous sites may have evolved as an 'accessory' during an early expansion of a primordial genetic code, allowing for multiplexed protein coding and structural dynamic information within the same molecular context.
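The quantity GC3 referred to above, the GC content at third codon positions, can be sketched as follows. This is an illustrative calculation only, not the authors' pipeline; the function name `gc3` and the example sequence are hypothetical, and the sketch assumes an in-frame coding sequence whose length is a multiple of three.

```python
def gc3(cds):
    """Fraction of G or C among third codon positions of an in-frame CDS.

    Illustrative only: ambiguous bases (anything other than A, C, G, T)
    at third positions are skipped rather than counted.
    """
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")

    third_positions = seq[2::3]  # every third base, starting at index 2
    counted = [b for b in third_positions if b in "ACGT"]
    if not counted:
        raise ValueError("no unambiguous third-position bases")
    return sum(b in "GC" for b in counted) / len(counted)


# Hypothetical sequence: codons ATG GCC AAA TGA, third bases G, C, A, A.
example_gc3 = gc3("ATGGCCAAATGA")
```

Note that the stop codon is included here for simplicity; an analysis restricted to synonymous sites would normally exclude it along with non-degenerate codons such as ATG.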
Affiliation(s)
- Gregory A Babbitt
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Mohammed A Alawad
- B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
- Katharina V Schulze
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston TX, USA 77030
- André O Hudson
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester NY, USA 14623
11
Nucleosomes shape DNA polymorphism and divergence. PLoS Genet 2014; 10:e1004457. [PMID: 24991813 PMCID: PMC4081404 DOI: 10.1371/journal.pgen.1004457] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 07/19/2013] [Accepted: 05/12/2014] [Indexed: 11/30/2022]
Abstract
An estimated 80% of genomic DNA in eukaryotes is packaged as nucleosomes, which, together with the remaining interstitial linker regions, generate higher order chromatin structures [1]. Nucleosome sequences isolated from diverse organisms exhibit ∼10 bp periodic variations in AA, TT and GC dinucleotide frequencies. These sequence elements generate intrinsically curved DNA and help establish the histone-DNA interface. We investigated an important unanswered question concerning the interplay between chromatin organization and genome evolution: do the DNA sequence preferences inherent to the highly conserved histone core exert detectable natural selection on genomic divergence and polymorphism? To address this hypothesis, we isolated nucleosomal DNA sequences from Drosophila melanogaster embryos and examined the underlying genomic variation within and between species. We found that divergence along the D. melanogaster lineage is periodic across nucleosome regions with base changes following preferred nucleotides, providing new evidence for systematic evolutionary forces in the generation and maintenance of nucleosome-associated dinucleotide periodicities. Further, Single Nucleotide Polymorphism (SNP) frequency spectra show striking periodicities across nucleosomal regions, paralleling divergence patterns. Preferred alleles occur at higher frequencies in natural populations, consistent with a central role for natural selection. These patterns are stronger for nucleosomes in introns than in intergenic regions, suggesting selection is stronger in transcribed regions where nucleosomes undergo more displacement, remodeling and functional modification. In addition, we observe a large-scale (∼180 bp) periodic enrichment of AA/TT dinucleotides associated with nucleosome occupancy, while GC dinucleotide frequency peaks in linker regions. 
Divergence and polymorphism data also support a role for natural selection in the generation and maintenance of these super-nucleosomal patterns. Our results demonstrate that nucleosome-associated sequence periodicities are under selective pressure, implying that structural interactions between nucleosomes and DNA sequence shape sequence evolution, particularly in introns. In eukaryotic cells, the majority of DNA is packaged in nucleosomes comprised of ∼147 bp of DNA wound tightly around the highly conserved histone octamer. Nucleosomal DNA from diverse organisms shows an anti-correlated ∼10 bp periodicity of AT-rich and GC-rich dinucleotides. These sequence features influence DNA bending and shape, facilitating structural interactions. We asked whether natural selection mediated through the periodic sequence preferences of nucleosomes shapes the evolution of non-protein-coding regions of D. melanogaster by examining the inter- and intra-species genomic variation relative to these fundamental chromatin building blocks. The sequence changes across nucleosome-bound regions on the melanogaster lineage mirror the observed nucleosome dinucleotide periodicities. Importantly, we show that the frequencies of polymorphisms in natural populations vary across these regions, paralleling divergence, with higher frequencies of preferred alleles. These patterns are most evident for intronic regions and indicate that non-protein coding regions are evolving toward sequences that facilitate the canonical association with the histone core. This result is consistent with the hypothesis that interactions between DNA and the core have systematic impacts on function that are subject to natural selection and are not solely due to mutational bias. These ubiquitous interactions with the histone core partially account for the evolutionary constraint observed in unannotated genomic regions, and may drive broad changes in base composition.
12
Opposing forces of A/T-biased mutations and G/C-biased gene conversions shape the genome of the nematode Pristionchus pacificus. Genetics 2014; 196:1145-52. [PMID: 24414549 DOI: 10.1534/genetics.113.159863] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 01/20/2023]
Abstract
Base substitution mutations are a major source of genetic novelty and mutation accumulation line (MAL) studies revealed a nearly universal AT bias in de novo mutation spectra. While a comparison of de novo mutation spectra with the actual nucleotide composition in the genome suggests the existence of general counterbalancing mechanisms, little is known about the evolutionary and historical details of these opposing forces. Here, we correlate MAL-derived mutation spectra with patterns observed from population resequencing. Variation observed in natural populations has already been subject to evolutionary forces. Distinction between rare and common alleles, the latter of which are close to fixation and of presumably older age, can provide insight into mutational processes and their influence on genome evolution. We provide a genome-wide analysis of de novo mutations in 22 MALs of the nematode Pristionchus pacificus and compare the spectra with natural variants observed in resequencing of 104 natural isolates. MALs show an AT bias of 5.3, one of the highest values observed to date. In contrast, the AT bias in natural variants is much lower. Specifically, rare derived alleles show an AT bias of 2.4, whereas common derived alleles close to fixation show no AT bias at all. These results indicate the existence of a strong opposing force and they suggest that the GC content of the P. pacificus genome is in equilibrium. We discuss GC-biased gene conversion as a potential mechanism acting against AT-biased mutations. This study provides insight into genome evolution by combining MAL studies with natural variation.
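The abstract does not define its AT bias statistic. A minimal sketch, assuming it is the ratio of the per-site G/C→A/T mutation rate to the per-site A/T→G/C rate, is given below together with the GC content expected at equilibrium under mutation alone in a simple two-state model. The function names and the counts in the example are hypothetical; only the value 5.3 comes from the abstract.

```python
def at_bias(gc_to_at, at_to_gc, gc_sites, at_sites):
    """Ratio of per-site G/C->A/T rate to per-site A/T->G/C rate.

    Assumed definition, for illustration: each mutation count is
    normalized by the number of sites at which it could have occurred.
    """
    if min(gc_sites, at_sites) <= 0 or at_to_gc <= 0:
        raise ValueError("site counts and A/T->G/C count must be positive")
    rate_gc_to_at = gc_to_at / gc_sites
    rate_at_to_gc = at_to_gc / at_sites
    return rate_gc_to_at / rate_at_to_gc


def equilibrium_gc(bias):
    """Equilibrium GC fraction under mutation alone (two-state model).

    With u = rate G/C->A/T and v = rate A/T->G/C, the equilibrium GC
    fraction is v / (u + v) = 1 / (1 + u/v).
    """
    if bias <= 0:
        raise ValueError("bias must be positive")
    return 1.0 / (1.0 + bias)


# Hypothetical counts chosen only so that the ratio equals 5.3.
example_bias = at_bias(gc_to_at=53, at_to_gc=10, gc_sites=1000, at_sites=1000)
# GC fraction expected if mutation with bias 5.3 were the only force.
example_gc = equilibrium_gc(5.3)
```

Under these assumptions a bias of 5.3 implies an equilibrium GC fraction of about 0.16 from mutation alone, whereas a bias of 1 (no bias, as reported for common derived alleles) implies 0.5. The gap between such a mutational expectation and the composition actually observed is the kind of discrepancy for which the authors invoke an opposing force.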
13
Abstract
Natural selection defined by differential survival and reproduction of individuals in populations is influenced by genetic, developmental, and environmental factors operating at every age and stage in human life history: generation of gametes, conception, birth, maturation, reproduction, senescence, and death. Biological systems are built upon a hierarchical organization nesting subcellular organelles, cells, tissues, and organs within individuals, individuals within families, and families within populations, and the latter among other populations. Natural selection often acts simultaneously at more than one level of biological organization and on specific traits, which we define as multilevel selection. Under this model, the individual is a fundamental unit of biological organization and also of selection, imbedded in a larger evolutionary context, just as it is a unit of medical intervention imbedded in larger biological, cultural, and environmental contexts. Here, we view human health and life span as necessary consequences of natural selection, operating at all levels and phases of biological hierarchy in human life history as well as in sociological and environmental milieu. An understanding of the spectrum of opportunities for natural selection will help us develop novel approaches to improving healthy life span through specific and global interventions that simultaneously focus on multiple levels of biological organization. Indeed, many opportunities exist to apply multilevel selection models employed in evolutionary biology and biodemography to improving human health at all hierarchical levels. Multilevel selection perspective provides a rational theoretical foundation for a synthesis of medicine and evolution that could lead to discovering effective predictive, preventive, palliative, potentially curative, and individualized approaches in medicine and in global health programs.
14
Warnecke T, Becker EA, Facciotti MT, Nislow C, Lehner B. Conserved substitution patterns around nucleosome footprints in eukaryotes and Archaea derive from frequent nucleosome repositioning through evolution. PLoS Comput Biol 2013; 9:e1003373. [PMID: 24278010 PMCID: PMC3836710 DOI: 10.1371/journal.pcbi.1003373] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 07/17/2013] [Accepted: 10/13/2013] [Indexed: 11/21/2022]
Abstract
Nucleosomes, the basic repeat units of eukaryotic chromatin, have been suggested to influence the evolution of eukaryotic genomes, both by altering the propensity of DNA to mutate and by selection acting to maintain or exclude nucleosomes in particular locations. Contrary to the popular idea that nucleosomes are unique to eukaryotes, histone proteins have also been discovered in some archaeal genomes. Archaeal nucleosomes, however, are quite unlike their eukaryotic counterparts in many respects, including their assembly into tetramers (rather than octamers) from histone proteins that lack N- and C-terminal tails. Here, we show that despite these fundamental differences the association between nucleosome footprints and sequence evolution is strikingly conserved between humans and the model archaeon Haloferax volcanii. In light of this finding we examine whether selection or mutation can explain concordant substitution patterns in the two kingdoms. Unexpectedly, we find that neither the mutation nor the selection model is sufficient to explain the observed association between nucleosomes and sequence divergence. Instead, we demonstrate that nucleosome-associated substitution patterns are more consistent with a third model where sequence divergence results in frequent repositioning of nucleosomes during evolution. Indeed, we show that nucleosome repositioning is both necessary and largely sufficient to explain the association between current nucleosome positions and biased substitution patterns. This finding highlights the importance of considering the direction of causality between genetic and epigenetic change.
Genome sequences as well as epigenetic states, such as DNA methylation or nucleosome binding patterns, change during evolution. But what is the causal relationship between the two? We already know that nucleotide variation within and between species is distributed unevenly around nucleosome footprints, but does this mean that sequence evolution follows a biased course because the presence of nucleosomes affects mutation and DNA repair dynamics? Or is it, in fact, the other way around, i.e. changes happen at the DNA level and prompt shifts in nucleosome positioning? To investigate the direction of causality in genetic versus epigenetic evolution, we analyze substitution patterns in eukaryotes as well as the archaeon Haloferax volcanii in the context of genome-wide nucleosome binding maps. We demonstrate that the relationship between nucleosome positions and between-species divergence patterns, strikingly similar in eukaryotes and archaea, can be explained in large part by nucleosomes shifting positions in response to substitution, although both mutation and selection biases might still exist. Our results illustrate that it is important to consider the direction of causality between epigenetic and genetic change when analyzing patterns of sequence divergence and using sequence conservation to infer selection on epigenetic states.
Affiliation(s)
- Tobias Warnecke
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Erin A. Becker
- Microbiology Graduate Group, University of California, Davis, Davis, California, United States of America
- Marc T. Facciotti
- Microbiology Graduate Group, University of California, Davis, Davis, California, United States of America
- Department of Biomedical Engineering, University of California, Davis, Davis, California, United States of America
- Genome Center, University of California, Davis, Davis, California, United States of America
- Corey Nislow
- Department of Pharmaceutical Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Ben Lehner
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- EMBL-CRG Systems Biology Unit, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain
15
Babbitt GA, Schulze KV. Codons support the maintenance of intrinsic DNA polymer flexibility over evolutionary timescales. Genome Biol Evol 2012; 4:954-65. [PMID: 22936074 PMCID: PMC3468960 DOI: 10.1093/gbe/evs073] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Accepted: 08/21/2012] [Indexed: 01/02/2023] Open
Abstract
Despite our long familiarity with how the genetic code specifies the amino acid sequence, we still know little about why it is organized in the way that it is. Contrary to the view that the organization of the genetic code is a "frozen accident" of evolution, recent studies have demonstrated that it is highly nonrandom, with implications for both codon assignment and usage. We hypothesize that this inherent nonrandomness may facilitate the coexistence of both sequence and structural information in DNA. Here, we take advantage of a simple metric of intrinsic DNA flexibility to analyze mutational effects on the four phosphate linkages present in any given codon. Application of a simple evolutionary neutral model of substitution to random sequences, translated with alternative genetic codes, reveals that the standard code is highly optimized to favor synonymous substitutions that maximize DNA polymer flexibility, potentially counteracting neutral evolutionary drift toward stiffer DNA caused by spontaneous deamination. Comparison to existing mutational patterns in yeast also demonstrates evidence of strong selective constraint on DNA flexibility, especially at so-called "silent" sites. We also report a fundamental relationship between DNA flexibility, codon usage bias, and several important evolutionary descriptors of comparative genomics (e.g., base composition, transition/transversion ratio, and nonsynonymous vs. synonymous substitution rate). Recent advances in structural genomics have emphasized the role of the DNA polymer's flexibility in both gene function and whole genome folding, thereby implicating possible reasons for codons to facilitate the multiplexing of both genetic and structural information within the same molecular context.
Affiliation(s)
- G A Babbitt
- TH Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester, NY, USA.
16
Denver DR, Wilhelm LJ, Howe DK, Gafner K, Dolan PC, Baer CF. Variation in base-substitution mutation in experimental and natural lineages of Caenorhabditis nematodes. Genome Biol Evol 2012; 4:513-22. [PMID: 22436997 PMCID: PMC3342874 DOI: 10.1093/gbe/evs028] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Indexed: 12/14/2022] Open
Abstract
Variation among lineages in the mutation process has the potential to impact diverse biological processes ranging from susceptibilities to genetic disease to the mode and tempo of molecular evolution. The combination of high-throughput DNA sequencing (HTS) with mutation-accumulation (MA) experiments has provided a powerful approach to genome-wide mutation analysis, though insights into mutational variation have been limited by the vast evolutionary distances among the few species analyzed. We performed a HTS analysis of MA lines derived from four Caenorhabditis nematode natural genotypes: C. elegans N2 and PB306 and C. briggsae HK104 and PB800. Total mutation rates did not differ among the four sets of MA lines. A mutational bias toward G:C→A:T transitions and G:C→T:A transversions was observed in all four sets of MA lines. Chromosome-specific rates were mostly stable, though there was some evidence for a slightly elevated X chromosome mutation rate in PB306. Rates were homogeneous among functional coding sequence types and across autosomal cores, arms, and tips. Mutation spectra were similar among the four MA line sets but differed significantly when compared with patterns of natural base-substitution polymorphism for 13/14 comparisons performed. Our findings show that base-substitution mutation processes in these closely related animal lineages are mostly stable but differ from natural polymorphism patterns in these two species.
Affiliation(s)
- Dee R Denver
- Department of Zoology and Center for Genome Research and Biocomputing, Oregon State University, OR, USA.
17
Parker SCJ, Tullius TD. DNA shape, genetic codes, and evolution. Curr Opin Struct Biol 2011; 21:342-7. [PMID: 21439813 PMCID: PMC3112471 DOI: 10.1016/j.sbi.2011.03.002] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Received: 01/25/2011] [Revised: 03/03/2011] [Accepted: 03/04/2011] [Indexed: 01/04/2023]
Abstract
Although the three-letter genetic code that maps nucleotide sequence to protein sequence is well known, there must exist other codes that are embedded in the human genome. Recent work points to sequence-dependent variation in DNA shape as one mechanism by which regulatory and other information could be encoded in DNA. Recent advances include the discovery of shape-dependent recognition of DNA that depends on minor groove width and electrostatics, the existence of overlapping codes in protein-coding regions of the genome, and evolutionary selection for compensatory changes in nucleotide composition that facilitate nucleosome occupancy. It is becoming clear that DNA shape is important to biological function, and therefore will be subject to evolutionary constraint.
Affiliation(s)
- Stephen C. J. Parker
- Genome Informatics Section, Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Thomas D. Tullius
- Department of Chemistry and Program in Bioinformatics, Boston University, Boston, MA 02215, USA
18
Swamy KBS, Chu WY, Wang CY, Tsai HK, Wang D. Evidence of association between nucleosome occupancy and the evolution of transcription factor binding sites in yeast. BMC Evol Biol 2011; 11:150. [PMID: 21627806 PMCID: PMC3124427 DOI: 10.1186/1471-2148-11-150] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 03/30/2011] [Accepted: 05/31/2011] [Indexed: 11/14/2022] Open
Abstract
Background: Divergence of transcription factor binding sites is considered to be an important source of regulatory evolution. The associations between transcription factor binding sites and phenotypic diversity have been investigated in many model organisms. However, the understanding of other factors that contribute to it is still limited. Recent studies have elucidated the effect of chromatin structure on molecular evolution of genomic DNA. Though the profound impact of nucleosome positions on gene regulation has been reported, their influence on transcriptional evolution is still less explored. With the availability of genome-wide nucleosome map in yeast species, it is thus desirable to investigate their impact on transcription factor binding site evolution. Here, we present a comprehensive analysis of the role of nucleosome positioning in the evolution of transcription factor binding sites.
Results: We compared the transcription factor binding site frequency in nucleosome occupied regions and nucleosome depleted regions in promoters of old (orthologs among Saccharomycetaceae) and young (Saccharomyces specific) genes; and in duplicate gene pairs. We demonstrated that nucleosome occupied regions accommodate greater binding site variations than nucleosome depleted regions in young genes and in duplicate genes. This finding was confirmed by measuring the difference in evolutionary rates of binding sites in sensu stricto yeasts at nucleosome occupied regions and nucleosome depleted regions. The binding sites at nucleosome occupied regions exhibited a consistently higher evolution rate than those at nucleosome depleted regions, corroborating the difference in the selection constraints at the two regions. Finally, through site-directed mutagenesis experiment, we found that binding site gain or loss events at nucleosome depleted regions may cause more expression differences than those in nucleosome occupied regions.
Conclusions: Our study indicates the existence of different selection constraint on binding sites at nucleosome occupied regions than at the nucleosome depleted regions. We found that the binding sites have a different rate of evolution at nucleosome occupied and depleted regions. Finally, using transcription factor binding site-directed mutagenesis experiment, we confirmed the difference in the impact of binding site changes on expression at these regions. Thus, our work demonstrates the importance of composite analysis of chromatin and transcriptional evolution.
Affiliation(s)
- Krishna B S Swamy
- Institute of Information Science, Academia Sinica, Taipei, 115, Taiwan