1
Joseph J, Prentout D, Laverré A, Tricou T, Duret L. High prevalence of PRDM9-independent recombination hotspots in placental mammals. Proc Natl Acad Sci U S A 2024; 121:e2401973121. [PMID: 38809707 PMCID: PMC11161765 DOI: 10.1073/pnas.2401973121] [Received: 02/08/2024] [Accepted: 04/26/2024] [Indexed: 05/31/2024]
Abstract
In many mammals, recombination events are concentrated in hotspots directed by a sequence-specific DNA-binding protein named PRDM9. Intriguingly, PRDM9 has been lost several times in vertebrates, and notably among mammals, it has been pseudogenized in the ancestor of canids. In the absence of PRDM9, recombination hotspots tend to occur in promoter-like features such as CpG islands. It has thus been proposed that one role of PRDM9 could be to direct recombination away from PRDM9-independent hotspots. However, the ability of PRDM9 to direct recombination hotspots has been assessed in only a handful of species, and a clear picture of how much recombination occurs outside of PRDM9-directed hotspots in mammals is still lacking. In this study, we derived an estimator of past recombination activity based on signatures of GC-biased gene conversion in substitution patterns. We quantified recombination activity in PRDM9-independent hotspots in 52 species of boreoeutherian mammals. We observe a wide range of recombination rates at these loci: several species (such as mice, humans, some felids, or cetaceans) show a deficit of recombination, while a majority of mammals display a clear peak of recombination. Our results demonstrate that PRDM9-directed and PRDM9-independent hotspots can coexist in mammals and that their coexistence appears to be the rule rather than the exception. Additionally, we show that the location of PRDM9-independent hotspots is relatively more stable than that of PRDM9-directed hotspots, but that PRDM9-independent hotspots nevertheless evolve slowly in concert with DNA hypomethylation.
Affiliation(s)
- Julien Joseph
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
- Djivan Prentout
  - Department of Biological Sciences, Columbia University, New York, NY 10027
- Alexandre Laverré
  - Department of Ecology and Evolution, University of Lausanne, Lausanne CH-1015, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland
- Théo Tricou
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
- Laurent Duret
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
2
Rodrigues MF, Kern AD, Ralph PL. Shared evolutionary processes shape landscapes of genomic variation in the great apes. Genetics 2024; 226:iyae006. [PMID: 38242701 PMCID: PMC10990428 DOI: 10.1093/genetics/iyae006] [Received: 10/26/2023] [Revised: 10/26/2023] [Accepted: 01/03/2024] [Indexed: 01/21/2024]
Abstract
For at least the past 5 decades, population genetics, as a field, has worked to describe the precise balance of forces that shape patterns of variation in genomes. The problem is challenging because modeling the interactions between evolutionary processes is difficult, and different processes can impact genetic variation in similar ways. In this paper, we describe how diversity and divergence between closely related species change with time, using correlations between landscapes of genetic variation as a tool to understand the interplay between evolutionary processes. We find strong correlations between landscapes of diversity and divergence in a well-sampled set of great ape genomes, and explore how various processes such as incomplete lineage sorting, mutation rate variation, GC-biased gene conversion and selection contribute to these correlations. Through highly realistic, chromosome-scale, forward-in-time simulations, we show that the landscapes of diversity and divergence in the great apes are too well correlated to be explained via strictly neutral processes alone. Our best fitting simulation includes both deleterious and beneficial mutations in functional portions of the genome, in which 9% of fixations within those regions is driven by positive selection. This study provides a framework for modeling genetic variation in closely related species, an approach which can shed light on the complex balance of forces that have shaped genetic variation.
Affiliation(s)
- Murillo F Rodrigues
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
  - Department of Biology, University of Oregon, Eugene, OR 97403, USA
- Andrew D Kern
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
  - Department of Biology, University of Oregon, Eugene, OR 97403, USA
- Peter L Ralph
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
  - Department of Biology, University of Oregon, Eugene, OR 97403, USA
  - Department of Mathematics, University of Oregon, Eugene, OR 97403, USA
3
Galtier N. Half a Century of Controversy: The Neutralist/Selectionist Debate in Molecular Evolution. Genome Biol Evol 2024; 16:evae003. [PMID: 38311843 PMCID: PMC10839204 DOI: 10.1093/gbe/evae003] [Accepted: 01/01/2024] [Indexed: 02/06/2024]
Abstract
The neutral and nearly neutral theories, introduced more than 50 yr ago, have raised and still raise passionate discussion regarding the forces governing molecular evolution and their relative importance. The debate, initially focused on the amount of within-species polymorphism and constancy of the substitution rate, has spread, matured, and now underlies a wide range of topics and questions. The neutralist/selectionist controversy has structured the field and influences the way molecular evolutionary scientists conceive their research.
Affiliation(s)
- Nicolas Galtier
  - ISEM, CNRS, IRD, Université de Montpellier, Montpellier, France
4
Cousins T, Tabin D, Patterson N, Reich D, Durvasula A. Accurate inference of population history in the presence of background selection. bioRxiv 2024:2024.01.18.576291. [PMID: 38313273 PMCID: PMC10838404 DOI: 10.1101/2024.01.18.576291] [Indexed: 02/06/2024]
Abstract
All published methods for learning about demographic history make the simplifying assumption that the genome evolves neutrally, and do not seek to account for the effects of natural selection on patterns of variation. This is a major concern, as ample work has demonstrated the pervasive effects of natural selection and in particular background selection (BGS) on patterns of genetic variation in diverse species. Simulations and theoretical work have shown that methods to infer changes in effective population size over time (Ne(t)) become increasingly inaccurate as the strength of linked selection increases. Here, we introduce an extension to the Pairwise Sequentially Markovian Coalescent (PSMC) algorithm, PSMC+, which explicitly co-models demographic history and natural selection. We benchmark our method using forward-in-time simulations with BGS and find that our approach improves the accuracy of effective population size inference. Leveraging a high resolution map of BGS in humans, we infer considerable changes in the magnitude of inferred effective population size relative to previous reports. Finally, we separately infer Ne(t) on the X chromosome and on the autosomes in diverse great apes without making a correction for selection, and find that the inferred ratio fluctuates substantially through time in a way that differs across species, showing that uncorrected selection may be an important driver of signals of genetic difference on the X chromosome and autosomes.
Affiliation(s)
- Trevor Cousins
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Daniel Tabin
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Nick Patterson
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- David Reich
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Boston, MA, USA
- Arun Durvasula
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
5
Thom G, Moreira LR, Batista R, Gehara M, Aleixo A, Smith BT. Genomic Architecture Predicts Tree Topology, Population Structuring, and Demographic History in Amazonian Birds. Genome Biol Evol 2024; 16:evae002. [PMID: 38236173 PMCID: PMC10823491 DOI: 10.1093/gbe/evae002] [Received: 05/09/2023] [Revised: 10/26/2023] [Accepted: 12/12/2023] [Indexed: 01/19/2024]
Abstract
Geographic barriers are frequently invoked to explain genetic structuring across the landscape. However, inferences on the spatial and temporal origins of population variation have been largely limited to evolutionary neutral models, ignoring the potential role of natural selection and intrinsic genomic processes known as genomic architecture in producing heterogeneity in differentiation across the genome. To test how variation in genomic characteristics (e.g. recombination rate) impacts our ability to reconstruct general patterns of differentiation between species that cooccur across geographic barriers, we sequenced the whole genomes of multiple bird populations that are distributed across rivers in southeastern Amazonia. We found that phylogenetic relationships within species and demographic parameters varied across the genome in predictable ways. Genetic diversity was positively associated with recombination rate and negatively associated with species tree support. Gene flow was less pervasive in genomic regions of low recombination, making these windows more likely to retain patterns of population structuring that matched the species tree. We further found that approximately a third of the genome showed evidence of selective sweeps and linked selection, skewing genome-wide estimates of effective population sizes and gene flow between populations toward lower values. In sum, we showed that the effects of intrinsic genomic characteristics and selection can be disentangled from neutral processes to elucidate spatial patterns of population differentiation.
Affiliation(s)
- Gregory Thom
  - Department of Ornithology, American Museum of Natural History, New York, NY, USA
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Lucas Rocha Moreira
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
  - Department of Vertebrate Genomics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Romina Batista
  - Programa de Coleções Biológicas, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil
  - School of Science, Engineering and Environment, University of Salford, Manchester, UK
- Marcelo Gehara
  - Department of Earth and Environmental Sciences, Rutgers University, Newark, NJ, USA
- Alexandre Aleixo
  - Finnish Museum of Natural History, University of Helsinki, Helsinki, Finland
  - Department of Environmental Genomics, Instituto Tecnológico Vale, Belém, Brazil
- Brian Tilston Smith
  - Department of Ornithology, American Museum of Natural History, New York, NY, USA
6
Rodrigues MF, Kern AD, Ralph PL. Shared evolutionary processes shape landscapes of genomic variation in the great apes. bioRxiv 2023:2023.02.07.527547. [PMID: 36798346 PMCID: PMC9934647 DOI: 10.1101/2023.02.07.527547] [Indexed: 02/10/2023]
Abstract
For at least the past five decades population genetics, as a field, has worked to describe the precise balance of forces that shape patterns of variation in genomes. The problem is challenging because modelling the interactions between evolutionary processes is difficult, and different processes can impact genetic variation in similar ways. In this paper, we describe how diversity and divergence between closely related species change with time, using correlations between landscapes of genetic variation as a tool to understand the interplay between evolutionary processes. We find strong correlations between landscapes of diversity and divergence in a well sampled set of great ape genomes, and explore how various processes such as incomplete lineage sorting, mutation rate variation, GC-biased gene conversion and selection contribute to these correlations. Through highly realistic, chromosome-scale, forward-in-time simulations we show that the landscapes of diversity and divergence in the great apes are too well correlated to be explained via strictly neutral processes alone. Our best fitting simulation includes both deleterious and beneficial mutations in functional portions of the genome, in which 9% of fixations within those regions is driven by positive selection. This study provides a framework for modelling genetic variation in closely related species, an approach which can shed light on the complex balance of forces that have shaped genetic variation.
Affiliation(s)
- Murillo F. Rodrigues
  - Institute of Ecology and Evolution, University of Oregon
  - Department of Biology, University of Oregon
- Andrew D. Kern
  - Institute of Ecology and Evolution, University of Oregon
  - Department of Biology, University of Oregon
- Peter L. Ralph
  - Institute of Ecology and Evolution, University of Oregon
  - Department of Biology, University of Oregon
  - Department of Mathematics, University of Oregon
7
Morton BR. Context and Mutation in Gymnosperm Chloroplast DNA. Genes (Basel) 2023; 14:1492. [PMID: 37510396 PMCID: PMC10378972 DOI: 10.3390/genes14071492] [Received: 06/25/2023] [Revised: 07/15/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023]
Abstract
Mutations and subsequent repair processes are known to be strongly context-dependent in the flowering-plant chloroplast genome. At least six flanking bases, three on each side, can have an influence on the relative rates of different types of mutation at any given site. In this analysis, context and substitution are examined at noncoding and fourfold degenerate coding sites in gymnosperm DNA. The sequences are analyzed in sets of three, allowing the inference of the substitution direction and the generation of context-dependent rate matrices. The size of the dataset limits the analysis to the tetranucleotide context of the sites, but the evidence shows that there are significant contextual effects, with patterns that are similar to those observed in angiosperms. These effects most likely represent an influence on the underlying mutation/repair dynamics. The data extend the plastome lineages that feature very complex patterns of mutation, which can have significant effects on the evolutionary dynamics of the chloroplast genome.
Affiliation(s)
- Brian R Morton
  - Department of Biology, Barnard College, Columbia University, 3009 Broadway, New York, NY 10027, USA
8
Caballero M, Boos D, Koren A. Cell-type specificity of the human mutation landscape with respect to DNA replication dynamics. Cell Genomics 2023; 3:100315. [PMID: 37388911 PMCID: PMC10300547 DOI: 10.1016/j.xgen.2023.100315] [Received: 02/03/2023] [Revised: 03/24/2023] [Accepted: 04/03/2023] [Indexed: 07/01/2023]
Abstract
The patterns of genomic mutations are associated with various genomic features, most notably late replication timing, yet it remains contested which mutation types and signatures relate to DNA replication dynamics and to what extent. Here, we perform high-resolution comparisons of mutational landscapes between lymphoblastoid cell lines, chronic lymphocytic leukemia tumors, and three colon adenocarcinoma cell lines, including two with mismatch repair deficiency. Using cell-type-matched replication timing profiles, we demonstrate that mutation rates exhibit heterogeneous replication timing associations among cell types. This cell-type heterogeneity extends to the underlying mutational pathways, as mutational signatures show inconsistent replication timing bias between cell types. Moreover, replicative strand asymmetries exhibit similar cell-type specificity, albeit with different relationships to replication timing than mutation rates. Overall, we reveal an underappreciated complexity and cell-type specificity of mutational pathways and their relationship to replication timing.
Affiliation(s)
- Madison Caballero
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Dominik Boos
  - Vertebrate DNA Replication Lab, Center of Medical Biotechnology, University of Duisburg-Essen, 45117 Essen, Germany
- Amnon Koren
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
9
Charlesworth B, Jensen JD. Population Genetic Considerations Regarding Evidence for Biased Mutation Rates in Arabidopsis thaliana. Mol Biol Evol 2023; 40:6961073. [PMID: 36572441 PMCID: PMC9907473 DOI: 10.1093/molbev/msac275] [Indexed: 12/28/2022]
Abstract
It has recently been proposed that lower mutation rates in gene bodies compared with upstream and downstream sequences in Arabidopsis thaliana are the result of an "adaptive" modification of the rate of beneficial and deleterious mutations in these functional regions. This claim was based both on analyses of mutation accumulation lines and on population genomics data. Here, we show that several questionable assumptions were used in the population genomics analyses. In particular, we demonstrate that the difference between gene bodies and less selectively constrained sequences in the magnitude of Tajima's D can in principle be explained by the presence of sites subject to purifying selection and does not require lower mutation rates in regions experiencing selective constraints.
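For readers unfamiliar with the statistic named in this abstract, the following is a minimal sketch of Tajima's D computed from a set of aligned haplotypes. It uses the standard textbook formula and is not the authors' analysis pipeline; the function name `tajimas_d` and the toy 0/1 haplotype strings are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings with no missing data.

    Returns None when the statistic is undefined (fewer than 4 sequences
    or no segregating sites).
    """
    n = len(haplotypes)
    # Segregating sites: alignment columns carrying more than one allele.
    s = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if n < 4 or s == 0:
        return None

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from the standard variance derivation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two diversity estimators, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): singletons only vs. intermediate-frequency variants.
rare = ["1000", "0100", "0010", "0001", "0000"]
common = ["1100", "1100", "0011", "0011", "0000"]
print(tajimas_d(rare), tajimas_d(common))
```

An excess of rare variants, as expected at sites under purifying selection, pushes D below zero, which is the direction of the effect discussed in the abstract; an excess of intermediate-frequency variants pushes it above zero.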
Affiliation(s)
- Jeffrey D Jensen
  - School of Life Sciences, Arizona State University, Tempe, AZ 85281
10
Hine E, Runcie DE, Allen SL, Wang Y, Chenoweth SF, Blows MW, McGuigan K. Maintenance of quantitative genetic variance in complex, multi-trait phenotypes: The contribution of rare, large effect variants in two Drosophila species. Genetics 2022; 222:6663993. [PMID: 35961029 PMCID: PMC9526065 DOI: 10.1093/genetics/iyac122] [Received: 06/16/2022] [Accepted: 08/02/2022] [Indexed: 11/29/2022]
Abstract
The interaction of evolutionary processes to determine quantitative genetic variation has implications for contemporary and future phenotypic evolution, as well as for our ability to detect causal genetic variants. While theoretical studies have provided robust predictions to discriminate among competing models, empirical assessment of these has been limited. In particular, theory highlights the importance of pleiotropy in resolving observations of selection and mutation, but empirical investigations have typically been limited to few traits. Here, we applied high-dimensional Bayesian Sparse Factor Genetic modeling to gene expression datasets in 2 species, Drosophila melanogaster and Drosophila serrata, to explore the distributions of genetic variance across high-dimensional phenotypic space. Surprisingly, most of the heritable trait covariation was due to few lines (genotypes) with extreme [>3 interquartile ranges (IQR) from the median] values. Intriguingly, while genotypes extreme for a multivariate factor also tended to have a higher proportion of individual traits that were extreme, we also observed genotypes that were extreme for multivariate factors but not for any individual trait. We observed other consistent differences between heritable multivariate factors with outlier lines vs those factors without extreme values, including differences in gene functions. We use these observations to identify further data required to advance our understanding of the evolutionary dynamics and nature of standing genetic variation for quantitative traits.
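The outlier criterion quoted in this abstract (values more than 3 interquartile ranges from the median) can be sketched as follows. This is only an illustration of the stated rule, not the authors' code; the function name `extreme_lines`, the parameter `k`, and the example scores are hypothetical, and the quartiles come from Python's default `statistics.quantiles` method, which may differ slightly from the convention used in the paper.

```python
import statistics


def extreme_lines(values, k=3.0):
    """Return the indices of values lying more than k IQRs from the median."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    median = statistics.median(values)
    return [i for i, v in enumerate(values) if abs(v - median) > k * iqr]


# Hypothetical factor scores for ten lines (genotypes); the last one is extreme.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(extreme_lines(scores))
```

Because both the median and the IQR are robust to a few extreme values, a single aberrant line does not inflate the threshold that is used to detect it.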
Affiliation(s)
- Emma Hine
  - School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
- Daniel E Runcie
  - Department of Plant Sciences, University of California Davis, Davis, CA 95616, USA
- Scott L Allen
  - School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
- Yiguan Wang
  - School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3FL, UK
- Stephen F Chenoweth
  - School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
- Mark W Blows
  - School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
- Katrina McGuigan
  - School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
11
Morton BR. Substitution rate heterogeneity across hexanucleotide contexts in noncoding chloroplast DNA. G3 Genes|Genomes|Genetics 2022; 12:6608088. [PMID: 35699494 PMCID: PMC9339276 DOI: 10.1093/g3journal/jkac150] [Received: 03/27/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022]
Abstract
Substitutions between closely related noncoding chloroplast DNA sequences are studied with respect to the composition of the 3 bases on each side of the substitution, that is, the hexanucleotide context. There is about 100-fold variation in rate among the contexts, particularly on substitutions of A and T. Rate heterogeneity of transitions differs from that of transversions, resulting in a more than 200-fold variation in the transition:transversion bias. The data are consistent with a CpG effect, and it is shown that both the A + T content and the arrangement of purines/pyrimidines along the same DNA strand are correlated with rate variation. Expected equilibrium A + T content ranges from 36.4% to 82.8% across contexts, while G–C skew ranges from −77.4 to 72.2 and A–T skew ranges from −63.9 to 68.2. The predicted equilibria are associated with specific features of the content of the hexanucleotide context, and also show close agreement with the observed context-dependent compositions. Finally, by controlling for the content of nucleotides closer to the substitution site, it is shown that both the third and fourth nucleotide removed on each side of the substitution directly influence substitution dynamics at that site. Overall, the results demonstrate that noncoding sites in different contexts are evolving along very different evolutionary trajectories and that substitution dynamics are far more complex than typically assumed. This has important implications for a number of types of sequence analysis, particularly analyses of natural selection, and the context-dependent substitution matrices developed here can be applied in future analyses.
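To make the quantities named in this abstract concrete, the sketch below extracts a hexanucleotide context (three bases on each side of a site) and computes simple composition statistics on a sequence. It assumes the common definition of skew, 100 × (X − Y) / (X + Y), which the abstract does not spell out, and it describes observed composition only; the equilibrium values reported in the paper are derived from substitution matrices, which this sketch does not attempt. All function names and the example sequence are hypothetical.

```python
def hexanucleotide_context(seq, i):
    """Three bases on each side of 0-based position i, or None near the ends."""
    if i < 3 or i + 3 >= len(seq):
        return None
    return seq[i - 3:i] + seq[i + 1:i + 4]


def percent_at(seq):
    """A + T content of seq as a percentage."""
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def skew(seq, x, y):
    """Percent skew between bases x and y: 100 * (x - y) / (x + y).

    Returns None when neither base occurs (assumed convention).
    """
    nx, ny = seq.count(x), seq.count(y)
    if nx + ny == 0:
        return None
    return 100.0 * (nx - ny) / (nx + ny)


# Hypothetical example: the context of the central A in a 9-base sequence.
context = hexanucleotide_context("ACGTACGTA", 4)
print(context, percent_at(context), skew(context, "G", "C"), skew(context, "A", "T"))
```

Under this definition a skew of 0 means the two bases are equally frequent, and values of +100 or −100 mean only one of the pair is present, which matches the percent-like range of the skew values quoted in the abstract.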
Affiliation(s)
- Brian R Morton
  - Department of Biology, Barnard College, Columbia University, New York, NY 10027, USA
12
Ho AT, Hurst LD. Stop codon usage as a window into genome evolution: mutation, selection, biased gene conversion and the TAG paradox. Genome Biol Evol 2022; 14:6648529. [PMID: 35867377 PMCID: PMC9348620 DOI: 10.1093/gbe/evac115] [Accepted: 07/17/2022] [Indexed: 11/16/2022]
Abstract
Protein coding genes terminate with one of three stop codons (TAA, TGA, or TAG) that, like synonymous codons, are not employed equally. With TGA and TAG having identical nucleotide content, analysis of their differential usage provides an unusual window into the forces operating on what are ostensibly functionally identical residues. Across genomes and between isochores within the human genome, TGA usage increases with G + C content but, with a common G + C → A + T mutation bias, this cannot be explained by mutation bias-drift equilibrium. Increased usage of TGA in G + C-rich genomes or genomic regions is also unlikely to reflect selection for the optimal stop codon, as TAA appears to be universally optimal, probably because it has the lowest read-through rate. Despite TAA being favored by selection and mutation bias, as with codon usage bias G + C pressure is the prime determinant of between-species TGA usage trends. In species with strong G + C-biased gene conversion (gBGC), such as mammals and birds, the high usage and conservation of TGA is best explained by an A + T → G + C repair bias. How to explain TGA enrichment in other G + C-rich genomes is less clear. Enigmatically, across bacterial and archaeal species and between human isochores TAG usage is mostly unresponsive to G + C pressure. This unresponsiveness we dub the TAG paradox as currently no mutational, selective, or gBGC model provides a well-supported explanation. That TAG does increase with G + C usage across eukaryotes makes the usage elsewhere yet more enigmatic. We suggest resolution of the TAG paradox may provide insights into either an unknown but common selective preference (probably at the DNA/RNA level) or an unrecognized complexity to the action of gBGC.
Affiliation(s)
- Alexander T Ho
  - Milner Centre for Evolution, University of Bath, Bath, UK
13
Johri P, Eyre-Walker A, Gutenkunst RN, Lohmueller KE, Jensen JD. On the prospect of achieving accurate joint estimation of selection with population history. Genome Biol Evol 2022; 14:6604401. [PMID: 35675379 PMCID: PMC9254643 DOI: 10.1093/gbe/evac088] [Accepted: 06/02/2022] [Indexed: 11/15/2022]
Abstract
As both natural selection and population history can affect genome-wide patterns of variation, disentangling the contributions of each has remained as a major challenge in population genetics. We here discuss historical and recent progress towards this goal—highlighting theoretical and computational challenges that remain to be addressed, as well as inherent difficulties in dealing with model complexity and model violations—and offer thoughts on potentially fruitful next steps.
Affiliation(s)
- Parul Johri
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Ryan N Gutenkunst
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jeffrey D Jensen
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
14
Johri P, Aquadro CF, Beaumont M, Charlesworth B, Excoffier L, Eyre-Walker A, Keightley PD, Lynch M, McVean G, Payseur BA, Pfeifer SP, Stephan W, Jensen JD. Recommendations for improving statistical inference in population genomics. PLoS Biol 2022; 20:e3001669. [PMID: 35639797 PMCID: PMC9154105 DOI: 10.1371/journal.pbio.3001669] [Indexed: 01/29/2023]
Abstract
The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.
Affiliation(s)
- Parul Johri
  - School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Charles F. Aquadro
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Mark Beaumont
  - School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Laurent Excoffier
  - Institute of Ecology and Evolution, University of Berne, Berne, Switzerland
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Peter D. Keightley
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Michael Lynch
  - School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Gil McVean
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Bret A. Payseur
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Susanne P. Pfeifer
  - School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jeffrey D. Jensen
  - School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
15
|
Ho AT, Hurst LD. Unusual mammalian usage of TGA stop codons reveals that sequence conservation need not imply purifying selection. PLoS Biol 2022; 20:e3001588. [PMID: 35550630 PMCID: PMC9129041 DOI: 10.1371/journal.pbio.3001588] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/17/2022] [Revised: 05/24/2022] [Accepted: 04/20/2022] [Indexed: 11/18/2022] Open
Abstract
The assumption that conservation of sequence implies the action of purifying selection is central to diverse methodologies to infer functional importance. GC-biased gene conversion (gBGC), a meiotic mismatch repair bias strongly favouring GC over AT, can in principle mimic the action of selection, this being thought to be especially important in mammals. As mutation is GC→AT biased, to demonstrate that gBGC does indeed cause false signals requires evidence that an AT-rich residue is selectively optimal compared to its more GC-rich allele, while showing also that the GC-rich alternative is conserved. We propose that mammalian stop codon evolution provides a robust test case. Although in most taxa TAA is the optimal stop codon, TGA is both abundant and conserved in mammalian genomes. We show that this mammalian exceptionalism is well explained by gBGC mimicking purifying selection and that TAA is the selectively optimal codon. Supportive of gBGC, we observe (i) TGA usage trends are consistent at the focal stop codon and elsewhere (in UTR sequences); (ii) that higher TGA usage and higher TAA→TGA substitution rates are predicted by a high recombination rate; and (iii) across species the difference in TAA <-> TGA substitution rates between GC-rich and GC-poor genes is largest in genomes that possess higher between-gene GC variation. TAA optimality is supported both by enrichment in highly expressed genes and trends associated with effective population size. High TGA usage and high TAA→TGA rates in mammals are thus consistent with gBGC’s predicted ability to “drive” deleterious mutations and supports the hypothesis that sequence conservation need not be indicative of purifying selection. A general trend for GC-rich trinucleotides to reside at frequencies far above their mutational equilibrium in high recombining domains supports the generality of these results.
Affiliation(s)
- Alexander Thomas Ho
- Milner Centre for Evolution, University of Bath, Bath, United Kingdom
16
Ho AT, Hurst LD. Variation in Release Factor Abundance Is Not Needed to Explain Trends in Bacterial Stop Codon Usage. Mol Biol Evol 2022; 39:msab326. [PMID: 34751397 PMCID: PMC8789281 DOI: 10.1093/molbev/msab326] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022] Open
Abstract
In bacteria, stop codons are recognized by one of two class I release factors, with RF1 recognizing TAG, RF2 recognizing TGA, and TAA being recognized by both. Variation across bacteria in the relative abundance of RF1 and RF2 is thus hypothesized to select for different TGA/TAG usage. This has been supported by correlations between TAG:TGA ratios and RF1:RF2 ratios across multiple bacterial species, potentially also explaining why TAG usage is approximately constant despite extensive variation in GC content. It is, however, possible that stop codon trends are determined by other forces and that RF ratios adapt to stop codon usage, rather than vice versa. Here, we determine which direction of the causal arrow is the more parsimonious. Our results support the notion that RF1/RF2 ratios become adapted to stop codon usage as the same trends, notably the anomalous TAG behavior, are seen in contexts where RF1:RF2 ratios cannot be, or are unlikely to be, causative, that is, at 3' untranslated sites never used for translation termination, in intragenomic analyses, and across archaeal species (that possess only one RF1). We conclude that specifics of RF biology are unlikely to fully explain TGA/TAG relative usage. We discuss why the causal relationships for the evolution of synonymous stop codon usage might be different from those affecting synonymous sense codon usage, noting that transitions between TGA and TAG require two point mutations, one of which is likely to be deleterious.
Affiliation(s)
- Alexander T Ho
- Milner Centre for Evolution, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Milner Centre for Evolution, University of Bath, Bath, United Kingdom
17
Agarwal I, Przeworski M. Mutation saturation for fitness effects at human CpG sites. eLife 2021; 10:e71513. [PMID: 34806592 PMCID: PMC8683084 DOI: 10.7554/elife.71513] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/22/2021] [Accepted: 11/21/2021] [Indexed: 01/06/2023] Open
Abstract
Whole exome sequences have now been collected for millions of humans, with the related goals of identifying pathogenic mutations in patients and establishing reference repositories of data from unaffected individuals. As a result, we are approaching an important limit, in which datasets are large enough that, in the absence of natural selection, every highly mutable site will have experienced at least one mutation in the genealogical history of the sample. Here, we focus on CpG sites that are methylated in the germline and experience mutations to T at an elevated rate of ~10⁻⁷ per site per generation; considering synonymous mutations in a sample of 390,000 individuals, ~99% of such CpG sites harbor a C/T polymorphism. Methylated CpG sites provide a natural mutation saturation experiment for fitness effects: as we show, at current sample sizes, not seeing a non-synonymous polymorphism is indicative of strong selection against that mutation. We rely on this idea in order to directly identify a subset of CpG transitions that are likely to be highly deleterious, including ~27% of possible loss-of-function mutations, and up to 20% of possible missense mutations, depending on the type of functional site in which they occur. Unlike methylated CpGs, most mutation types, with rates on the order of 10⁻⁸ or 10⁻⁹, remain very far from saturation. We discuss what these findings imply for interpreting the potential clinical relevance of mutations from their presence or absence in reference databases and for inferences about the fitness effects of new mutations.
Affiliation(s)
- Ipsita Agarwal
- Department of Biological Sciences, Columbia University, New York, United States
- Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, United States
- Department of Systems Biology, Columbia University, New York, United States
18
Lucena-Perez M, Kleinman-Ruiz D, Marmesat E, Saveljev AP, Schmidt K, Godoy JA. Bottleneck-associated changes in the genomic landscape of genetic diversity in wild lynx populations. Evol Appl 2021; 14:2664-2679. [PMID: 34815746 PMCID: PMC8591332 DOI: 10.1111/eva.13302] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/14/2021] [Revised: 08/17/2021] [Accepted: 09/08/2021] [Indexed: 01/06/2023] Open
Abstract
Demographic bottlenecks generally reduce genetic diversity through more intense genetic drift, but their net effect may vary along the genome due to the random nature of genetic drift and to local effects of recombination, mutation, and selection. Here, we analyzed the changes in genetic diversity following a bottleneck by comparing whole-genome diversity patterns in populations with and without severe recent documented declines of Iberian (Lynx pardinus, n = 31) and Eurasian lynx (Lynx lynx, n = 29). As expected, overall genomic diversity correlated negatively with bottleneck intensity and/or duration. Correlations of genetic diversity with divergence, chromosome size, gene or functional site content, GC content, or recombination were observed in nonbottlenecked populations, but were weaker in bottlenecked populations. Also, functional features under intense purifying selection and the X chromosome showed an increase in the observed density of variants, even resulting in higher θW diversity than in nonbottlenecked populations. Increased diversity seems to be related to both a higher mutational input in those regions creating a large collection of low-frequency variants, a few of which increase in frequency during the bottleneck to the point they become detectable with our limited sample, and the reduced efficacy of purifying selection, which affects not only protein structure and function but also the regulation of gene expression. The results of this study alert to the possible reduction of fitness and adaptive potential associated with the genomic erosion in regulatory elements. Further, the detection of a gain of diversity in ultra-conserved elements can be used as a sensitive and easy-to-apply signature of genetic erosion in wild populations.
Affiliation(s)
- Maria Lucena-Perez
- Departamento de Ecología Integrativa, Estación Biológica de Doñana (CSIC), Sevilla, Spain
- Daniel Kleinman-Ruiz
- Departamento de Ecología Integrativa, Estación Biológica de Doñana (CSIC), Sevilla, Spain
- Departamento de Genética, Facultad de Biología, Universidad Complutense, Madrid, Spain
- Elena Marmesat
- Departamento de Ecología Integrativa, Estación Biológica de Doñana (CSIC), Sevilla, Spain
- Alexander P Saveljev
- Department of Animal Ecology, Russian Research Institute of Game Management and Fur Farming, Kirov, Russia
- Krzysztof Schmidt
- Mammal Research Institute, Polish Academy of Sciences, Białowieża, Poland
- José A Godoy
- Departamento de Ecología Integrativa, Estación Biológica de Doñana (CSIC), Sevilla, Spain
19
Seplyarskiy VB, Sunyaev S. The origin of human mutation in light of genomic data. Nat Rev Genet 2021; 22:672-686. [PMID: 34163020 DOI: 10.1038/s41576-021-00376-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Accepted: 05/06/2021] [Indexed: 02/05/2023]
Abstract
Despite years of active research into the role of DNA repair and replication in mutagenesis, surprisingly little is known about the origin of spontaneous human mutation in the germ line. With the advent of high-throughput sequencing, genome-scale data have revealed statistical properties of mutagenesis in humans. These properties include variation of the mutation rate and spectrum along the genome at different scales in relation to epigenomic features and dependency on parental age. Moreover, mutations originated in mothers are less frequent than mutations originated in fathers and have a distinct genomic distribution. Statistical analyses that interpret these patterns in the context of known biochemistry can provide mechanistic models of mutagenesis in humans.
Affiliation(s)
- Vladimir B Seplyarskiy
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Shamil Sunyaev
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
20
Campbell CR, Tiley GP, Poelstra JW, Hunnicutt KE, Larsen PA, Lee HJ, Thorne JL, Dos Reis M, Yoder AD. Pedigree-based and phylogenetic methods support surprising patterns of mutation rate and spectrum in the gray mouse lemur. Heredity (Edinb) 2021; 127:233-244. [PMID: 34272504 PMCID: PMC8322134 DOI: 10.1038/s41437-021-00446-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/20/2020] [Revised: 05/25/2021] [Accepted: 05/26/2021] [Indexed: 02/06/2023] Open
Abstract
Mutations are the raw material on which evolution acts, and knowledge of their frequency and genomic distribution is crucial for understanding how evolution operates at both long and short timescales. At present, the rate and spectrum of de novo mutations have been directly characterized in relatively few lineages. Our study provides the first direct mutation-rate estimate for a strepsirrhine (i.e., the lemurs and lorises), which comprises nearly half of the primate clade. Using high-coverage linked-read sequencing for a focal quartet of gray mouse lemurs (Microcebus murinus), we estimated the mutation rate to be among the highest calculated for a mammal at 1.52 × 10⁻⁸ (95% credible interval: 1.28 × 10⁻⁸ to 1.78 × 10⁻⁸) mutations/site/generation. Further, we found an unexpectedly low count of paternal mutations, and only a modest overrepresentation of mutations at CpG sites. Despite the surprising nature of these results, we found both the rate and spectrum to be robust to the manipulation of a wide range of computational filtering criteria. We also sequenced a technical replicate to estimate a false-negative and false-positive rate for our data and show that any point estimate of a de novo mutation rate should be considered with a large degree of uncertainty. For validation, we conducted an independent analysis of context-dependent substitution types for gray mouse lemur and five additional primate species for which de novo mutation rates have also been estimated. These comparisons revealed general consistency of the mutation spectrum between the pedigree-based and the substitution-rate analyses for all species compared.
Affiliation(s)
- C Ryan Campbell
- Department of Biology, Duke University, Durham, NC, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC, USA
- Kelsie E Hunnicutt
- Department of Biology, Duke University, Durham, NC, USA
- Department of Biological Sciences, University of Denver, Denver, CO, USA
- Peter A Larsen
- Department of Biology, Duke University, Durham, NC, USA
- Department of Veterinary and Biomedical Sciences, University of Minnesota, St. Paul, MN, USA
- Hui-Jie Lee
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Jeffrey L Thorne
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Mario Dos Reis
- School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
- Anne D Yoder
- Department of Biology, Duke University, Durham, NC, USA
21
Boman J, Mugal CF, Backström N. The Effects of GC-Biased Gene Conversion on Patterns of Genetic Diversity among and across Butterfly Genomes. Genome Biol Evol 2021; 13:evab064. [PMID: 33760095 PMCID: PMC8175052 DOI: 10.1093/gbe/evab064] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 03/22/2021] [Indexed: 12/28/2022] Open
Abstract
Recombination reshuffles the alleles of a population through crossover and gene conversion. These mechanisms have considerable consequences on the evolution and maintenance of genetic diversity. Crossover, for example, can increase genetic diversity by breaking the linkage between selected and nearby neutral variants. Bias in favor of G or C alleles during gene conversion may instead promote the fixation of one allele over the other, thus decreasing diversity. Mutation bias from G or C to A and T opposes GC-biased gene conversion (gBGC). Less recognized is that these two processes may, when balanced, promote genetic diversity. Here, we investigate how gBGC and mutation bias shape genetic diversity patterns in wood white butterflies (Leptidea sp.). This constitutes the first in-depth investigation of gBGC in butterflies. Using 60 resequenced genomes from six populations of three species, we find substantial variation in the strength of gBGC across lineages. When modeling the balance of gBGC and mutation bias and comparing analytical results with empirical data, we reject gBGC as the main determinant of genetic diversity in these butterfly species. As alternatives, we consider linked selection and GC content. We find evidence that high values of both reduce diversity. We also show that the joint effects of gBGC and mutation bias can give rise to a diversity pattern which resembles the signature of linked selection. Consequently, gBGC should be considered when interpreting the effects of linked selection on levels of genetic diversity.
Affiliation(s)
- Jesper Boman
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Sweden
- Carina F Mugal
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Sweden
- Niclas Backström
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Sweden
22
Yeager M, Machiela MJ, Kothiyal P, Dean M, Bodelon C, Suman S, Wang M, Mirabello L, Nelson CW, Zhou W, Palmer C, Ballew B, Colli LM, Freedman ND, Dagnall C, Hutchinson A, Vij V, Maruvka Y, Hatch M, Illienko I, Belayev Y, Nakamura N, Chumak V, Bakhanova E, Belyi D, Kryuchkov V, Golovanov I, Gudzenko N, Cahoon EK, Albert P, Drozdovitch V, Little MP, Mabuchi K, Stewart C, Getz G, Bazyka D, Berrington de Gonzalez A, Chanock SJ. Lack of transgenerational effects of ionizing radiation exposure from the Chernobyl accident. Science 2021; 372:725-729. [PMID: 33888597 DOI: 10.1126/science.abg2365] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 12/22/2020] [Accepted: 04/12/2021] [Indexed: 12/15/2022]
Abstract
Effects of radiation exposure from the Chernobyl nuclear accident remain a topic of interest. We investigated germline de novo mutations (DNMs) in children born to parents employed as cleanup workers or exposed to occupational and environmental ionizing radiation after the accident. Whole-genome sequencing of 130 children (born 1987-2002) and their parents did not reveal an increase in the rates, distributions, or types of DNMs relative to the results of previous studies. We find no elevation in total DNMs, regardless of cumulative preconception gonadal paternal [mean = 365 milligrays (mGy), range = 0 to 4080 mGy] or maternal (mean = 19 mGy, range = 0 to 550 mGy) exposure to ionizing radiation. Thus, we conclude that, over this exposure range, evidence is lacking for a substantial effect on germline DNMs in humans, suggesting minimal impact from transgenerational genetic effects.
Affiliation(s)
- Meredith Yeager
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Mitchell J Machiela
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Prachi Kothiyal
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; SymbioSeq LLC, Arlington, VA 20148, USA
- Michael Dean
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Clara Bodelon
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Shalabh Suman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Mingyi Wang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Lisa Mirabello
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Chase W Nelson
- Biodiversity Research Center, Academia Sinica, Taipei, 11529, Taiwan; Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
- Weiyin Zhou
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Cameron Palmer
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Bari Ballew
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Leandro M Colli
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Department of Medical Imaging, Hematology, and Oncology, Ribeirao Preto Medical School, University of Sao Paulo, Ribeirao Preto, SP, 14049-900, Brazil
- Neal D Freedman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Casey Dagnall
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Amy Hutchinson
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD 21701, USA
- Vibha Vij
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Yosi Maruvka
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02114, USA
- Maureen Hatch
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Iryna Illienko
- National Research Centre for Radiation Medicine, 53 Yu. Illienka Street, Kyiv, 04050, Ukraine
- Yuri Belayev
- National Research Centre for Radiation Medicine, 53 Yu. Illienka Street, Kyiv, 04050, Ukraine
- Nori Nakamura
- Department of Molecular Biosciences, Radiation Effects Research Foundation, 5-2 Hijiyama Park, Minami-ku, Hiroshima, 732-0815, Japan
- Vadim Chumak
- National Research Centre for Radiation Medicine, 53 Yu. Illienka Street, Kyiv, 04050, Ukraine
- Elena Bakhanova
- National Research Centre for Radiation Medicine, 53 Yu. Illienka Street, Kyiv, 04050, Ukraine
- David Belyi
- National Research Centre for Radiation Medicine, 53 Yu. Illienka Street, Kyiv, 04050, Ukraine
- Victor Kryuchkov
- Burnasyan Federal Medical and Biophysical Centre, 46 Zhivopisnaya Street, Moscow, 123182, Russia
- Ivan Golovanov
- Burnasyan Federal Medical and Biophysical Centre, 46 Zhivopisnaya Street, Moscow, 123182, Russia
- Natalia Gudzenko
- National Research Centre for Radiation Medicine, 53 Yu. Illienka Street, Kyiv, 04050, Ukraine
- Elizabeth K Cahoon
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Paul Albert
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Vladimir Drozdovitch
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Mark P Little
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Kiyohiko Mabuchi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
- Chip Stewart
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Gad Getz
- Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02115, USA
- Dimitry Bazyka
- National Research Centre for Radiation Medicine, 53 Yu. Illienka Street, Kyiv, 04050, Ukraine
- Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20892, USA
23
de Oliveira JL, Morales AC, Hurst LD, Urrutia AO, Thompson CRL, Wolf JB. Inferring Adaptive Codon Preference to Understand Sources of Selection Shaping Codon Usage Bias. Mol Biol Evol 2021; 38:3247-3266. [PMID: 33871580 PMCID: PMC8321536 DOI: 10.1093/molbev/msab099] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/09/2023] Open
Abstract
Alternative synonymous codons are often used at unequal frequencies. Classically, studies of such codon usage bias (CUB) attempted to separate the impact of neutral from selective forces by assuming that deviations from a predicted neutral equilibrium capture selection. However, GC-biased gene conversion (gBGC) can also cause deviation from a neutral null. Alternatively, selection has been inferred from CUB in highly expressed genes, but the accuracy of this approach has not been extensively tested, and gBGC can interfere with such extrapolations (e.g., if expression and gene conversion rates covary). It is therefore critical to examine deviations from a mutational null in a species with no gBGC. To achieve this goal, we implement such an analysis in the highly AT rich genome of Dictyostelium discoideum, where we find no evidence of gBGC. We infer neutral CUB under mutational equilibrium to quantify "adaptive codon preference," a nontautologous genome wide quantitative measure of the relative selection strength driving CUB. We observe signatures of purifying selection consistent with selection favoring adaptive codon preference. Preferred codons are not GC rich, underscoring the independence from gBGC. Expression-associated "preference" largely matches adaptive codon preference but does not wholly capture the influence of selection shaping patterns across all genes, suggesting selective constraints associated specifically with high expression. We observe patterns consistent with effects on mRNA translation and stability shaping adaptive codon preference. Thus, our approach to quantifying adaptive codon preference provides a framework for inferring the sources of selection that shape CUB across different contexts within the genome.
Affiliation(s)
- Janaina Lima de Oliveira
- Instituto de Biologia, Universidade Federal da Bahia, Salvador, Bahia, 40170-115, Brazil; Milner Centre for Evolution and Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Atahualpa Castillo Morales
- Milner Centre for Evolution and Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Laurence D Hurst
- Milner Centre for Evolution and Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Araxi O Urrutia
- Milner Centre for Evolution and Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK; Instituto de Ecologia, UNAM, Ciudad de Mexico 04510, Mexico
- Christopher R L Thompson
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London, WC1E 6BT, UK
- Jason B Wolf
- Milner Centre for Evolution and Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
24
Abstract
Recombination increases the local GC-content in genomic regions through GC-biased gene conversion (gBGC). The recent discovery of a large genomic region with extreme GC-content in the fat sand rat Psammomys obesus provides a model to study the effects of gBGC on chromosome evolution. Here, we compare the GC-content and GC-to-AT substitution patterns across protein-coding genes of four gerbil species and two murine rodents (mouse and rat). We find that the known high-GC region is present in all the gerbils, and is characterized by high substitution rates for all mutational categories (AT-to-GC, GC-to-AT, and GC-conservative) both at synonymous and nonsynonymous sites. A higher AT-to-GC than GC-to-AT rate is consistent with the high GC-content. Additionally, we find more than 300 genes outside the known region with outlying values of AT-to-GC synonymous substitution rates in gerbils. Of these, over 30% are organized into at least 17 large clusters observable at the megabase-scale. The unusual GC-skewed substitution pattern suggests the evolution of genomic regions with very high recombination rates in the gerbil lineage, which can lead to a runaway increase in GC-content. Our results imply that rapid evolution of GC-content is possible in mammals, with gerbil species providing a powerful model to study the mechanisms of gBGC.
Affiliation(s)
- Rodrigo Pracana
- Department of Zoology, University of Oxford, Oxford, United Kingdom
- John F Mulley
- School of Natural Sciences, Bangor University, Bangor, Gwynedd, United Kingdom
25
Yan Y, Li Z, Li Y, Wu Z, Yang R. Correlated Evolution of Large DNA Fragments in the 3D Genome of Arabidopsis thaliana. Mol Biol Evol 2021; 37:1621-1636. [PMID: 32044988 DOI: 10.1093/molbev/msaa031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
In eukaryotes, the three-dimensional (3D) conformation of the genome is far from random, and this nonrandom chromatin organization is strongly correlated with gene expression and protein function, which are two critical determinants of the selective constraints and evolutionary rates of genes. However, whether genes and other elements that are located close to each other in the 3D genome evolve in a coordinated way has not been investigated in any organism. To address this question, we constructed chromatin interaction networks (CINs) in Arabidopsis thaliana based on high-throughput chromosome conformation capture data and demonstrated that adjacent large DNA fragments in the CIN indeed exhibit more similar levels of polymorphism and evolutionary rates than random fragment pairs. Using simulations that account for the linear distance between fragments, we proved that the 3D chromosomal organization plays a role in the observed correlated evolution. Spatially interacting fragments also exhibit more similar mutation rates and functional constraints in both coding and noncoding regions than the random expectations, indicating that the correlated evolution between 3D neighbors is a result of combined evolutionary forces. A collection of 39 genomic and epigenomic features can explain much of the variance in genetic diversity and evolutionary rates across the genome. Moreover, features that have a greater effect on the evolution of regional sequences tend to show higher similarity between neighboring fragments in the CIN, suggesting a pivotal role of epigenetic modifications and chromatin organization in determining the correlated evolution of large DNA fragments in the 3D genome.
Affiliation(s)
- Yubin Yan
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zhaohong Li
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ye Li
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zefeng Wu
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ruolin Yang
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
26
The Impact of DNA Methylation Dynamics on the Mutation Rate During Human Germline Development. G3-GENES GENOMES GENETICS 2020; 10:3337-3346. [PMID: 32727923 PMCID: PMC7466984 DOI: 10.1534/g3.120.401511] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/13/2022]
Abstract
DNA methylation is a dynamic epigenetic modification found in most eukaryotic genomes. It is known to lead to a high CpG to TpG mutation rate. However, the relationship between the methylation dynamics in germline development and the germline mutation rate remains unexplored. In this study, we used whole genome bisulfite sequencing (WGBS) data of cells at 13 stages of human germline development and rare variants from the 1000 Genomes Project as proxies for germline mutations to investigate the correlation between dynamic methylation levels and germline mutation rates at different scales. At the single-site level, we found a significant correlation between methylation and the germline point mutation rate at CpG sites during germline developmental stages. Then we explored the mutability of methylation dynamics in all stages. Our results also showed a broad correlation between the regional methylation level and the rate of C > T mutation at CpG sites in all genomic regions, especially in intronic regions; a similar link was also seen at all chromosomal levels. Our findings indicate that the dynamic DNA methylome during human germline development has a broader mutational impact than is commonly assumed.
27
Simon H, Huttley G. Quantifying Influences on Intragenomic Mutation Rate. G3 (BETHESDA, MD.) 2020; 10:2641-2652. [PMID: 32527747 PMCID: PMC7407452 DOI: 10.1534/g3.120.401335] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/28/2020] [Accepted: 05/28/2020] [Indexed: 12/14/2022]
Abstract
We report work to quantify the impact on the probability of human genome polymorphism both of recombination and of sequence context at different scales. We use population-based analyses of data on human genetic variants obtained from the public Ensembl database. For recombination, we calculate the variance due to recombination and the probability that a recombination event causes a mutation. We employ novel statistical procedures to take account of the spatial auto-correlation of recombination and mutation rates along the genome. Our results support the view that genomic diversity in recombination hotspots arises largely from a direct effect of recombination on mutation rather than predominantly from the effect of selective sweeps. We also use the statistic of variance due to context to compare the effect on the probability of polymorphism of contexts of various sizes. We find that when the 12 point mutations are considered separately, variance due to context increases significantly as we move from 3-mer to 5-mer and from 5-mer to 7-mer contexts. However, when all mutations are considered in aggregate, these differences are outweighed by the effect of interaction between the central base and its immediate neighbors. This interaction is itself dominated by the transition mutations, including, but not limited to, the CpG effect. We also demonstrate strand-asymmetry of contextual influence in intronic regions, which is hypothesized to be a result of transcription coupled DNA repair. We consider the extent to which the measures we have used can be used to meaningfully compare the relative magnitudes of the impact of recombination and context on mutation.
Affiliation(s)
- Helmut Simon
- Research School of Biology, the Australian National University
- Gavin Huttley
- Research School of Biology, the Australian National University
28
Molecular Clocks without Rocks: New Solutions for Old Problems. Trends Genet 2020; 36:845-856. [PMID: 32709458 DOI: 10.1016/j.tig.2020.06.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/14/2020] [Revised: 06/02/2020] [Accepted: 06/11/2020] [Indexed: 02/07/2023]
Abstract
Molecular data have been used to date species divergences ever since they were described as documents of evolutionary history in the 1960s. Yet, an inadequate fossil record and discordance between gene trees and species trees are persistently problematic. We examine how, by accommodating gene tree discordance and by scaling branch lengths to absolute time using mutation rate and generation time, multispecies coalescent (MSC) methods can potentially overcome these challenges. We find that time estimates can differ - in some cases, substantially - depending on whether MSC methods or traditional phylogenetic methods that apply concatenation are used, and whether the tree is calibrated with pedigree-based mutation rates or with fossils. We discuss the advantages and shortcomings of both approaches and provide practical guidance for data analysis when using these methods.
29
Germline de novo mutation rates on exons versus introns in humans. Nat Commun 2020; 11:3304. [PMID: 32620809 PMCID: PMC7334200 DOI: 10.1038/s41467-020-17162-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/07/2020] [Accepted: 06/02/2020] [Indexed: 02/06/2023] Open
Abstract
A main assumption of molecular population genetics is that genomic mutation rate does not depend on sequence function. Challenging this assumption, a recent study has found a reduction in the mutation rate in exons compared to introns in somatic cells, ascribed to an enhanced exonic mismatch repair system activity. If this reduction happens also in the germline, it can compromise studies of population genomics, including the detection of selection when using introns as proxies for neutrality. Here we compile and analyze published germline de novo mutation data to test if the exonic mutation rate is also reduced in germ cells. After controlling for sampling bias in datasets with diseased probands and extended nucleotide context dependency, we find no reduction in the mutation rate in exons compared to introns in the germline. Therefore, there is no evidence that enhanced exonic mismatch repair activity determines the mutation rate in germline cells. Evidence that somatic mutation rates in introns exceed those in exons challenges the molecular evolution tenet that mutation rate and sequence function are independent. Here, authors analyze germline de novo mutations and reveal no evidence for mutation rate differences between exons and introns.
30
Extreme differences between human germline and tumor mutation densities are driven by ancestral human-specific deviations. Nat Commun 2020; 11:2512. [PMID: 32427823 PMCID: PMC7237693 DOI: 10.1038/s41467-020-16296-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/09/2019] [Accepted: 04/22/2020] [Indexed: 12/29/2022] Open
Abstract
Mutations do not accumulate uniformly across the genome. Human germline and tumor mutation density correlate poorly, and each is associated with different genomic features. Here, we use non-human great ape (NHGA) germlines to determine human germline- and tumor-specific deviations from an ancestral-like great ape genome-wide mutational landscape. Strikingly, we find that the distribution of mutation densities in tumors presents a stronger correlation with NHGA than with human germlines. This effect is driven by human-specific differences in the distribution of mutations at non-CpG sites. We propose that ancestral human demographic events, together with the human-specific mutation slowdown, disrupted the human genome-wide distribution of mutation densities. Tumors partially recover this distribution by accumulating preneoplastic-like somatic mutations. Our results highlight the potential utility of using NHGA population data, rather than human controls, to establish the expected mutational background of healthy somatic cells.
31
Berrio A, Haygood R, Wray GA. Identifying branch-specific positive selection throughout the regulatory genome using an appropriate proxy neutral. BMC Genomics 2020; 21:359. [PMID: 32404186 PMCID: PMC7222330 DOI: 10.1186/s12864-020-6752-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/22/2019] [Accepted: 04/21/2020] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Adaptive changes in cis-regulatory elements are an essential component of evolution by natural selection. Identifying adaptive and functional noncoding DNA elements throughout the genome is therefore crucial for understanding the relationship between phenotype and genotype. RESULTS We used ENCODE annotations to identify appropriate proxy neutral sequences and demonstrate that the conservativeness of the test can be modulated during the filtration of reference alignments. We applied the method to noncoding Human Accelerated Elements as well as open chromatin elements previously identified in 125 human tissues and cell lines to demonstrate its utility. Then, we evaluated the impact of query region length, proxy neutral sequence length, and branch count on test sensitivity and specificity. We found that the length of the query alignment can vary between 150 bp and 1 kb without affecting the estimation of selection, while for the reference alignment, we found that a length of 3 kb is adequate for proper testing. We also simulated sequence alignments under different classes of evolution and validated our ability to distinguish positive selection from relaxation of constraint and neutral evolution. Finally, we re-confirmed that a quarter of all non-coding Human Accelerated Elements are evolving by positive selection. CONCLUSION Here, we introduce a method we called adaptiPhy, which adds significant improvements to our earlier method that tests for branch-specific directional selection in noncoding sequences. The motivation for these improvements is to provide a more sensitive and better targeted characterization of directional selection and neutral evolution across the genome.
Affiliation(s)
- Alejandro Berrio
- Department of Biology, Duke University, Biological Sciences Building, 124 Science Drive, Durham, NC, 27708, USA
- Ralph Haygood
- Ronin Institute for Independent Scholarship, 127 Haddon Pl., Montclair, NJ, 07043, USA
- Gregory A Wray
- Department of Biology, Duke University, Biological Sciences Building, 124 Science Drive, Durham, NC, 27708, USA
32
Sekiya M, Matsuda T, Yamamoto Y, Furuta Y, Ohyama M, Murayama Y, Sugano Y, Ohsaki Y, Iwasaki H, Yahagi N, Yatoh S, Suzuki H, Shimano H. Deciphering genetic signatures by whole exome sequencing in a case of co-prevalence of severe renal hypouricemia and diabetes with impaired insulin secretion. BMC MEDICAL GENETICS 2020; 21:91. [PMID: 32375679 PMCID: PMC7201978 DOI: 10.1186/s12881-020-01031-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/29/2019] [Accepted: 04/22/2020] [Indexed: 11/21/2022]
Abstract
Background Renal hypouricemia (RHUC) is a hereditary disorder in which mutations in the SLC22A12 and SLC2A9 genes cause RHUC type 1 (RHUC1) and RHUC type 2 (RHUC2), respectively. These genes regulate renal tubular reabsorption of urates, while other genes, including ABCG2 and SLC17A1, counterbalance the net excretion of urates. Urate metabolism is tightly interconnected with glucose metabolism, and the SLC2A9 gene may be involved in insulin secretion from pancreatic β-cells. On the other hand, a myriad of genes are responsible for impaired insulin secretion independently of urate metabolism. Case presentation We describe a 67-year-old Japanese man who manifested severe hypouricemia (0.7 mg/dl (3.8–7.0 mg/dl), 41.6 μmol/l (226–416 μmol/l)) and diabetes with impaired insulin secretion. His high urinary fractional excretion of urate (65.5%) and low urinary C-peptide excretion (25.7 μg/day) were compatible with the diagnosis of RHUC and impaired insulin secretion, respectively. Considering that the metabolic pathways regulating urates and glucose are closely interconnected, we attempted to delineate the genetic basis of the hypouricemia and the insulin secretion defect observed in this patient using whole exome sequencing. Intriguingly, we found homozygous Trp258* mutations in the SLC22A12 gene causing RHUC1, while concurrent mutations reported to be associated with hyperuricemia were also discovered, including ABCG2 (Gln141Lys) and SLC17A1 (Thr269Ile). SLC2A9, which also facilitates glucose transport, has been implicated in enhancing insulin secretion; however, the non-synonymous mutations found in the SLC2A9 gene of this patient were not dysfunctional variants. Therefore, we embarked on a search for causal mutations for his impaired insulin secretion, resulting in the identification of multiple mutations in the HNF1A gene (MODY3) as well as other genes that play roles in pancreatic β-cells. Among them, the Leu80fs in the homeobox gene NKX6.1 was an unreported mutation.
Conclusion We found a case of RHUC1 carrying mutations in the SLC22A12 gene accompanied by compensatory mutations associated with hyperuricemia, representing the first report showing coexistence of mutations with opposing potential to regulate urate concentrations. On the other hand, independent gene mutations may be responsible for his impaired insulin secretion, including novel mutations in key genes for pancreatic β-cell function that deserve further scrutiny.
Affiliation(s)
- Motohiro Sekiya
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Takaaki Matsuda
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Yuki Yamamoto
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Yasuhisa Furuta
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Mariko Ohyama
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Yuki Murayama
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Yoko Sugano
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Yoshinori Ohsaki
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Hitoshi Iwasaki
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Naoya Yahagi
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Shigeru Yatoh
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Hiroaki Suzuki
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
- Hitoshi Shimano
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8575, Japan
33
Li C, Luscombe NM. Nucleosome positioning stability is a modulator of germline mutation rate variation across the human genome. Nat Commun 2020; 11:1363. [PMID: 32170069 PMCID: PMC7070026 DOI: 10.1038/s41467-020-15185-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/18/2019] [Accepted: 02/23/2020] [Indexed: 02/08/2023] Open
Abstract
Nucleosome organization has been suggested to affect local mutation rates in the genome. However, the lack of de novo mutation and high-resolution nucleosome data has limited the investigation of this hypothesis. Additionally, analyses using indirect mutation rate measurements have yielded contradictory and potentially confounding results. Here, we combine data on >300,000 human de novo mutations with high-resolution nucleosome maps and find substantially elevated mutation rates around translationally stable (‘strong’) nucleosomes. We show that the mutational mechanisms affected by strong nucleosomes are low-fidelity replication, insufficient mismatch repair and increased double-strand breaks. Strong nucleosomes preferentially locate within young SINE/LINE transposons, suggesting that when subject to increased mutation rates, transposons are then more rapidly inactivated. Depletion of strong nucleosomes in older transposons suggests frequent positioning changes during evolution. The findings have important implications for human genetics and genome evolution. Nucleosome organization has been suggested to affect local mutation rates in the genome. Here, the authors analyse data on >300,000 human de novo mutations and high-resolution nucleosome maps and provide evidence that nucleosome positioning stability modulates germline mutation rate variation across the human genome.
Affiliation(s)
- Cai Li
- The Francis Crick Institute, London, NW1 1AT, UK; School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Nicholas M Luscombe
- The Francis Crick Institute, London, NW1 1AT, UK; Okinawa Institute of Science & Technology Graduate University, Okinawa, 904-0495, Japan; UCL Genetics Institute, University College London, London, WC1E 6BT, UK
34
Gonzalez-Perez A, Sabarinathan R, Lopez-Bigas N. Local Determinants of the Mutational Landscape of the Human Genome. Cell 2019; 177:101-114. [PMID: 30901533 DOI: 10.1016/j.cell.2019.02.051] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 12/07/2018] [Revised: 02/13/2019] [Accepted: 02/26/2019] [Indexed: 12/19/2022]
Abstract
Large-scale chromatin features, such as replication time and accessibility, influence the rate of somatic and germline mutations at the megabase scale. This article reviews how local chromatin structures (e.g., DNA wrapped around nucleosomes, transcription factors bound to DNA) affect the mutation rate at a local scale. It dissects how the interaction of some mutagenic agents and/or DNA repair systems with these local structures influences the generation of mutations. We discuss how this local mutation rate variability affects our understanding of the evolution of the genomic sequence, and the study of the evolution of organisms and tumors.
Affiliation(s)
- Abel Gonzalez-Perez
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.
- Radhakrishnan Sabarinathan
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India.
- Nuria Lopez-Bigas
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
35
Castellano D, Eyre-Walker A, Munch K. Impact of Mutation Rate and Selection at Linked Sites on DNA Variation across the Genomes of Humans and Other Homininae. Genome Biol Evol 2020; 12:3550-3561. [PMID: 31596481 PMCID: PMC6944223 DOI: 10.1093/gbe/evz215] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 10/03/2019] [Indexed: 12/23/2022] Open
Abstract
DNA diversity varies across the genome of many species. Variation in diversity across a genome might arise from regional variation in the mutation rate, variation in the intensity and mode of natural selection, and regional variation in the recombination rate. We show that both noncoding and nonsynonymous diversity are positively correlated to a measure of the mutation rate and the recombination rate and negatively correlated to the density of conserved sequences in 50 kb windows across the genomes of humans and nonhuman homininae. Interestingly, we find that although noncoding diversity is equally affected by these three genomic variables, nonsynonymous diversity is mostly dominated by the density of conserved sequences. The positive correlation between diversity and our measure of the mutation rate seems to be largely a direct consequence of regions with higher mutation rates having more diversity. However, the positive correlation with recombination rate and the negative correlation with the density of conserved sequences suggest that selection at linked sites also affects levels of diversity. This is supported by the observation that the ratio of the number of nonsynonymous to noncoding polymorphisms is negatively correlated to a measure of the effective population size across the genome. We show these patterns persist even when we restrict our analysis to GC-conservative mutations, demonstrating that the patterns are not driven by GC-biased gene conversion. In conclusion, our comparative analyses describe how recombination rate, gene density, and mutation rate interact to produce the patterns of DNA diversity that we observe along the hominine genomes.
Affiliation(s)
- David Castellano
- Bioinformatics Research Centre, Aarhus University, Denmark
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, Barcelona, Spain
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Kasper Munch
- Bioinformatics Research Centre, Aarhus University, Denmark
36
Lee J, Hong SE. Functional annotation of de novo variants from healthy individuals. Genomics Inform 2019; 17:e46. [PMID: 31896246 PMCID: PMC6944041 DOI: 10.5808/gi.2019.17.4.e46] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2019] [Accepted: 12/05/2019] [Indexed: 11/27/2022] Open
Abstract
The implications of germline de novo variants (DNVs) in diseases are well documented. Despite extensive research, inconsistencies between studies remain a challenge, and the distribution and genetic characteristics of DNVs need to be precisely evaluated. To address this issue at the whole-genome scale, a large number of DNVs identified from the whole-genome sequencing of 1,902 healthy trios (i.e., parents and progeny) from the Simons Foundation for Autism Research Initiative study and 20 healthy Korean trios were analyzed. These apparently nonpathogenic DNVs were enriched in functional elements of the genome but relatively depleted in regions of common copy number variants, implying their potential function as triggers of evolution even in healthy groups. No strong mutational hotspots were identified. The pathogenicity of the DNVs was not strongly elevated, reflecting the health status of the cohort. The mutational signatures were consistent with previous studies. This study will serve as a reference for future DNV studies.
Affiliation(s)
- Jean Lee
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080, Korea
- Sung Eun Hong
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080, Korea
37
Castellano D, Macià MC, Tataru P, Bataillon T, Munch K. Comparison of the Full Distribution of Fitness Effects of New Amino Acid Mutations Across Great Apes. Genetics 2019; 213:953-966. [PMID: 31488516 PMCID: PMC6827385 DOI: 10.1534/genetics.119.302494] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/08/2019] [Accepted: 08/29/2019] [Indexed: 12/31/2022] Open
Abstract
The distribution of fitness effects (DFE) is central to many questions in evolutionary biology. However, little is known about the differences in DFE between closely related species. We use >9000 coding genes orthologous one-to-one across great apes, gibbons, and macaques to assess the stability of the DFE across great apes. We use the unfolded site frequency spectrum of polymorphic mutations (n = 8 haploid chromosomes per population) to estimate the DFE. We find that the shape of the deleterious DFE is strikingly similar across great apes. We confirm that effective population size (Ne) is a strong predictor of the strength of negative selection, consistent with the nearly neutral theory. However, we also find that the strength of negative selection varies more than expected given the differences in Ne between species. Across species, mean fitness effects of new deleterious mutations covary with Ne, consistent with positive epistasis among deleterious mutations. We find that the strength of negative selection for the smallest populations, bonobos and western chimpanzees, is higher than expected given their Ne. This may result from a more efficient purging of strongly deleterious recessive variants in these populations. Forward simulations confirm that these findings are not artifacts of the way we are inferring Ne and DFE parameters. All findings are replicated using only GC-conservative mutations, thereby confirming that GC-biased gene conversion is not affecting our conclusions.
Affiliation(s)
- David Castellano
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Moisès Coll Macià
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Paula Tataru
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Kasper Munch
- Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
38
Signatures of replication timing, recombination, and sex in the spectrum of rare variants on the human X chromosome and autosomes. Proc Natl Acad Sci U S A 2019; 116:17916-17924. [PMID: 31427530 PMCID: PMC6731651 DOI: 10.1073/pnas.1900714116] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 01/03/2023] Open
Abstract
The sources of human germline mutations are poorly understood. Part of the difficulty is that mutations occur very rarely, and so direct pedigree-based approaches remain limited in the numbers that they can examine. To address this problem, we consider the spectrum of low-frequency variants in a dataset (Genome Aggregation Database, gnomAD) of 13,860 human X chromosomes and autosomes. X-autosome differences are reflective of germline sex differences and have been used extensively to learn about male versus female mutational processes; what is less appreciated is that they also reflect chromosome-level biochemical features that differ between the X and autosomes. We tease these components apart by comparing the mutation spectrum in multiple genomic compartments on the autosomes and between the X and autosomes. In so doing, we are able to ascribe specific mutation patterns to replication timing and recombination and to identify differences in the types of mutations that accrue in males and females. In particular, we identify C > G as a mutagenic signature of male meiotic double-strand breaks on the X, which may result from late repair. Our results show how biochemical processes of damage and repair in the germline interact with sex-specific life history traits to shape mutation patterns on both the X chromosome and autosomes.
|
39
|
Konrad A, Brady MJ, Bergthorsson U, Katju V. Mutational Landscape of Spontaneous Base Substitutions and Small Indels in Experimental Caenorhabditis elegans Populations of Differing Size. Genetics 2019; 212:837-854. [PMID: 31110155 PMCID: PMC6614903 DOI: 10.1534/genetics.119.302054] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/25/2019] [Accepted: 05/16/2019] [Indexed: 02/08/2023] Open
Abstract
Experimental investigations into the rates and fitness effects of spontaneous mutations are fundamental to our understanding of the evolutionary process. To gain insights into the molecular and fitness consequences of spontaneous mutations, we conducted a mutation accumulation (MA) experiment at varying population sizes in the nematode Caenorhabditis elegans, evolving 35 lines in parallel for 409 generations at three population sizes (N = 1, 10, and 100 individuals). Here, we focus on nuclear SNPs and small insertion/deletions (indels) under minimal influence of selection, as well as their accrual rates in larger populations under greater selection efficacy. The spontaneous rates of base substitutions and small indels are 1.84 (95% C.I. ± 0.14) × 10⁻⁹ substitutions and 6.84 (95% C.I. ± 0.97) × 10⁻¹⁰ changes/site/generation, respectively. Small indels exhibit a deletion bias with deletions exceeding insertions by threefold. Notably, there was no correlation between the frequency of base substitutions, nonsynonymous substitutions, or small indels with population size. These results contrast with our previous analysis of mitochondrial DNA mutations and nuclear copy-number changes in these MA lines, and suggest that nuclear base substitutions and small indels are under less stringent purifying selection compared to the former mutational classes. A transition bias was observed in exons as was a near universal base substitution bias toward A/T. Strongly context-dependent base substitutions, where 5'-Ts and 3'-As increase the frequency of A/T → T/A transversions, especially at the boundaries of A or T homopolymeric runs, manifest as higher mutation rates in (i) introns and intergenic regions relative to exons, (ii) chromosomal cores vs. arms and tips, and (iii) germline-expressed genes.
Affiliation(s)
- Anke Konrad
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77845
- Meghan J Brady
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77845
- Ulfar Bergthorsson
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77845
- Vaishali Katju
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77845
|
40
|
Zeng K, Jackson BC, Barton HJ. Methods for Estimating Demography and Detecting Between-Locus Differences in the Effective Population Size and Mutation Rate. Mol Biol Evol 2019; 36:423-433. [PMID: 30428070 PMCID: PMC6409433 DOI: 10.1093/molbev/msy212] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022] Open
Abstract
It is known that the effective population size (Ne) and the mutation rate (u) vary across the genome. Here, we show that ignoring this heterogeneity may lead to biased estimates of past demography. To solve the problem, we develop new methods for jointly inferring past changes in population size and detecting variation in Ne and u between loci. These methods rely on either polymorphism data alone or both polymorphism and divergence data. In addition to inferring demography, we can use the methods to study a variety of questions: 1) comparing sex chromosomes with autosomes (for finding evidence for male-driven evolution, an unequal sex ratio, or sex-biased demographic changes) and 2) analyzing multilocus data from within autosomes or sex chromosomes (for studying determinants of variability in Ne and u). Simulations suggest that the methods can provide accurate parameter estimates and have substantial statistical power for detecting difference in Ne and u. As an example, we use the methods to analyze a polymorphism data set from Drosophila simulans. We find clear evidence for rapid population expansion. The results also indicate that the autosomes have a higher mutation rate than the X chromosome and that the sex ratio is probably female-biased. The new methods have been implemented in a user-friendly package.
Affiliation(s)
- Kai Zeng
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
- Benjamin C Jackson
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Henry J Barton
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
|
41
|
Rousselle M, Laverré A, Figuet E, Nabholz B, Galtier N. Influence of Recombination and GC-biased Gene Conversion on the Adaptive and Nonadaptive Substitution Rate in Mammals versus Birds. Mol Biol Evol 2019; 36:458-471. [PMID: 30590692 PMCID: PMC6389324 DOI: 10.1093/molbev/msy243] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/22/2022] Open
Abstract
Recombination is expected to affect functional sequence evolution in several ways. On the one hand, recombination is thought to improve the efficiency of multilocus selection by dissipating linkage disequilibrium. On the other hand, natural selection can be counteracted by recombination-associated transmission distorters such as GC-biased gene conversion (gBGC), which tends to promote G and C alleles irrespective of their fitness effect in high-recombining regions. It has been suggested that gBGC might impact coding sequence evolution in vertebrates, and particularly the ratio of nonsynonymous to synonymous substitution rates (dN/dS). However, distinctive gBGC patterns have been reported in mammals and birds, maybe reflecting the documented contrasts in evolutionary dynamics of recombination rate between these two taxa. Here, we explore how recombination and gBGC affect coding sequence evolution in mammals and birds by analyzing proteome-wide data in six species of Galloanserae (fowls) and six species of catarrhine primates. We estimated the dN/dS ratio and rates of adaptive and nonadaptive evolution in bins of genes of increasing recombination rate, separately analyzing AT → GC, GC → AT, and G ↔ C/A ↔ T mutations. We show that in both taxa, recombination and gBGC entail a decrease in dN/dS. Our analysis indicates that recombination enhances the efficiency of purifying selection by lowering Hill-Robertson effects, whereas gBGC leads to an overestimation of the adaptive rate of AT → GC mutations. Finally, we report a mutagenic effect of recombination, which is independent of gBGC.
Affiliation(s)
- Alexandre Laverré
- ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Emeric Figuet
- ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Benoit Nabholz
- ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Nicolas Galtier
- ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
|
42
|
Katju V, Bergthorsson U. Old Trade, New Tricks: Insights into the Spontaneous Mutation Process from the Partnering of Classical Mutation Accumulation Experiments with High-Throughput Genomic Approaches. Genome Biol Evol 2019; 11:136-165. [PMID: 30476040 PMCID: PMC6330053 DOI: 10.1093/gbe/evy252] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Accepted: 11/22/2018] [Indexed: 12/17/2022] Open
Abstract
Mutations spawn genetic variation which, in turn, fuels evolution. Hence, experimental investigations into the rate and fitness effects of spontaneous mutations are central to the study of evolution. Mutation accumulation (MA) experiments have served as a cornerstone for furthering our understanding of spontaneous mutations for four decades. In the pregenomic era, phenotypic measurements of fitness-related traits in MA lines were used to indirectly estimate key mutational parameters, such as the genomic mutation rate, new mutational variance per generation, and the average fitness effect of mutations. Rapidly emerging next-generation sequencing technology has supplanted this phenotype-dependent approach, enabling direct empirical estimates of the mutation rate and a more nuanced understanding of the relative contributions of different classes of mutations to the standing genetic variation. Whole-genome sequencing of MA lines bears immense potential to provide a unified account of the evolutionary process at multiple levels: the genetic basis of variation, and the evolutionary dynamics of mutations under the forces of selection and drift. In this review, we have attempted to synthesize key insights into the spontaneous mutation process that are rapidly emerging from the partnering of classical MA experiments with high-throughput sequencing, with particular emphasis on the spontaneous rates and molecular properties of different mutational classes in nuclear and mitochondrial genomes of diverse taxa, the contribution of mutations to the evolution of gene expression, and the rate and stability of transgenerational epigenetic modifications. Future advances in sequencing technologies will enable greater species representation to further refine our understanding of mutational parameters and their functional consequences.
Affiliation(s)
- Vaishali Katju
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX 77843-4458
- Ulfar Bergthorsson
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX 77843-4458
|
43
|
Spence JP, Steinrücken M, Terhorst J, Song YS. Inference of population history using coalescent HMMs: review and outlook. Curr Opin Genet Dev 2018; 53:70-76. [PMID: 30056275 PMCID: PMC6296859 DOI: 10.1016/j.gde.2018.07.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 05/16/2018] [Revised: 07/08/2018] [Accepted: 07/09/2018] [Indexed: 01/02/2023]
Abstract
Studying how diverse human populations are related is of historical and anthropological interest, in addition to providing a realistic null model for testing for signatures of natural selection or disease associations. Furthermore, understanding the demographic histories of other species is playing an increasingly important role in conservation genetics. A number of statistical methods have been developed to infer population demographic histories using whole-genome sequence data, with recent advances focusing on allowing for more flexible modeling choices, scaling to larger data sets, and increasing statistical power. Here we review coalescent hidden Markov models, a powerful class of population genetic inference methods that can utilize linkage disequilibrium information effectively. We highlight recent advances, give advice for practitioners, point out potential pitfalls, and present possible future research directions.
Affiliation(s)
- Jeffrey P Spence
- Computational Biology Graduate Group, University of California, Berkeley, United States
- Yun S Song
- Computer Science Division and Department of Statistics, University of California, Berkeley, United States; Chan Zuckerberg Biohub, San Francisco, United States.
|
44
|
Gossmann TI, Bockwoldt M, Diringer L, Schwarz F, Schumann VF. Evidence for Strong Fixation Bias at 4-fold Degenerate Sites Across Genes in the Great Tit Genome. Front Ecol Evol 2018. [DOI: 10.3389/fevo.2018.00203] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/18/2022] Open
|
45
|
Skov L, Hui R, Shchur V, Hobolth A, Scally A, Schierup MH, Durbin R. Detecting archaic introgression using an unadmixed outgroup. PLoS Genet 2018; 14:e1007641. [PMID: 30226838 PMCID: PMC6161914 DOI: 10.1371/journal.pgen.1007641] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 03/27/2018] [Revised: 09/28/2018] [Accepted: 08/17/2018] [Indexed: 12/24/2022] Open
Abstract
Human populations outside of Africa have experienced at least two bouts of introgression from archaic humans, from Neanderthals and Denisovans. In Papuans there is prior evidence of both these introgressions. Here we present a new approach to detect segments of individual genomes of archaic origin without using an archaic reference genome. The approach is based on a hidden Markov model that identifies genomic regions with a high density of single nucleotide variants (SNVs) not seen in unadmixed populations. We show using simulations that this provides a powerful approach to identifying segments of archaic introgression with a low rate of false detection, given data from a suitable outgroup population is available, without the archaic introgression but containing a majority of the variation that arose since initial separation from the archaic lineage. Furthermore our approach is able to infer admixture proportions and the times both of admixture and of initial divergence between the human and archaic populations. We apply the model to detect archaic introgression in 89 Papuans and show how the identified segments can be assigned to likely Neanderthal or Denisovan origin. We report more Denisovan admixture than previous studies and find a shift in size distribution of fragments of Neanderthal and Denisovan origin that is compatible with a difference in admixture time. Furthermore, we identify small amounts of Denisova ancestry in South East Asians and South Asians.
Author summary
The genetic history of present-day individuals includes episodes of mating between divergent groups, which have led to 'introgressed' genetic material persisting in modern genome sequences. Perhaps the most notable examples of such events in humans are the introgressions from Neanderthals into non-Africans 50,000 or so years ago, and from a related archaic group known as Denisovans into the ancestors of indigenous people from Papua-New Guinea and Australia. Methods to identify introgressions and the genomic regions that derive from them generally involve the use of reference genome sequences for the source populations. However, there are advantages in having methods independent of reference sequences, both to reduce bias and to detect possible introgression from groups for which we currently lack a reference genome. In this paper we describe such an approach, in a statistical framework which exploits the fact that introgressed regions will contain a high density of genetic variants that are private to the group receiving the divergent material. We apply this method to 89 Papuan genome sequences, estimating times of introgression and initial divergence between archaic and modern humans, and compare it to other related methods.
Affiliation(s)
- Laurits Skov
- Bioinformatics Research Centre, Aarhus University, Aarhus C., Denmark
- Ruoyun Hui
- Department of Genetics, University of Cambridge, Cambridge United Kingdom
- Vladimir Shchur
- Wellcome Sanger Institute, Hinxton, Cambridge, United Kingdom
- Asger Hobolth
- Bioinformatics Research Centre, Aarhus University, Aarhus C., Denmark
- Aylwyn Scally
- Department of Genetics, University of Cambridge, Cambridge United Kingdom
- Richard Durbin
- Department of Genetics, University of Cambridge, Cambridge United Kingdom
- Wellcome Sanger Institute, Hinxton, Cambridge, United Kingdom
|
46
|
Afanasyeva A, Bockwoldt M, Cooney CR, Heiland I, Gossmann TI. Human long intrinsically disordered protein regions are frequent targets of positive selection. Genome Res 2018; 28:975-982. [PMID: 29858274 PMCID: PMC6028134 DOI: 10.1101/gr.232645.117] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 11/21/2017] [Accepted: 06/01/2018] [Indexed: 12/20/2022]
Abstract
Intrinsically disordered regions occur frequently in proteins and are characterized by a lack of a well-defined three-dimensional structure. Although these regions do not show a higher order of structural organization, they are known to be functionally important. Disordered regions are rapidly evolving, largely attributed to relaxed purifying selection and an increased role of genetic drift. It has also been suggested that positive selection might contribute to their rapid diversification. However, for our own species, it is currently unknown whether positive selection has played a role during the evolution of these protein regions. Here, we address this question by investigating the evolutionary pattern of more than 6600 human proteins with intrinsically disordered regions and their ordered counterparts. Our comparative approach with data from more than 90 mammalian genomes uses a priori knowledge of disordered protein regions, and we show that this increases the power to detect positive selection by an order of magnitude. We can confirm that human intrinsically disordered regions evolve more rapidly, not only within humans but also across the entire mammalian phylogeny. They have, however, experienced substantial evolutionary constraint, hinting at their fundamental functional importance. We find compelling evidence that disordered protein regions are frequent targets of positive selection and estimate that the relative rate of adaptive substitutions differs fourfold between disordered and ordered protein regions in humans. Our results suggest that disordered protein regions are important targets of genetic innovation and that the contribution of positive selection in these regions is more pronounced than in other protein parts.
Affiliation(s)
- Arina Afanasyeva
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield S102TN, United Kingdom; Institute of Nanobiotechnologies, Peter the Great St. Petersburg Polytechnic University, Saint-Petersburg 195251, Russia; Petersburg Nuclear Physics Institute, B.P. Konstantinov NRC Kurchatov Institute, Gatchina, Leningrad District 188300, Russia; National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki City, Osaka 567-0085, Japan
- Mathias Bockwoldt
- Department of Arctic and Marine Biology, UiT The Arctic University of Norway, 9037 Tromsø, Norway
- Christopher R Cooney
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield S102TN, United Kingdom
- Ines Heiland
- Department of Arctic and Marine Biology, UiT The Arctic University of Norway, 9037 Tromsø, Norway
- Toni I Gossmann
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield S102TN, United Kingdom
|