1
Kozłowska-Masłoń J, Ciomborowska-Basheer J, Kubiak MR, Makałowska I. Evolution of retrocopies in the context of HUSH silencing. Biol Direct 2024; 19:60. [PMID: 39095906 PMCID: PMC11295320 DOI: 10.1186/s13062-024-00507-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Accepted: 07/29/2024] [Indexed: 08/04/2024] Open
Abstract
Retrotransposition is one of the main factors responsible for gene duplication and thus genome evolution. However, the sequences that undergo this process are not only an excellent source of biological diversity, but in certain cases also pose a threat to the integrity of the DNA. One of the mechanisms that protects against the incorporation of mobile elements is the HUSH complex, which is responsible for silencing long, intronless, transcriptionally active transposed sequences that are rich in adenine on the sense strand. In this study, broad sets of human and porcine retrocopies were analysed with respect to the above factors, taking into account evolution of these molecules. Analysis of expression pattern, genomic structure, transcript length, and nucleotide substitution frequency showed the strong relationship between the expression level and exon length as well as the protective nature of introns. The results of the studies also showed that there is no direct correlation between the expression level and adenine content. However, protein-coding retrocopies, which have a lower adenine content, have a significantly higher expression level than the adenine-rich non-coding but expressed retrocopies. Therefore, although the mechanism of HUSH silencing may be an important part of the regulation of retrocopy expression, it is one component of a more complex molecular network that remains to be elucidated.
Affiliation(s)
- Joanna Kozłowska-Masłoń
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Laboratory of Cancer Genetics, Greater Poland Cancer Centre, Garbary 15, Poznań, Poland
- Joanna Ciomborowska-Basheer
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Laboratory of Nature Education and Conservation, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Magdalena Regina Kubiak
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Izabela Makałowska
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland.
2
Borovská I, Vořechovský I, Královičová J. Alu RNA fold links splicing with signal recognition particle proteins. Nucleic Acids Res 2023; 51:8199-8216. [PMID: 37309897 PMCID: PMC10450188 DOI: 10.1093/nar/gkad500] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/16/2023] [Revised: 05/23/2023] [Accepted: 05/31/2023] [Indexed: 06/14/2023] Open
Abstract
Transcriptomic diversity in primates was considerably expanded by exonizations of intronic Alu elements. To better understand their cellular mechanisms we have used structure-based mutagenesis coupled with functional and proteomic assays to study the impact of successive primate mutations and their combinations on inclusion of a sense-oriented AluJ exon in the human F8 gene. We show that the splicing outcome was better predicted by consecutive RNA conformation changes than by computationally derived splicing regulatory motifs. We also demonstrate an involvement of SRP9/14 (signal recognition particle) heterodimer in splicing regulation of Alu-derived exons. Nucleotide substitutions that accumulated during primate evolution relaxed the conserved left-arm AluJ structure including helix H1 and reduced the capacity of SRP9/14 to stabilize the closed Alu conformation. RNA secondary structure-constrained mutations that promoted open Y-shaped conformations of the Alu made the Alu exon inclusion reliant on DHX9. Finally, we identified additional SRP9/14 sensitive Alu exons and predicted their functional roles in the cell. Together, these results provide unique insights into architectural elements required for sense Alu exonization, identify conserved pre-mRNA structures involved in exon selection and point to a possible chaperone activity of SRP9/14 outside the mammalian signal recognition particle.
Affiliation(s)
- Ivana Borovská
- Institute of Molecular Physiology and Genetics, Centre of Biosciences, Slovak Academy of Sciences, Bratislava 840 05, Slovak Republic
- Igor Vořechovský
- Faculty of Medicine, University of Southampton, HDH, MP808, Southampton SO16 6YD, United Kingdom
- Jana Královičová
- Institute of Molecular Physiology and Genetics, Centre of Biosciences, Slovak Academy of Sciences, Bratislava 840 05, Slovak Republic
- Institute of Zoology, Slovak Academy of Sciences, Bratislava 845 06, Slovak Republic
3
Johri P, Eyre-Walker A, Gutenkunst RN, Lohmueller KE, Jensen JD. On the prospect of achieving accurate joint estimation of selection with population history. Genome Biol Evol 2022; 14:evac088. [PMID: 35675379 PMCID: PMC9254643 DOI: 10.1093/gbe/evac088] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Accepted: 06/02/2022] [Indexed: 11/15/2022] Open
Abstract
As both natural selection and population history can affect genome-wide patterns of variation, disentangling the contributions of each has remained a major challenge in population genetics. We here discuss historical and recent progress towards this goal, highlighting theoretical and computational challenges that remain to be addressed as well as inherent difficulties in dealing with model complexity and model violations, and offer thoughts on potentially fruitful next steps.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
4
Cabrera VM. Human molecular evolutionary rate, time dependency and transient polymorphism effects viewed through ancient and modern mitochondrial DNA genomes. Sci Rep 2021; 11:5036. [PMID: 33658608 PMCID: PMC7930196 DOI: 10.1038/s41598-021-84583-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/17/2020] [Accepted: 02/15/2021] [Indexed: 01/31/2023] Open
Abstract
Human evolutionary genetics gives a chronological framework to interpret human history. It is based on the molecular clock hypothesis, which supposes a straightforward relationship between the mutation rate and the substitution rate, independent of other factors such as demographic dynamics. Analyzing ancient and modern complete human mitochondrial genomes, we show here that, over time, the substitution rate can be significantly slower or faster than the average germline mutation rate, confirming a time-dependence effect mainly attributable to changes in the effective population size of human populations, with exponential growth in recent times. We also find that transient polymorphisms slow down the evolutionary rate deduced from intraspecific haplogroup trees. Finally, we propose the use of the most divergent lineages within haplogroups as a practical approach to correct these molecular clock mismatches.
Affiliation(s)
- Vicente M Cabrera
- Retired member of Departamento de Genética, Facultad de Biología, Universidad de La Laguna, Canary Islands, Spain.
5
Rong S, Buerer L, Rhine CL, Wang J, Cygan KJ, Fairbrother WG. Mutational bias and the protein code shape the evolution of splicing enhancers. Nat Commun 2020; 11:2845. [PMID: 32504065 PMCID: PMC7275064 DOI: 10.1038/s41467-020-16673-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/30/2019] [Accepted: 04/28/2020] [Indexed: 02/06/2023] Open
Abstract
Exonic splicing enhancers (ESEs) are enriched in exons relative to introns and bind splicing activators. This study considers a fundamental question of co-evolution: How did ESE motifs become enriched in exons prior to the evolution of ESE recognition? We hypothesize that the high exon to intron motif ratios necessary for ESE function were created by mutational bias coupled with purifying selection on the protein code. These two forces retain certain coding motifs in exons while passively depleting them from introns. Through the use of simulations, genomic analyses, and high throughput splicing assays, we confirm the key predictions of this hypothesis, including an overlap between protein and splicing information in ESEs. We discuss the implications of mutational bias as an evolutionary driver in other cis-regulatory systems. Splicing is regulated by cis-acting elements in pre-mRNAs such as exonic or intronic splicing enhancers and silencers. Here the authors show that exonic splicing enhancers are enriched in exons compared to introns due to mutational bias coupled with purifying selection on the protein code.
Affiliation(s)
- Stephen Rong
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
- Luke Buerer
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
- Christy L Rhine
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Jing Wang
- Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- Kamil J Cygan
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA
- William G Fairbrother
- Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA; Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, 02912, USA; Hassenfeld Child Health Innovation Institute of Brown University, Providence, RI, 02912, USA
6
Sun JH, Ai SM, Liu SQ. Methylation-driven model for analysis of dinucleotide evolution in genomes. Theor Biol Med Model 2020; 17:3. [PMID: 32264909 PMCID: PMC7140373 DOI: 10.1186/s12976-020-00122-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2019] [Accepted: 03/10/2020] [Indexed: 11/16/2022] Open
Abstract
Background: CpGs, the major methylation sites in vertebrate genomes, exhibit a high mutation rate from the methylated form of CpG to TpG/CpA and, therefore, influence the evolution of genome composition. However, the quantitative effects of CpG to TpG/CpA mutations on the evolution of genome composition in terms of the dinucleotide frequencies/proportions remain poorly understood. Results: Based on the neutral theory of molecular evolution, we propose a methylation-driven model (MDM) that allows predicting the changes in frequencies/proportions of the 16 dinucleotides and in the GC content of a genome given the known number of CpG to TpG/CpA mutations. The application of MDM to the 10 published vertebrate genomes shows that, for most of the 16 dinucleotides and the GC content, a good consistency is achieved between the predicted and observed trends of changes in the frequencies and content relative to the assumed initial values, and that the model performs better on the mammalian genomes than it does on the lower-vertebrate genomes. The model’s performance depends on the genome composition characteristics, the assumed initial state of the genome, and the estimated parameters, one or more of which are responsible for the different application effects on the mammalian and lower-vertebrate genomes and for the large deviations of the predicted frequencies of a few dinucleotides from their observed frequencies. Conclusions: Despite certain limitations of the current model, the successful application to the higher-vertebrate (mammalian) genomes witnesses its potential for facilitating studies aimed at understanding the role of methylation in driving the evolution of genome dinucleotide composition.
Affiliation(s)
- Jian-Hong Sun
- State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan & School of Life Sciences, Yunnan University, Kunming, 650091, China; College of Engineering, Honghe University, Mengzi, 661100, China
- Shi-Meng Ai
- Department of Applied Mathematics, Yunnan Agricultural University, Kunming, 650201, China
- Shu-Qun Liu
- State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan & School of Life Sciences, Yunnan University, Kunming, 650091, China
7
Corcoran P, Gossmann TI, Barton HJ, Slate J, Zeng K. Determinants of the Efficacy of Natural Selection on Coding and Noncoding Variability in Two Passerine Species. Genome Biol Evol 2018; 9:2987-3007. [PMID: 29045655 PMCID: PMC5714183 DOI: 10.1093/gbe/evx213] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Accepted: 10/16/2017] [Indexed: 02/06/2023] Open
Abstract
Population genetic theory predicts that selection should be more effective when the effective population size (Ne) is larger, and that the efficacy of selection should correlate positively with recombination rate. Here, we analyzed the genomes of ten great tits and ten zebra finches. Nucleotide diversity at 4-fold degenerate sites indicates that zebra finches have a 2.83-fold larger Ne. We obtained clear evidence that purifying selection is more effective in zebra finches. The proportion of substitutions at 0-fold degenerate sites fixed by positive selection (α) is high in both species (great tit 48%; zebra finch 64%) and is significantly higher in zebra finches. When α was estimated on GC-conservative changes (i.e., between A and T and between G and C), the estimates reduced in both species (great tit 22%; zebra finch 53%). A theoretical model presented herein suggests that failing to control for the effects of GC-biased gene conversion (gBGC) is potentially a contributor to the overestimation of α, and that this effect cannot be alleviated by first fitting a demographic model to neutral variants. We present the first estimates in birds for α in the untranslated regions, and found evidence for substantial adaptive changes. Finally, although purifying selection is stronger in high-recombination regions, we obtained mixed evidence for α increasing with recombination rate, especially after accounting for gBGC. These results highlight that it is important to consider the potential confounding effects of gBGC when quantifying selection and that our understanding of what determines the efficacy of selection is incomplete.
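The step from nucleotide diversity to a ratio of effective population sizes can be written out explicitly. The sketch below assumes the standard neutral expectation for diploids and, as an assumption not stated in the abstract, a similar per-site mutation rate $\mu$ in both species; $\pi_4$ denotes diversity at 4-fold degenerate sites.

```latex
\[
\pi_4 = 4 N_e \mu
\quad\Longrightarrow\quad
\frac{N_e^{\text{zebra finch}}}{N_e^{\text{great tit}}}
= \frac{\pi_4^{\text{zebra finch}}}{\pi_4^{\text{great tit}}}
\approx 2.83
\]
```

If the mutation rates differ between the two species, the diversity ratio reflects the ratio of $N_e \mu$ rather than of $N_e$ alone.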
Affiliation(s)
- Pádraic Corcoran
- Department of Animal and Plant Sciences, University of Sheffield, South Yorkshire, United Kingdom
- Toni I Gossmann
- Department of Animal and Plant Sciences, University of Sheffield, South Yorkshire, United Kingdom
- Henry J Barton
- Department of Animal and Plant Sciences, University of Sheffield, South Yorkshire, United Kingdom
- Jon Slate
- Department of Animal and Plant Sciences, University of Sheffield, South Yorkshire, United Kingdom
- Kai Zeng
- Department of Animal and Plant Sciences, University of Sheffield, South Yorkshire, United Kingdom
8
Thornlow BP, Hough J, Roger JM, Gong H, Lowe TM, Corbett-Detig RB. Transfer RNA genes experience exceptionally elevated mutation rates. Proc Natl Acad Sci U S A 2018; 115:8996-9001. [PMID: 30127029 PMCID: PMC6130373 DOI: 10.1073/pnas.1801240115] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 02/03/2023] Open
Abstract
Transfer RNAs (tRNAs) are a central component for the biological synthesis of proteins, and they are among the most highly conserved and frequently transcribed genes in all living things. Despite their clear significance for fundamental cellular processes, the forces governing tRNA evolution are poorly understood. We present evidence that transcription-associated mutagenesis and strong purifying selection are key determinants of patterns of sequence variation within and surrounding tRNA genes in humans and diverse model organisms. Remarkably, the mutation rate at broadly expressed cytosolic tRNA loci is likely between 7 and 10 times greater than the nuclear genome average. Furthermore, evolutionary analyses provide strong evidence that tRNA genes, but not their flanking sequences, experience strong purifying selection acting against this elevated mutation rate. We also find a strong correlation between tRNA expression levels and the mutation rates in their immediate flanking regions, suggesting a simple method for estimating individual tRNA gene activity. Collectively, this study illuminates the extreme competing forces in tRNA gene evolution and indicates that mutations at tRNA loci contribute disproportionately to mutational load and have unexplored fitness consequences in human populations.
Affiliation(s)
- Bryan P Thornlow
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Josh Hough
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Jacquelyn M Roger
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Henry Gong
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
- Todd M Lowe
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064;
- Genomics Institute, University of California, Santa Cruz, CA 95064
- Russell B Corbett-Detig
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064;
- Genomics Institute, University of California, Santa Cruz, CA 95064
9
Pranckėnienė L, Jakaitienė A, Ambrozaitytė L, Kavaliauskienė I, Kučinskas V. Insights Into de novo Mutation Variation in Lithuanian Exome. Front Genet 2018; 9:315. [PMID: 30154829 PMCID: PMC6102505 DOI: 10.3389/fgene.2018.00315] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/03/2018] [Accepted: 07/24/2018] [Indexed: 01/23/2023] Open
Abstract
In the last decade, one of the biggest challenges in genomics research has been to distinguish definitive pathogenic variants from all likely pathogenic variants identified by next-generation sequencing. This task is particularly complex because of our lack of knowledge regarding overall genome variation and pathogenicity of the variants. Therefore, obtaining sufficient information about genome variants in the general population is necessary as such data could be used for the interpretation of de novo mutations (DNMs) in the context of patient's phenotype in cases of sporadic genetic disease. In this study, data from whole-exome sequencing of the general population in Lithuania were directly examined. In total, 84 (VarScan) and 95 (VarSeq™) DNMs were identified and validated using different algorithms. Thirty-nine of these mutations were considered likely to be pathogenic based on gene function, evolutionary conservation, and mutation impact. The mutation rate estimated per position pair per generation was 2.74 × 10⁻⁸ [95% CI: 2.24 × 10⁻⁸–3.35 × 10⁻⁸] (VarScan) and 2.4 × 10⁻⁸ [95% CI: 1.96 × 10⁻⁸–2.99 × 10⁻⁸] (VarSeq™), with 1.77 × 10⁻⁸ [95% CI: 6.03 × 10⁻⁹–5.2 × 10⁻⁸] de novo indels per position per generation. The rate of germline DNMs in the Lithuanian population and the effects of the genomic and epigenetic context on DNM formation were calculated for the first time in this study, providing a basis for further analysis of DNMs in individuals with genetic diseases. Considering these findings, additional studies in patient groups with genetic diseases with unclear etiology may facilitate our ability to distinguish certain pathogenic or adaptive DNMs from tolerated background DNMs and to reliably identify disease-causing DNMs by their properties through direct observation.
Affiliation(s)
- Laura Pranckėnienė
- Department of Human and Medical Genetics, Institute of Biomedical Sciences, Faculty of Medicine, Vilnius University, Vilnius, Lithuania
10
Tatarinova TV, Chekalin E, Nikolsky Y, Bruskin S, Chebotarov D, McNally KL, Alexandrov N. Nucleotide diversity analysis highlights functionally important genomic regions. Sci Rep 2016; 6:35730. [PMID: 27774999 PMCID: PMC5075931 DOI: 10.1038/srep35730] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 05/12/2016] [Accepted: 09/30/2016] [Indexed: 12/15/2022] Open
Abstract
We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserved group of genes, whereas kinases and membrane-localized transporters are the most variable ones. TFs may be conserved because they belong to some of the most connected regulatory hubs that modulate transcription of vast downstream gene networks, whereas signaling kinases and transporters need to adapt rapidly to changing environmental conditions. In general, the observed profound patterns of nucleotide variability reveal functionally important genomic regions. As expected, nucleotide diversity is much higher in intergenic regions than within gene bodies (regions spanning gene models), and protein-coding sequences are more conserved than untranslated gene regions. We have observed a sharp decline in nucleotide diversity that begins at about 250 nucleotides upstream of the transcription start and reaches minimal diversity exactly at the transcription start. We found the transcription termination sites to have remarkably symmetrical patterns of SNP density, implying presence of functional sites near transcription termination. Also, nucleotide diversity was significantly lower near 3′ UTRs, the area rich with regulatory regions.
Affiliation(s)
- Tatiana V Tatarinova
- Center for Personalized Medicine and Spatial Sciences Institute, University of Southern California, Los Angeles, CA, USA; Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russian Federation
- Yuri Nikolsky
- Vavilov Institute of General Genetics, Moscow, Russia; F1 Genomics, San Diego, CA, USA; School of Systems Biology, George Mason University, VA, USA
- Dmitry Chebotarov
- International Rice Research Institute, Los Baños, Laguna 4031, Philippines
- Kenneth L McNally
- International Rice Research Institute, Los Baños, Laguna 4031, Philippines
11
Kainov YA, Aushev VN, Naumenko SA, Tchevkina EM, Bazykin GA. Complex Selection on Human Polyadenylation Signals Revealed by Polymorphism and Divergence Data. Genome Biol Evol 2016; 8:1971-9. [PMID: 27324920 PMCID: PMC4943204 DOI: 10.1093/gbe/evw137] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Accepted: 06/05/2016] [Indexed: 12/19/2022] Open
Abstract
Polyadenylation is a step of mRNA processing which is crucial for its expression and stability. The major polyadenylation signal (PAS) represents a nucleotide hexamer that adheres to the AATAAA consensus sequence. Over a half of human genes have multiple cleavage and polyadenylation sites, resulting in a great diversity of transcripts differing in function, stability, and translational activity. Here, we use available whole-genome human polymorphism data together with data on interspecies divergence to study the patterns of selection acting on PAS hexamers. Common variants of PAS hexamers are depleted of single nucleotide polymorphisms (SNPs), and SNPs within PAS hexamers have a reduced derived allele frequency (DAF) and increased conservation, indicating prevalent negative selection; at the same time, the SNPs that "improve" the PAS (i.e., those leading to higher cleavage efficiency) have increased DAF, compared to those that "impair" it. SNPs are rarer at PAS of "unique" polyadenylation sites (one site per gene); among alternative polyadenylation sites, at the distal PAS and at exonic PAS. Similar trends were observed in DAFs and divergence between species of placental mammals. Thus, selection permits PAS mutations mainly at redundant and/or weakly functional PAS. Nevertheless, a fraction of the SNPs at PAS hexamers likely affect gene functions; in particular, some of the observed SNPs are associated with disease.
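To make the notion of a PAS hexamer and its single-nucleotide variants concrete, here is a minimal Python sketch. It is an illustration only, not the authors' pipeline: the function names `find_pas_hexamers` and `single_nt_variants` are hypothetical, and only the AATAAA consensus is taken from the abstract.

```python
import re

def find_pas_hexamers(seq, motif="AATAAA"):
    """Return 0-based start positions of every occurrence of the motif.

    A zero-width lookahead is used so overlapping hits would also be reported.
    """
    seq = seq.upper()
    pattern = f"(?={re.escape(motif)})"
    return [m.start() for m in re.finditer(pattern, seq)]

def single_nt_variants(motif="AATAAA"):
    """Return all hexamers that differ from the motif at exactly one position.

    These are the sequences a single SNP inside the hexamer could produce.
    """
    variants = set()
    for i, ref in enumerate(motif):
        for alt in "ACGT":
            if alt != ref:
                variants.add(motif[:i] + alt + motif[i + 1:])
    return variants

if __name__ == "__main__":
    example = "GGAATAAACCAATAAATT"  # hypothetical sequence with two hits
    print(find_pas_hexamers(example))
    print(len(single_nt_variants()))
```

A six-base motif has 6 × 3 = 18 single-nucleotide neighbours, which is the set over which "improving" and "impairing" SNPs would be classified once cleavage-efficiency data are available.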
Affiliation(s)
- Yaroslav A Kainov
- Centre for Developmental Neurobiology, King's College London, London, United Kingdom; Oncogenes Regulation Department, N.N. Blokhin Russian Cancer Research Center, Institute of Carcinogenesis, Moscow, Russia
- Vasily N Aushev
- Oncogenes Regulation Department, N.N. Blokhin Russian Cancer Research Center, Institute of Carcinogenesis, Moscow, Russia; Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, New York
- Sergey A Naumenko
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia; Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Canada
- Elena M Tchevkina
- Oncogenes Regulation Department, N.N. Blokhin Russian Cancer Research Center, Institute of Carcinogenesis, Moscow, Russia
- Georgii A Bazykin
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia; Skolkovo Institute of Science and Technology, Skolkovo, Russia; Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Russia; Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russia; Pirogov Russian National Research Medical University, Moscow, Russia
12
Panchin AY, Makeev VJ, Medvedeva YA. Preservation of methylated CpG dinucleotides in human CpG islands. Biol Direct 2016; 11:11. [PMID: 27005429 PMCID: PMC4804638 DOI: 10.1186/s13062-016-0113-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/30/2015] [Accepted: 03/14/2016] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: CpG dinucleotides are extensively underrepresented in mammalian genomes. It is widely accepted that genome-wide CpG depletion is predominantly caused by an elevated CpG > TpG mutation rate due to frequent cytosine methylation in the CpG context. Meanwhile the CpG content in genomic regions called CpG islands (CGIs) is noticeably higher. This observation is usually explained by lower CpG > TpG substitution rates within CGIs due to reduced cytosine methylation levels. RESULTS: By combining genome-wide data on substitutions and methylation levels in several human cell types we have shown that cytosine methylation in human sperm cells was strongly and consistently associated with increased CpG > TpG substitution rates. In contrast, this correlation was not observed for embryonic stem cells or fibroblasts. Surprisingly, the decreased sperm CpG methylation level was insufficient to explain the reduced CpG > TpG substitution rates in CGIs. CONCLUSIONS: While cytosine methylation in human sperm cells is strongly associated with increased CpG > TpG substitution rates, substitution rates are significantly reduced within CGIs even after sperm CpG methylation levels and local GC content are controlled for. Our findings are consistent with strong negative selection preserving methylated CpGs within CGIs including intergenic ones.
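CpG underrepresentation is commonly quantified with an observed/expected ratio. The Python sketch below shows that widely used calculation as an illustration; it is not taken from this paper, and the function names `cpg_obs_exp` and `gc_content` are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G).

    Values well below 1 indicate CpG depletion; CpG islands typically
    show values much closer to 1. Returns 0.0 if C or G is absent.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)

if __name__ == "__main__":
    for s in ("CGCGCG", "CCGG", "CATGCATG"):  # toy sequences, not real data
        print(s, round(gc_content(s), 2), round(cpg_obs_exp(s), 2))
```

The ratio only describes composition. Separating the contributions of methylation-driven mutation and selection, as the abstract does, requires substitution and methylation data in addition.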
Affiliation(s)
- Alexander Y Panchin
- Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127994, Russia
- Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, GSP-1, 119991, Russia; Research Institute for Genetics and Selection of Industrial Microorganisms, Moscow, 117545, Russia; Moscow Institute of Physics and Technology, Moscow Region, 141700, Russia
- Yulia A Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, GSP-1, 119991, Russia; Center for Bioengineering, Research Center of Biotechnology RAS, Russian Academy of Science, Moscow, 117312, Russia
13
Francioli LC, Polak PP, Koren A, Menelaou A, Chun S, Renkens I, van Duijn CM, Swertz M, Wijmenga C, van Ommen G, Slagboom PE, Boomsma DI, Ye K, Guryev V, Arndt PF, Kloosterman WP, de Bakker PIW, Sunyaev SR. Genome-wide patterns and properties of de novo mutations in humans. Nat Genet 2015; 47:822-826. [PMID: 25985141 PMCID: PMC4485564 DOI: 10.1038/ng.3292] [Citation(s) in RCA: 252] [Impact Index Per Article: 28.0] [Received: 10/06/2014] [Accepted: 04/07/2015] [Indexed: 12/12/2022]
Abstract
Mutations create variation in the population, fuel evolution, and cause genetic diseases. Current knowledge about de novo mutations is incomplete and mostly indirect 1–10. Here, we analyze 11,020 de novo mutations from whole-genomes of 250 families. We show that de novo mutations in offspring of older fathers are not only more numerous 11–13 but also occur more frequently in early-replicating, genic regions. Functional regions exhibit higher mutation rates due to CpG dinucleotides and reveal signatures of transcription-coupled repair, while mutation clusters with a unique signature point to a novel mutational mechanism. Mutation and recombination rates independently associate with nucleotide diversity, and regional variation in human-chimpanzee divergence is only partly explained by mutation rate heterogeneity. Finally, we provide a genome-wide mutation rate map for medical and population genetics applications. Our results reveal novel insights and refine long-standing hypotheses about human mutagenesis.
Affiliation(s)
- Laurent C Francioli
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Paz P Polak
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Amnon Koren
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Androniki Menelaou
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Sung Chun
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Ivo Renkens
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Morris Swertz
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands; University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
- Cisca Wijmenga
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands; University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
- Gertjan van Ommen
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- P Eline Slagboom
- Section of Molecular Epidemiology, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
- Dorret I Boomsma
- Department of Biological Psychology, VU University Amsterdam, Amsterdam, The Netherlands
- Kai Ye
- Section of Molecular Epidemiology, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands; The Genome Institute, Washington University, St. Louis, MO, USA
- Victor Guryev
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Peter F Arndt
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Wigard P Kloosterman
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
- Paul I W de Bakker
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands; Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
- Shamil R Sunyaev
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
|
14
|
Xu L, Tang H, Chen DW, El-Naggar AK, Wei P, Sturgis EM. Genome-wide association study identifies common genetic variants associated with salivary gland carcinoma and its subtypes. Cancer 2015; 121:2367-74. [PMID: 25823930 DOI: 10.1002/cncr.29381] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/18/2014] [Revised: 01/20/2015] [Accepted: 02/09/2015] [Indexed: 01/20/2023]
Abstract
BACKGROUND: Salivary gland carcinomas (SGCs) are a rare malignancy with unknown etiology. The objective of the current study was to identify genetic variants modifying the risk of SGC and its major subtypes: adenoid cystic carcinoma and mucoepidermoid carcinoma. METHODS: The authors conducted a genome-wide association study in 309 well-defined SGC cases and 535 cancer-free controls. A single-nucleotide polymorphism (SNP)-level discovery study was performed in non-Hispanic white individuals followed by a replication study in Hispanic individuals. A logistic regression analysis was applied to calculate odds ratios (ORs) and 95% confidence intervals (95% CIs). A meta-analysis of the results was conducted. RESULTS: A genome-wide significant association with SGC in non-Hispanic white individuals was detected at coding SNPs in CHRNA2 (cholinergic receptor, nicotinic, alpha 2 [neuronal]) (OR, 8.55; 95% CI, 4.53-16.13 [P = 3.6 × 10⁻¹¹]), OR4F15 (olfactory receptor, family 4, subfamily F, member 15) (OR, 5.26; 95% CI, 3.13-8.83 [P = 3.5 × 10⁻¹⁰]), ZNF343 (zinc finger protein 343) (OR, 3.28; 95% CI, 2.12-5.07 [P = 9.1 × 10⁻⁸]), and PARP4 (poly(ADP-ribose) polymerase family, member 4) (OR, 2.00; 95% CI, 1.54-2.59 [P = 1.7 × 10⁻⁷]). Meta-analysis of the non-Hispanic white and Hispanic cohorts identified another genome-wide significant SNP in ELL2 (meta-OR, 1.86; 95% CI, 1.48-2.34 [P = 1.3 × 10⁻⁷]). Risk alleles were largely enriched in mucoepidermoid carcinoma, in which the SNPs in CHRNA2, OR4F15, and ZNF343 had ORs of 15.71 (95% CI, 6.59-37.47 [P = 5.2 × 10⁻¹⁰]), 15.60 (95% CI, 6.50-37.41 [P = 7.5 × 10⁻¹⁰]), and 6.49 (95% CI, 3.36-12.52 [P = 2.5 × 10⁻⁸]), respectively. None of these SNPs retained a significant association with adenoid cystic carcinoma. CONCLUSIONS: To the best of the authors' knowledge, the current study is the first to identify a panel of SNPs associated with the risk of SGC. Confirmation of these findings, along with functional analysis of the identified SNPs, is needed.
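The odds ratios and 95% confidence intervals reported in this abstract come from logistic regression. As a rough illustration only, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2×2 allele-carrier table; with a single binary predictor this matches what an unadjusted logistic regression would report. The function name and the counts are hypothetical and are not taken from the study, which may well have adjusted for covariates.

```python
import math

def odds_ratio_wald_ci(case_exposed, case_unexposed,
                       control_exposed, control_unexposed, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    z=1.96 gives an approximate 95% interval. All four cells must be > 0.
    """
    cells = (case_exposed, case_unexposed, control_exposed, control_unexposed)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only (not data from the study).
or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that spans 1, as in this made-up table, would not indicate a significant association at the 5% level.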
Affiliation(s)
- Li Xu
- Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Hongwei Tang
- Department of Gastrointestinal Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Diane W Chen
- Clinical Research, Quality Improvement, Baylor College of Medicine, Houston, Texas
- Adel K El-Naggar
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Peng Wei
- Division of Biostatistics and Human Genetics Center, School of Public Health, The University of Texas Health Science Center, Houston, Texas
- Erich M Sturgis
- Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, Houston, Texas; Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas
|
15
|
Lagging-strand replication shapes the mutational landscape of the genome. Nature 2015; 518:502-506. [PMID: 25624100 PMCID: PMC4374164 DOI: 10.1038/nature14183] [Citation(s) in RCA: 168] [Impact Index Per Article: 18.7] [Received: 10/14/2014] [Accepted: 01/05/2015] [Indexed: 12/21/2022]
Abstract
The origin of mutations is central to understanding evolution and of key relevance to health. Variation occurs non-randomly across the genome, and mechanisms for this remain to be defined. Here, we report that the 5′-ends of Okazaki fragments have significantly elevated levels of nucleotide substitution, indicating a replicative origin for such mutations. With a novel method, emRiboSeq, we map the genome-wide contribution of polymerases, and show that despite Okazaki fragment processing, DNA synthesised by error-prone Pol-α is retained in vivo, comprising ~1.5% of the mature genome. We propose that DNA-binding proteins that rapidly re-associate post-replication act as partial barriers to Pol-δ mediated displacement of Pol-α synthesised DNA, resulting in incorporation of such Pol-α tracts and elevated mutation rates at specific sites. We observe a mutational cost to chromatin and regulatory protein binding, resulting in mutation hotspots at regulatory elements, with signatures of this process detectable in both yeast and humans.
|
16
|
Abstract
A role for somatic mutations in carcinogenesis is well accepted, but the degree to which mutation rates influence cancer initiation and development is under continuous debate. Recently accumulated genomic data have revealed that thousands of tumour samples are riddled by hypermutation, broadening support for the idea that many cancers acquire a mutator phenotype. This major expansion of cancer mutation data sets has provided unprecedented statistical power for the analysis of mutation spectra, which has confirmed several classical sources of mutation in cancer, highlighted new prominent mutation sources (such as apolipoprotein B mRNA editing enzyme catalytic polypeptide-like (APOBEC) enzymes) and empowered the search for cancer drivers. The confluence of cancer mutation genomics and mechanistic insight provides great promise for understanding the basic development of cancer through mutations.
|
17
|
McCole RB, Fonseka CY, Koren A, Wu CT. Abnormal dosage of ultraconserved elements is highly disfavored in healthy cells but not cancer cells. PLoS Genet 2014; 10:e1004646. [PMID: 25340765 PMCID: PMC4207606 DOI: 10.1371/journal.pgen.1004646] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 02/10/2014] [Accepted: 08/04/2014] [Indexed: 12/17/2022] Open
Abstract
Ultraconserved elements (UCEs) are strongly depleted from segmental duplications and copy number variations (CNVs) in the human genome, suggesting that deletion or duplication of a UCE can be deleterious to the mammalian cell. Here we address the process by which CNVs become depleted of UCEs. We begin by showing that depletion for UCEs characterizes the most recent large-scale human CNV datasets and then find that even newly formed de novo CNVs, which have passed through meiosis at most once, are significantly depleted for UCEs. In striking contrast, CNVs arising specifically in cancer cells are, as a rule, not depleted for UCEs and can even become significantly enriched. This observation raises the possibility that CNVs that arise somatically and are relatively newly formed are less likely to have established a CNV profile that is depleted for UCEs. Alternatively, lack of depletion for UCEs from cancer CNVs may reflect the diseased state. In support of this latter explanation, somatic CNVs that are not associated with disease are depleted for UCEs. Finally, we show that it is possible to observe the CNVs of induced pluripotent stem (iPS) cells become depleted of UCEs over time, suggesting that depletion may be established through selection against UCE-disrupting CNVs without the requirement for meiotic divisions.
Affiliation(s)
- Ruth B. McCole
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Chamith Y. Fonseka
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Biological and Biomedical Sciences PhD program, Harvard Medical School, Boston, Massachusetts, United States of America
- Amnon Koren
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- C.-ting Wu
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
|
18
|
Abstract
Mutational heterogeneity must be taken into account when reconstructing evolutionary histories, calibrating molecular clocks, and predicting links between genes and disease. Selective pressures and various DNA transactions have been invoked to explain the heterogeneous distribution of genetic variation between species, within populations, and in tissue-specific tumors. To examine relationships between such heterogeneity and variations in leading- and lagging-strand replication fidelity and mismatch repair, we accumulated 40,000 spontaneous mutations in eight diploid yeast strains in the absence of selective pressure. We found that replicase error rates vary by fork direction, coding state, nucleosome proximity, and sequence context. Further, error rates and DNA mismatch repair efficiency both vary by mismatch type, responsible polymerase, replication time, and replication origin proximity. Mutation patterns implicate replication infidelity as one driver of variation in somatic and germline evolution, suggest mechanisms of mutual modulation of genome stability and composition, and predict future observations in specific cancers.
|
19
|
Ségurel L, Wyman MJ, Przeworski M. Determinants of Mutation Rate Variation in the Human Germline. Annu Rev Genomics Hum Genet 2014; 15:47-70. [DOI: 10.1146/annurev-genom-031714-125740] [Citation(s) in RCA: 232] [Impact Index Per Article: 23.2] [Indexed: 11/09/2022]
Affiliation(s)
- Laure Ségurel
- Laboratoire Éco-Anthropologie et Ethnobiologie, UMR 7206, Muséum National d'Histoire Naturelle–Centre National de la Recherche Scientifique–Université Paris 7 Diderot, Paris 75231, France
- Minyoung J. Wyman
- Department of Biological Sciences, Columbia University, New York, NY 10027
- Molly Przeworski
- Department of Human Genetics and Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois 60637
|
20
|
Veeramah KR, Gutenkunst RN, Woerner AE, Watkins JC, Hammer MF. Evidence for increased levels of positive and negative selection on the X chromosome versus autosomes in humans. Mol Biol Evol 2014; 31:2267-82. [PMID: 24830675 DOI: 10.1093/molbev/msu166] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 12/23/2022] Open
Abstract
Partially recessive variants under positive selection are expected to go to fixation more quickly on the X chromosome as a result of hemizygosity, an effect known as faster-X. Conversely, purifying selection is expected to reduce substitution rates more effectively on the X chromosome. Previous work in humans contrasted divergence on the autosomes and X chromosome, with results tending to support the faster-X effect. However, no study has yet incorporated both divergence and polymorphism to quantify the effects of both purifying and positive selection, which are opposing forces with respect to divergence. In this study, we develop a framework that integrates previously developed theory addressing differential rates of X and autosomal evolution with methods that jointly estimate the level of purifying and positive selection via modeling of the distribution of fitness effects (DFE). We then utilize this framework to estimate the proportion of nonsynonymous substitutions fixed by positive selection (α) using exome sequence data from a West African population. We find that varying the female to male breeding ratio (β) has minimal impact on the DFE for the X chromosome, especially when compared with the effect of varying the dominance coefficient of deleterious alleles (h). Estimates of α range from 46% to 51% and from 4% to 24% for the X chromosome and autosomes, respectively. While dependent on h, the magnitude of the difference between α values estimated for these two systems is highly statistically significant over a range of biologically realistic parameter values, suggesting faster-X has been operating in humans.
Affiliation(s)
- Krishna R Veeramah
- Arizona Research Laboratories Division of Biotechnology, University of Arizona; Department of Ecology and Evolution, Stony Brook University
- August E Woerner
- Arizona Research Laboratories Division of Biotechnology, University of Arizona
- Michael F Hammer
- Arizona Research Laboratories Division of Biotechnology, University of Arizona
|
21
|
Malyarchuk BA. Mutational process in protein-coding genes of human mitochondrial genome in context of evolution of Homo genus. Mol Biol 2013. [DOI: 10.1134/s0026893313060083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
|
22
|
Polychronakos C, Li Q. Understanding type 1 diabetes through genetics: advances and prospects. Nat Rev Genet 2011; 12:781-92. [PMID: 22005987 DOI: 10.1038/nrg3069] [Citation(s) in RCA: 157] [Impact Index Per Article: 12.1] [Indexed: 12/14/2022]
Abstract
Starting with early crucial discoveries of the role of the major histocompatibility complex, genetic studies have long had a role in understanding the biology of type 1 diabetes (T1D), which is one of the most heritable common diseases. Recent genome-wide association studies (GWASs) have given us a clearer picture of the allelic architecture of genetic susceptibility to T1D. Fine mapping and functional studies are gradually revealing the complex mechanisms whereby immune self-tolerance is lost, involving multiple aspects of adaptive immunity. The triggering of these events by dysregulation of the innate immune system has also been implicated by genetic evidence. Finally, genetic prediction of T1D risk is showing promise of use for preventive strategies.
Affiliation(s)
- Constantin Polychronakos
- Departments of Pediatrics and Human Genetics, McGill University, Montreal, Québec, Canada H3H 1P3
|
23
|
Brown CA, Scharner J, Felice K, Meriggioli MN, Tarnopolsky M, Bower M, Zammit PS, Mendell JR, Ellis JA. Novel and recurrent EMD mutations in patients with Emery–Dreifuss muscular dystrophy, identify exon 2 as a mutation hot spot. J Hum Genet 2011; 56:589-94. [DOI: 10.1038/jhg.2011.65] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]
|
24
|
Ying H, Huttley G. Exploiting CpG hypermutability to identify phenotypically significant variation within human protein-coding genes. Genome Biol Evol 2011; 3:938-49. [PMID: 21398426 PMCID: PMC3184784 DOI: 10.1093/gbe/evr021] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
The CpG dinucleotide is disproportionately represented in human genetic variation due to the hypermutability of 5-methyl-cytosine (5mC). We exploit this hypermutability and a novel codon substitution model to identify candidate functionally important exonic nucleotides. Population genetic theory suggests that codon positions with high cross-species CpG frequency will derive from stronger purifying selection. Using the phylogeny-based maximum likelihood inference framework, we applied codon substitution models with context-dependent parameters to measure the mutagenic and selective processes affecting CpG dinucleotides within exonic sequence. The suitability of these models was validated on >2,000 protein coding genes from a naturally occurring biological control, four yeast species that do not methylate their DNA. As expected, our analyses of yeast revealed no evidence for an elevated CpG transition rate or for substitution suppression affecting CpG-containing codons. Our analyses of >12,000 protein-coding genes from four primate lineages confirm the systemic influence of 5mC hypermutability on the divergence of these genes. After adjusting for confounding influences of mutation and the properties of the encoded amino acids, we confirmed that CpG-containing codons are under greater purifying selection in primates. Genes with significant evidence of enhanced suppression of nonsynonymous CpG changes were also shown to be significantly enriched in Online Mendelian Inheritance in Man. We developed a method for ranking candidate phenotypically influential CpG positions in human genes. Application of this method indicates that of the ∼1 million exonic CpG dinucleotides within humans, ∼20% are strong candidates for both hypermutability and disease association.
Affiliation(s)
- Hua Ying
- Department of Genome Biology, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 0200, Australia
|
25
|
Necşulea A, Popa A, Cooper DN, Stenson PD, Mouchiroud D, Gautier C, Duret L. Meiotic recombination favors the spreading of deleterious mutations in human populations. Hum Mutat 2011; 32:198-206. [PMID: 21120948 DOI: 10.1002/humu.21407] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 06/24/2010] [Accepted: 10/28/2010] [Indexed: 11/09/2022]
Abstract
Although mutations that are detrimental to the fitness of organisms are expected to be rapidly purged from populations by natural selection, some disease-causing mutations are present at high frequencies in human populations. Several nonexclusive hypotheses have been proposed to account for this apparent paradox (high new mutation rate, genetic drift, overdominance, or recent changes in selective pressure). However, the factors ultimately responsible for the presence at high frequency of disease-causing mutations are still contentious. Here we establish the existence of an additional process that contributes to the spreading of deleterious mutations: GC-biased gene conversion (gBGC), a process associated with recombination that tends to favor the transmission of GC-alleles over AT-alleles. We show that the spectrum of amino acid-altering polymorphisms in human populations exhibits the footprints of gBGC. This pattern cannot be explained in terms of selection and is evident with all nonsynonymous mutations, including those predicted to be detrimental to protein structure and function, and those implicated in human genetic disease. We present simulations to illustrate the conditions under which gBGC can extend the persistence time of deleterious mutations in a finite population. These results indicate that gBGC meiotic drive contributes to the spreading of deleterious mutations in human populations.
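The transmission bias described in this abstract can be sketched with a toy Wright-Fisher model. This is a minimal illustration under stated assumptions, not the authors' simulation: heterozygotes are assumed to transmit the GC allele with probability (1 + b)/2, which under random mating gives an expected frequency of p + b·p·(1 − p) in the next generation, and selection against the allele is left out entirely. The function names and the parameter `b` are hypothetical.

```python
import random

def gbgc_expected_freq(p, b):
    """Expected GC-allele frequency after one generation of biased transmission.

    Assumes random mating and that heterozygotes pass on the GC allele with
    probability (1 + b) / 2; b = 0 means no bias.
    """
    return p + b * p * (1.0 - p)

def simulate_wright_fisher(p0, b, n_copies, max_generations, seed=0):
    """Trajectory of GC-allele frequency under drift plus gBGC (no selection)."""
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(max_generations):
        if p in (0.0, 1.0):  # allele lost or fixed
            break
        expected = gbgc_expected_freq(p, b)
        # Binomial sampling of n_copies gene copies for the next generation.
        count = sum(1 for _ in range(n_copies) if rng.random() < expected)
        p = count / n_copies
        trajectory.append(p)
    return trajectory

# Hypothetical parameters, chosen only to show the call.
traj = simulate_wright_fisher(p0=0.1, b=0.02, n_copies=200, max_generations=500, seed=1)
print(len(traj) - 1, "generations simulated; final frequency", traj[-1])
```

Adding a selection coefficient against the GC allele to the expected-frequency step would be the natural extension for studying how the bias offsets purifying selection.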
Affiliation(s)
- Anamaria Necşulea
- Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, France
|
26
|
Lu J, Wang K, Rodova M, Esteves R, Berry D, E L, Crafter A, Barrett M, Cardoso SM, Onyango I, Parker WD, Fontes J, Burns JM, Swerdlow RH. Polymorphic variation in cytochrome oxidase subunit genes. J Alzheimers Dis 2010; 21:141-54. [PMID: 20413852 DOI: 10.3233/jad-2010-100123] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 11/15/2022]
Abstract
Cytochrome oxidase (COX) activity varies between individuals and low activities associate with Alzheimer's disease. Whether genetic heterogeneity influences function of this multimeric enzyme is unknown. To explore this we sequenced three mitochondrial DNA (mtDNA) and ten nuclear COX subunit genes from at least 50 individuals. 20% had non-synonymous mtDNA COX gene polymorphisms, 12% had a COX4I1 non-synonymous G to A transition, and other genes rarely contained non-synonymous polymorphisms. Frequent untranslated region (UTR) polymorphisms were seen in COX6A1, COX6B1, COX6C, and COX7A1; heterogeneity in a COX7A1 5' UTR Sp1 site was extensive. Synonymous polymorphisms were common and less frequent in the more conserved COX1 than the less conserved COX3, suggesting at least in mtDNA synonymous polymorphisms experience selection pressure and are not functionally silent. Compound gene variations occurred within individuals. To test whether variations could have functional consequences, we studied the COX4I1 G to A transition and an AGCCCC deletion in the COX7A1 5' UTR Sp1 site. Cells expressing the COX4I1 polymorphism had reduced COX Vmax activity. In reporter construct-transduced cells where green fluorescent protein expression depended on the COX7A1 Sp1 site, AGCCCC deletion reduced fluorescence. Our findings indicate COX subunit gene heterogeneity is pervasive and may mediate COX functional variation.
Affiliation(s)
- Jianghua Lu
- Department of Neurology, University of Kansas School of Medicine, Kansas City, KS 66160, USA
|
27
|
Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods 2010; 7:248-9. [PMID: 20354512 PMCID: PMC2855889 DOI: 10.1038/nmeth0410-248] [Citation(s) in RCA: 9999] [Impact Index Per Article: 714.2] [Indexed: 11/08/2022]
Affiliation(s)
- Ivan A. Adzhubei
- Division of Genetics, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Steffen Schmidt
- Department of Biochemistry, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Leonid Peshkin
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
- Vasily E. Ramensky
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Anna Gerasimova
- Life Sciences Institute and Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, USA
- Peer Bork
- European Molecular Biology Laboratory, Heidelberg, Germany
- Alexey S. Kondrashov
- Life Sciences Institute and Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, USA
- Shamil R. Sunyaev
- Division of Genetics, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
|
28
|
Relative mutation rates of each nucleotide for another estimated from allele frequency spectra at human gene loci. Genet Res (Camb) 2009; 91:293-303. [PMID: 19640324 DOI: 10.1017/s0016672309990164] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022] Open
Abstract
This study aims to comprehensively examine the mutation rates of one base for another in human gene loci. In contrast to most previous efforts based on divergence data from untranscribed regions, the present study employs the basic theory of the reversible recurrent mutation model using large-scale, high-quality re-sequencing data from public databases of gene loci. Population mutation parameters (4Nν and 4Nμ) are obtained for each pair of base substitutions. The estimated parameters show good strand reversal symmetry, supporting the existence of mutation-drift equilibrium. Analysis of specific gene regions including mRNA, coding sequence (CDS), 5'-untranslated region (5'-UTR), 3'-UTR and intron shows that there are clear differences in the mutation rates of each base for another depending on the location of the base in question. Results from analyses that take the adjacent bases into account exhibit excellent strand reversal symmetry, confirming that the identity of an adjacent base influences mutation rates. The CpG to TpG (or CpG to CpA) substitution is found at a rate approximately seven-fold higher than the reverse transition in intron regions due to cytosine deamination, but the effect is strongly reduced in mRNA regions and almost entirely lost in 5'-UTRs. However, from the overall increased transitions in sites other than CpGs and the proportion of CpGs in the total sequence, CpG methylation is not the main factor responsible for the increased rate of transitions as compared with transversions. In this report, after adjusting average mutation rates to the sequence compositions, no substitution bias is found between A+T and C+G, indicating base composition equilibrium in human gene loci. Population differences are also identified between groups of people of African and European descent, presumably due to past population histories. By applying the basic theory of population genetics to re-sequenced data, this study contributes new, detailed information regarding mutations in human gene regions.
|
29
|
Li JB, Gao Y, Aach J, Zhang K, Kryukov GV, Xie B, Ahlford A, Yoon JK, Rosenbaum AM, Zaranek AW, LeProust E, Sunyaev SR, Church GM. Multiplex padlock targeted sequencing reveals human hypermutable CpG variations. Genome Res 2009; 19:1606-15. [PMID: 19525355 DOI: 10.1101/gr.092213.109] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Indexed: 11/24/2022]
Abstract
Utilizing the full power of next-generation sequencing often requires the ability to perform large-scale multiplex enrichment of many specific genomic loci in multiple samples. Several technologies have been recently developed but await substantial improvements. We report the 10,000-fold improvement of a previously developed padlock-based approach, and apply the assay to identifying genetic variations in hypermutable CpG regions across human chromosome 21. From approximately 3 million reads derived from a single Illumina Genome Analyzer lane, approximately 94% (approximately 50,500) target sites can be observed with at least one read. The uniformity of coverage was also greatly improved; up to 93% and 57% of all targets fell within a 100- and 10-fold coverage range, respectively. Alleles at >400,000 target base positions were determined across six subjects and examined for single nucleotide polymorphisms (SNPs), and the concordance with independently obtained genotypes was 98.4%-100%. We detected >500 SNPs not currently in dbSNP, 362 of which were in targeted CpG locations. Transitions in CpG sites were at least 13.7 times more abundant than non-CpG transitions. Fractions of polymorphic CpG sites are lower in CpG-rich regions and show higher correlation with human-chimpanzee divergence within CpG versus non-CpG sites. This is consistent with the hypothesis that methylation rate heterogeneity along chromosomes contributes to mutation rate variation in humans. Our success suggests that targeted CpG resequencing is an efficient way to identify common and rare genetic variations. In addition, the significantly improved padlock capture technology can be readily applied to other projects that require multiplex sample preparation.
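The comparison of CpG and non-CpG transitions in this abstract rests on two simple classifications: whether a substitution is a transition or a transversion, and whether the affected base sits in a CpG dinucleotide. The sketch below shows one way to make both calls on a forward-strand sequence. The function names are hypothetical and this is not the authors' pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_class(ref, alt):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    same_ring_type = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_ring_type else "transversion"

def in_cpg(seq, pos):
    """True if the base at 0-based index pos is the C or the G of a CpG
    dinucleotide on the given strand."""
    seq = seq.upper()
    base = seq[pos]
    if base == "C":
        return pos + 1 < len(seq) and seq[pos + 1] == "G"
    if base == "G":
        return pos > 0 and seq[pos - 1] == "C"
    return False

# Illustrative sequence only: a C>T change at index 1 of "ACGT".
print(substitution_class("C", "T"), in_cpg("ACGT", 1))
```

Counting transitions separately for sites where `in_cpg` is true and false, and normalising by the number of sites in each class, gives the kind of ratio the abstract reports.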
Affiliation(s)
- Jin Billy Li
- Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA.
|