1
van der Kuyl AC. Mutation Rate Variation and Other Challenges in 2-LTR Dating of Primate Endogenous Retrovirus Integrations. J Mol Evol 2025; 93:62-82. [PMID: 39715846 DOI: 10.1007/s00239-024-10225-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Accepted: 12/07/2024] [Indexed: 12/25/2024]
Abstract
The time of integration of germline-targeting Long Terminal Repeat (LTR) retroposons, such as endogenous retroviruses (ERVs), can be estimated by assessing the nucleotide divergence between the LTR sequences flanking the viral genes. Due to the viral replication mechanism, both LTRs are identical at the moment of integration, when the provirus becomes part of the host genome. After that time, proviral sequences evolve within the host DNA. When the mutation rate is known, nucleotide divergence between the LTRs would then be a measure of time elapsed since integration. Though frequently used, the approach has been complicated by the choice of host mutation rate and, to a lesser extent, by the method selected to estimate nucleotide divergence. As a result, outcomes can be incompatible with, for instance, speciation events identified from the fossil record. The review will give an overview of research reporting LTR-retroposon dating, and a summary of important factors to consider, including the quality, assembly, and alignment of sequences, the mutation rate of foreign DNA in host genomes, and the choice of a distance estimation method. Primates will here be the focus of the analysis because their genomes, ERVs, and fossil record have been extensively studied. However, most of the factors discussed have a wide applicability in the vertebrate field.
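The dating logic summarized in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the method of the review: the function names are hypothetical, the Jukes-Cantor (JC69) correction is just one of several possible distance estimators, and the mutation rate passed in is an assumption the user must supply. The age follows from T = d / (2μ), because both LTRs accumulate substitutions independently after integration.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two aligned LTRs.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to equal length")
    compared = 0
    diffs = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return diffs / compared


def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected distance (substitutions per site)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def integration_age(ltr5: str, ltr3: str, mu: float, correct: bool = True) -> float:
    """Estimated time since integration, T = d / (2 * mu).

    mu is the assumed host substitution rate per site per year; the
    factor 2 reflects that both LTRs diverge independently.
    """
    p = p_distance(ltr5, ltr3)
    d = jc69_distance(p) if correct else p
    return d / (2.0 * mu)
```

For example, `integration_age(five_prime_ltr, three_prime_ltr, mu=1e-9)` would return an age in years under an assumed rate of 1e-9 substitutions per site per year; as the abstract stresses, the result is only as reliable as that rate and the alignment behind it.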
Affiliation(s)
- Antoinette Cornelia van der Kuyl
- Laboratory of Experimental Virology, Department of Medical Microbiology and Infection Prevention, Amsterdam UMC, Location AMC, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands.
- Amsterdam Institute for Immunology & Infectious Diseases, 1100 DD, Amsterdam, The Netherlands.
2
Urra C, Sanhueza D, Pavez C, Tapia P, Núñez-Lillo G, Minio A, Miossec M, Blanco-Herrera F, Gainza F, Castro A, Cantu D, Meneses C. Identification of grapevine clones via high-throughput amplicon sequencing: a proof-of-concept study. G3 (Bethesda, Md.) 2023; 13:jkad145. [PMID: 37395733 PMCID: PMC10468313 DOI: 10.1093/g3journal/jkad145] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/22/2023] [Revised: 05/22/2023] [Accepted: 06/14/2023] [Indexed: 07/04/2023]
Abstract
Wine cultivars are available to growers in multiple clonal selections with agronomic and enological differences. Phenotypic differences between clones originated from somatic mutations that accrued over thousands of asexual propagation cycles. Genetic diversity between grape cultivars remains unexplored, and tools to unequivocally discriminate clones have been lacking. This study aimed to uncover genetic variations among a group of clonal selections of 4 important Vitis vinifera cultivars: Cabernet sauvignon, Sauvignon blanc, Chardonnay, and Merlot, and to use this information to develop genetic markers that discriminate the clones of these cultivars. We sequenced the genomes of 18 clones with short-read sequencing technology, including biological replicates, for a total of 46 genomes. Sequences were aligned to their respective cultivar's reference genome for variant calling. We used reference genomes of Cabernet sauvignon, Chardonnay, and Merlot and developed a de novo genome assembly of Sauvignon blanc using long-read sequencing. On average, 4 million variants were detected for each clone, with 74.2% being single nucleotide variants and 25.8% being small insertions or deletions (InDels). The frequency of these variants was consistent across all clones. From these variants, we validated 46 clonal markers using high-throughput amplicon sequencing for 77.7% of the evaluated clones, most of them small InDels. These results represent an advance in grapevine genotyping strategies and will benefit the viticulture industry in the characterization and identification of plant material.
Affiliation(s)
- Claudio Urra
- UC Davis-Chile, Life Sciences Innovation Center, Santiago 7520424, Chile
- Centro de Biotecnología Vegetal, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago 8370186, Chile
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Ingeniería y Tecnología, Universidad Mayor, Santiago 8580745, Chile
- Dayan Sanhueza
- Centro de Biotecnología Vegetal, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago 8370186, Chile
- Catalina Pavez
- UC Davis-Chile, Life Sciences Innovation Center, Santiago 7520424, Chile
- Centro de Biotecnología Vegetal, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago 8370186, Chile
- Patricio Tapia
- Centro de Biotecnología Vegetal, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago 8370186, Chile
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago 8331150, Chile
- Gerardo Núñez-Lillo
- Escuela de Agronomía, Facultad de Ciencias Agronómicas y de los Alimentos, Pontificia Universidad Católica de Valparaíso, Quillota 2263782, Chile
- Andrea Minio
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616-5270, USA
- Matthieu Miossec
- Center for Bioinformatics and Integrative Biology, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago 8370186, Chile
- Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Francisca Blanco-Herrera
- Centro de Biotecnología Vegetal, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago 8370186, Chile
- ANID—Millennium Science Initiative Program—Millennium Nucleus for the Development of Super Adaptable Plants (MN-SAP), Santiago 8331150, Chile
- Felipe Gainza
- Center for Research and Innovation, Viña Concha y Toro S.A, Pencahue, Talca 3460000, Chile
- Alvaro Castro
- UC Davis-Chile, Life Sciences Innovation Center, Santiago 7520424, Chile
- Dario Cantu
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616-5270, USA
- Claudio Meneses
- ANID—Millennium Science Initiative Program—Millennium Nucleus for the Development of Super Adaptable Plants (MN-SAP), Santiago 8331150, Chile
- Departamento de Fruticultura y Enología, Facultad de Agronomía e Ingeniería Forestal, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago 8331150, Chile
- ANID—Millennium Science Initiative Program Millenium Institute Center for Genome Regulation, CRG, Santiago 8331150, Chile
3
Wu C, Paradis NJ, Lakernick PM, Hryb M. L-shaped distribution of the relative substitution rate (c/μ) observed for SARS-COV-2's genome, inconsistent with the selectionist theory, the neutral theory and the nearly neutral theory but a near-neutral balanced selection theory: Implication on "neutralist-selectionist" debate. Comput Biol Med 2023; 153:106522. [PMID: 36638615 PMCID: PMC9814386 DOI: 10.1016/j.compbiomed.2022.106522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2022] [Revised: 12/17/2022] [Accepted: 12/31/2022] [Indexed: 01/07/2023]
Abstract
The genomic substitution rate (GSR) of SARS-CoV-2 exhibits a molecular clock feature and does not change under fluctuating environmental factors such as the infected human population (10^0-10^7), vaccination, etc. The molecular clock feature is believed to be inconsistent with the selectionist theory (ST). The GSR shows a lack of dependence on the effective population size, suggesting Ohta's nearly neutral theory (ONNT) is not applicable to this virus. Large variation of the substitution rate within its genome is also inconsistent with Kimura's neutral theory (KNT). Thus, all three existing evolution theories fail to explain the evolutionary nature of this virus. In this paper, we proposed a Segment Substitution Rate Model (SSRM) under non-neutral selections and pointed out that a balanced mechanism between negative and positive selection of some segments could also lead to the molecular clock feature. We named this hybrid mechanism the near-neutral balanced selection theory (NNBST) and examined whether it was followed by SARS-CoV-2 using the three independent sets of SARS-CoV-2 genomes selected by the Nextstrain team. Intriguingly, the relative substitution rate of this virus exhibited an L-shaped probability distribution consistent with NNBST rather than the Poisson distribution predicted by KNT, an asymmetric distribution predicted by ONNT in which nearly neutral sites are believed to be slightly deleterious only, or the distribution lacking nearly neutral sites predicted by ST. The time-dependence of the substitution rates for some segments and their correlation with vaccination were observed, supporting NNBST. Our relative substitution rate method provides a tool to resolve the long-standing "neutralist-selectionist" controversy. Implications of NNBST for resolving Lewontin's Paradox are also discussed.
Affiliation(s)
- Chun Wu
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA; Department of Biological & Biomedical Sciences, Rowan University, Glassboro, NJ, 08028, USA.
- Nicholas J Paradis
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA
- Phillip M Lakernick
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA
- Mariya Hryb
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ, 08028, USA
4
Duchemin L, Lanore V, Veber P, Boussau B. Evaluation of Methods to Detect Shifts in Directional Selection at the Genome Scale. Mol Biol Evol 2022; 40:6889995. [PMID: 36510704 PMCID: PMC9940701 DOI: 10.1093/molbev/msac247] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/16/2022] [Revised: 10/24/2022] [Accepted: 10/26/2022] [Indexed: 12/15/2022]
Abstract
Identifying the footprints of selection in coding sequences can inform about the importance and function of individual sites. Analyses of the ratio of nonsynonymous to synonymous substitutions (dN/dS) have been widely used to pinpoint changes in the intensity of selection, but cannot distinguish them from changes in the direction of selection, that is, changes in the fitness of specific amino acids at a given position. A few methods that rely on amino-acid profiles to detect changes in directional selection have been designed, but their performances have not been well characterized. In this paper, we investigate the performance of six of these methods. We evaluate them on simulations along empirical phylogenies in which transition events have been annotated and compare their ability to detect sites that have undergone changes in the direction or intensity of selection to that of a widely used dN/dS approach, codeml's branch-site model A. We show that all methods have reduced performance in the presence of biased gene conversion but not CpG hypermutability. The best profile method, Pelican, a new implementation of Tamuri AU, Hay AJ, Goldstein RA. (2009. Identifying changes in selective constraints: host shifts in influenza. PLoS Comput Biol. 5(11):e1000564), performs as well as codeml in a range of conditions except for detecting relaxations of selection, and performs better when tree length increases, or in the presence of persistent positive selection. It is fast, enabling genome-scale searches for site-wise changes in the direction of selection associated with phenotypic changes.
Affiliation(s)
- Vincent Lanore
- Laboratoire de Biométrie et Biologie Evolutive, Univ Lyon, Univ Lyon 1, CNRS, VetAgro Sup, UMR5558, Villeurbanne, France
- Philippe Veber
- Laboratoire de Biométrie et Biologie Evolutive, Univ Lyon, Univ Lyon 1, CNRS, VetAgro Sup, UMR5558, Villeurbanne, France
5
Lu FH, McKenzie N, Gardiner LJ, Luo MC, Hall A, Bevan MW. Reduced chromatin accessibility underlies gene expression differences in homologous chromosome arms of diploid Aegilops tauschii and hexaploid wheat. Gigascience 2020; 9:5860314. [PMID: 32562491 PMCID: PMC7305686 DOI: 10.1093/gigascience/giaa070] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/15/2019] [Revised: 02/17/2020] [Accepted: 06/02/2020] [Indexed: 12/18/2022]
Abstract
Background: Polyploidy is centrally important in the evolution and domestication of plants because it leads to major genomic changes, such as altered patterns of gene expression, which are thought to underlie the emergence of new traits. Despite the common occurrence of these globally altered patterns of gene expression in polyploids, the mechanisms involved are not well understood. Results: Using a precisely defined framework of highly conserved syntenic genes on hexaploid wheat chromosome 3DL and its progenitor 3L chromosome arm of diploid Aegilops tauschii, we show that 70% of these gene pairs exhibited proportionately reduced gene expression, in which expression in the hexaploid context of the 3DL genes was ∼40% of the levels observed in diploid Ae tauschii. Several genes showed elevated expression during the later stages of grain development in wheat compared with Ae tauschii. Gene sequence and methylation differences probably accounted for only a few cases of differences in gene expression. In contrast, chromosome-wide patterns of reduced chromatin accessibility of genes in the hexaploid chromosome arm compared with its diploid progenitor were correlated with both reduced gene expression and the imposition of new patterns of gene expression. Conclusions: Our pilot-scale analyses show that chromatin compaction may orchestrate reduced gene expression levels in the hexaploid chromosome arm of wheat compared to its diploid progenitor chromosome arm.
Affiliation(s)
- Fu-Hao Lu
- Department Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Neil McKenzie
- Department Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Laura-Jayne Gardiner
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Ming-Cheng Luo
- Department of Plant Sciences, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Anthony Hall
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Michael W Bevan
- Correspondence address. Michael W Bevan, Department Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK. E-mail:
6
Vondras AM, Minio A, Blanco-Ulate B, Figueroa-Balderas R, Penn MA, Zhou Y, Seymour D, Ye Z, Liang D, Espinoza LK, Anderson MM, Walker MA, Gaut B, Cantu D. The genomic diversification of grapevine clones. BMC Genomics 2019; 20:972. [PMID: 31830913 PMCID: PMC6907202 DOI: 10.1186/s12864-019-6211-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 04/23/2019] [Accepted: 10/22/2019] [Indexed: 12/14/2022]
Abstract
Background: Vegetatively propagated clones accumulate somatic mutations. The purpose of this study was to better appreciate clone diversity and involved defining the nature of somatic mutations throughout the genome. Fifteen Zinfandel winegrape clone genomes were sequenced and compared to one another using a highly contiguous genome reference produced from one of the clones, Zinfandel 03. Results: Though most heterozygous variants were shared, somatic mutations accumulated in individual and subsets of clones. Overall, heterozygous mutations were most frequent in intergenic space and more frequent in introns than exons. A significantly larger percentage of CpG, CHG, and CHH sites in repetitive intergenic space experienced transition mutations than in genic and non-repetitive intergenic spaces, likely because of higher levels of methylation in the region and because methylated cytosines often spontaneously deaminate. Of the minority of mutations that occurred in exons, larger proportions of these were putatively deleterious when they occurred in relatively few clones. Conclusions: These data support three major conclusions. First, repetitive intergenic space is a major driver of clone genome diversification. Second, clones accumulate putatively deleterious mutations. Third, the data suggest selection against deleterious variants in coding regions or some mechanism by which mutations are less frequent in coding than noncoding regions of the genome.
Affiliation(s)
- Amanda M Vondras
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Andrea Minio
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Barbara Blanco-Ulate
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA; Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Rosa Figueroa-Balderas
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Michael A Penn
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Yongfeng Zhou
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92617, USA
- Danelle Seymour
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92617, USA
- Zirou Ye
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Dingren Liang
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Lucero K Espinoza
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Michael M Anderson
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- M Andrew Walker
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA
- Brandon Gaut
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92617, USA
- Dario Cantu
- Department of Viticulture and Enology, University of California Davis, Davis, CA, 95616, USA.
7
Zhao A, Zhao C, Tateishi-Karimata H, Ren J, Sugimoto N, Qu X. Incorporation of O(6)-methylguanine restricts the conformational conversion of the human telomere G-quadruplex under molecular crowding conditions. Chem Commun (Camb) 2016; 52:1903-6. [PMID: 26673900 DOI: 10.1039/c5cc09728b] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Abstract
Here we systematically studied the incorporation of O(6)-methylguanine (6mG) into different positions of the human telomere G-quadruplex. In contrast to the natural G-quadruplex, the 6mG incorporated G-quadruplexes impeded the conformational conversion of the G-quadruplex from a hybrid to a parallel structure under molecular crowding conditions in a K(+) containing buffer.
Affiliation(s)
- Andong Zhao
- Laboratory of Chemical Biology and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Changchun, Jilin 130022, China.
- Chuanqi Zhao
- Laboratory of Chemical Biology and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Changchun, Jilin 130022, China.
- Hisae Tateishi-Karimata
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 7-1-20 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan
- Jinsong Ren
- Laboratory of Chemical Biology and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Changchun, Jilin 130022, China.
- Naoki Sugimoto
- Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 7-1-20 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan and Graduate School of Frontiers of Innovative Research in Science and Technology (FIRST), Konan University, 7-1-20 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan
- Xiaogang Qu
- Laboratory of Chemical Biology and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Changchun, Jilin 130022, China.
8
Abstract
Over 40% of mammalian genomes comprise the products of reverse transcription. Among such retrotransposed sequences are those characterized by the presence of long terminal repeats (LTRs), including the endogenous retroviruses (ERVs), which are inherited genetic elements closely resembling the proviruses formed following exogenous retrovirus infection. Sequences derived from ERVs make up at least 8 to 10% of the human and mouse genomes and range from ancient sequences that predate mammalian divergence to elements that are currently still active. In this chapter we describe the discovery, classification and origins of ERVs in mammals and consider cellular mechanisms that have evolved to control their expression. We also discuss the negative effects of ERVs as agents of genetic disease and cancer and review examples of ERV protein domestication to serve host functions, as in placental development. Finally, we address growing evidence that the gene regulatory potential of ERV LTRs has been exploited multiple times during evolution to regulate genes and gene networks. Thus, although recently endogenized retroviral elements are often pathogenic, those that survive the forces of negative selection become neutral components of the host genome or can be harnessed to serve beneficial roles.
9
Ci W, Liu J. Programming and inheritance of parental DNA methylomes in vertebrates. Physiology (Bethesda) 2015; 30:63-8. [PMID: 25559156 DOI: 10.1152/physiol.00037.2014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022]
Abstract
5-Methylcytosine (5mC) is a major epigenetic modification in animals. The programming and inheritance of parental DNA methylomes ensures the compatibility for totipotency and embryonic development. In vertebrates, the DNA methylomes of sperm and oocyte are significantly different. During early embryogenesis, the paternal and maternal methylomes will reset to the same state. Herein, we focus on recent advances in how offspring obtain the DNA methylation information from parents in vertebrates.
Affiliation(s)
- Weimin Ci
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, CAS, Beijing, China
- Jiang Liu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
10
Zeng J, Yi SV. Specific modifications of histone tails, but not DNA methylation, mirror the temporal variation of mammalian recombination hotspots. Genome Biol Evol 2014; 6:2918-29. [PMID: 25326136 PMCID: PMC4224356 DOI: 10.1093/gbe/evu230] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/09/2023]
Abstract
Recombination clusters nonuniformly across mammalian genomes at discrete genomic loci referred to as recombination hotspots. Despite their ubiquitous presence, individual hotspots rapidly lose their activities, and the molecular and evolutionary mechanisms underlying such frequent hotspot turnovers (the so-called “recombination hotspot paradox”) remain unresolved. Even though some sequence motifs are significantly associated with hotspots, multiple lines of evidence indicate that factors other than underlying sequences, such as epigenetic modifications, may affect the evolution of recombination hotspots. Thus, identifying epigenetic factors that covary with recombination at fine scale is a promising step for this important research area. It was previously reported that recombination rates correlate with indirect measures of DNA methylation in the human genome. Here, we analyze experimentally determined DNA methylation and histone modification of human sperm, and show that the correlation between DNA methylation and recombination in long-range windows does not hold with respect to the spatial and temporal variation of recombination at hotspots. On the other hand, two histone modifications (H3K4me3 and H3K27me3) overlap extensively with recombination hotspots. Similar trends were observed in mice. These results indicate that specific histone modifications rather than DNA methylation are associated with the rapid evolution of recombination hotspots. Furthermore, many human recombination hotspots occupy “bivalent” chromatin regions that harbor both active (H3K4me3) and repressive (H3K27me3) marks. This may explain why human recombination hotspots tend to occur in nongenic regions, in contrast to yeast and Arabidopsis hotspots that are characterized by generally active chromatin. Our results highlight the dynamic epigenetic underpinnings of recombination hotspot evolution.
Affiliation(s)
- Jia Zeng
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia Tech
- Soojin V Yi
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia Tech
11
Bigot T, Daubin V, Lassalle F, Perrière G. TPMS: a set of utilities for querying collections of gene trees. BMC Bioinformatics 2013; 14:109. [PMID: 23530580 PMCID: PMC3655882 DOI: 10.1186/1471-2105-14-109] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/05/2012] [Accepted: 03/12/2013] [Indexed: 01/02/2023]
Abstract
Background: The information in large collections of phylogenetic trees is useful for many comparative genomic studies. Therefore, there is a need for flexible tools that allow exploration of such collections in order to retrieve relevant data as quickly as possible. Results: In this paper, we present TPMS (Tree Pattern-Matching Suite), a set of programs for handling and retrieving gene trees according to different criteria. The programs from the suite include utilities for tree collection building, specific tree-pattern search strategies and tree rooting. Use of TPMS is illustrated through three examples: systematic search for incongruencies in a large tree collection, a short study on the Coelomata/Ecdysozoa controversy and an evaluation of the level of support for a recently published Mammal phylogeny. Conclusion: TPMS is a powerful suite that allows quick retrieval of sets of trees matching complex patterns in large collections, and the rooting of trees using more rigorous approaches than the classical midpoint method. As it is made of a set of command-line programs, it can be easily integrated in any sequence analysis pipeline for automated use.
Affiliation(s)
- Thomas Bigot
- Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Université Claude Bernard - Lyon 1, 43 bd, du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
12
Bérard J, Guéguen L. Accurate estimation of substitution rates with neighbor-dependent models in a phylogenetic context. Syst Biol 2012; 61:510-21. [PMID: 22331438 DOI: 10.1093/sysbio/sys024] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
Abstract
Most models and algorithms developed to perform statistical inference from DNA data make the assumption that substitution processes affecting distinct nucleotide sites are stochastically independent. This assumption ensures both mathematical and computational tractability but is in disagreement with observed data in many situations--one well-known example being CpG dinucleotide hypermutability in mammalian genomes. In this paper, we consider the class of RN95 + YpR substitution models, which allows neighbor-dependent effects--including CpG hypermutability--to be taken into account, through transitions between pyrimidine-purine dinucleotides. We show that it is possible to adapt inference methods originally developed under the assumption of independence between sites to RN95 + YpR models, using a mathematically rigorous framework provided by specific structural properties of this class of models. We assess how efficient this approach is at inferring the CpG hypermutability rate from aligned DNA sequences. The method is tested on simulated data and compared against several alternatives; the results suggest that it delivers a high degree of accuracy at a low computational cost. We then apply our method to an alignment of 10 DNA sequences from primate species. Model comparisons within the RN95 + YpR class show the importance of taking into account neighbor-dependent effects. An application of the method to the detection of hypomethylated islands is discussed.
Affiliation(s)
- Jean Bérard
- Institut Camille Jordan, UMR CNRS 5208, Université Lyon 1, Villeurbanne F-69622 Cedex, Université de Lyon, Lyon 69003, France
13
Abstract
There is increasing evidence that epigenetic marks such as DNA methylation contribute to phenotypic variation by regulating gene transcription, developmental plasticity, and interactions with the environment. However, relatively little is known about the relationship between the stability and distribution of DNA methylation within chromosomes and the ability to detect trait loci. Plant genomes have a distinct range of target sites and more extensive DNA methylation than animals. We analyzed the stability and distribution of epialleles within the complex genome of the oilseed crop plant Brassica napus. For methylation sensitive AFLP (MSAP) and retrotransposon (RT) epimarkers, we found a high degree of stability, with 90% of mapped markers retaining their allelic pattern in contrasting environments and developmental stages. Moreover, for two distinct parental lines 97% of epialleles were transmitted through five meioses and segregated in a mapping population. For the first time we have established the genetic position for 17 of the 19 centromeres within this amphidiploid species. Epiloci and genetic loci were distributed within distinct clusters, indicating differential detection of recombination events. This enabled us to identify additional significant QTL associated with seven important agronomic traits in the centromeric regions of five linkage groups.
|
14
|
Abstract
Mutation rates vary significantly within the genome and across species. Recent studies revealed a long suspected replication-timing effect on mutation rate, but the mechanisms that regulate the increase in mutation rate as the genome is replicated remain unclear. Evidence is emerging, however, that DNA repair systems, in general, are less efficient in late replicating heterochromatic regions compared to early replicating euchromatic regions of the genome. At the same time, mutation rates in both vertebrates and invertebrates have been shown to vary with generation time (GT). GT is correlated with genome size, which suggests a possible nucleotypic effect on species-specific mutation rates. These and other observations all converge on a role for DNA replication checkpoints in modulating generation times and mutation rates during the DNA synthetic phase (S phase) of the cell cycle. The following will examine the potential role of the intra-S checkpoint in regulating cell cycle times (GT) and mutation rates in eukaryotes. This article was published online on August 5, 2011. An error was subsequently identified. This notice is included in the online and print versions to indicate that both have been corrected October 4, 2011.
Affiliation(s)
- John Herrick
- Department of Physics, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada.
|
15
|
Mugal CF, Ellegren H. Substitution rate variation at human CpG sites correlates with non-CpG divergence, methylation level and GC content. Genome Biol 2011; 12:R58. [PMID: 21696599 PMCID: PMC3218846 DOI: 10.1186/gb-2011-12-6-r58] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Received: 02/06/2011] [Revised: 05/04/2011] [Accepted: 06/22/2011] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND A major goal in the study of molecular evolution is to unravel the mechanisms that induce variation in the germ line mutation rate and in the genome-wide mutation profile. The rate of germ line mutation is considerably higher for cytosines at CpG sites than for any other nucleotide in the human genome, an increase commonly attributed to cytosine methylation at CpG sites. The CpG mutation rate, however, is not uniform across the genome and, as methylation levels have recently been shown to vary throughout the genome, it has been hypothesized that methylation status may govern variation in the rate of CpG mutation. RESULTS Here, we use genome-wide methylation data from human sperm cells to investigate the impact of DNA methylation on the CpG substitution rate in introns of human genes. We find that there is a significant correlation between the extent of methylation and the substitution rate at CpG sites. Further, we show that the CpG substitution rate is positively correlated with non-CpG divergence, suggesting susceptibility to factors responsible for the general mutation rate in the genome, and negatively correlated with GC content. We only observe a minor contribution of gene expression level, while recombination rate appears to have no significant effect. CONCLUSIONS Our study provides the first direct empirical support for the hypothesis that variation in the level of germ line methylation contributes to substitution rate variation at CpG sites. Moreover, we show that other genomic features also impact on CpG substitution rate variation.
Affiliation(s)
- Carina F Mugal
- Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, Uppsala, Sweden
|
16
|
Nevarez PA, DeBoever CM, Freeland BJ, Quitt MA, Bush EC. Context dependent substitution biases vary within the human genome. BMC Bioinformatics 2010; 11:462. [PMID: 20843365 PMCID: PMC2945941 DOI: 10.1186/1471-2105-11-462] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 04/02/2010] [Accepted: 09/15/2010] [Indexed: 01/16/2023] Open
Abstract
Background Models of sequence evolution typically assume that different nucleotide positions evolve independently. This assumption is widely appreciated to be an over-simplification. The best known violations involve biases due to adjacent nucleotides. There have also been suggestions that biases exist at larger scales, however this possibility has not been systematically explored. Results To address this we have developed a method which identifies over- and under-represented substitution patterns and assesses their overall impact on the evolution of genome composition. Our method is designed to account for biases at smaller pattern sizes, removing their effects. We used this method to investigate context bias in the human lineage after the divergence from chimpanzee. We examined bias effects in substitution patterns between 2 and 5 bp long and found significant effects at all sizes. This included some individual three and four base pair patterns with relatively large biases. We also found that bias effects vary across the genome, differing between transposons and non-transposons, between different classes of transposons, and also near and far from genes. Conclusions We found that nucleotides beyond the immediately adjacent one are responsible for substantial context effects, and that these biases vary across the genome.
|
17
|
Cantu D, Vanzetti LS, Sumner A, Dubcovsky M, Matvienko M, Distelfeld A, Michelmore RW, Dubcovsky J. Small RNAs, DNA methylation and transposable elements in wheat. BMC Genomics 2010; 11:408. [PMID: 20584339 PMCID: PMC2996936 DOI: 10.1186/1471-2164-11-408] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 02/18/2010] [Accepted: 06/29/2010] [Indexed: 12/15/2022] Open
Abstract
Background More than 80% of the wheat genome is composed of transposable elements (TEs). Since active TEs can move to different locations and potentially impose a significant mutational load, their expression is suppressed in the genome via small non-coding RNAs (sRNAs). sRNAs guide silencing of TEs at the transcriptional (mainly 24-nt sRNAs) and post-transcriptional (mainly 21-nt sRNAs) levels. In this study, we report the distribution of these two types of sRNAs among the different classes of wheat TEs, the regions targeted within the TEs, and their impact on the methylation patterns of the targeted regions. Results We constructed an sRNA library from hexaploid wheat and developed a database that included our library and three other publicly available sRNA libraries from wheat. For five completely-sequenced wheat BAC contigs, most perfectly matching sRNAs represented TE sequences, suggesting that a large fraction of the wheat sRNAs originated from TEs. An analysis of all wheat TEs present in the Triticeae Repeat Sequence database showed that sRNA abundance was correlated with the estimated number of TEs within each class. Most of the sRNAs perfectly matching miniature inverted repeat transposable elements (MITEs) belonged to the 21-nt class and were mainly targeted to the terminal inverted repeats (TIRs). In contrast, most of the sRNAs matching class I and class II TEs belonged to the 24-nt class and were mainly targeted to the long terminal repeats (LTRs) in the class I TEs and to the terminal repeats in CACTA transposons. An analysis of the mutation frequency in potentially methylated sites revealed a three-fold increase in TE mutation frequency relative to intron and untranslated genic regions. This increase is consistent with wheat TEs being preferentially methylated, likely by sRNA targeting. Conclusions Our study examines the wheat epigenome in relation to known TEs. sRNA-directed transcriptional and post-transcriptional silencing plays important roles in the short-term suppression of TEs in the wheat genome, whereas DNA methylation and increased mutation rates may provide a long-term mechanism to inactivate TEs.
Affiliation(s)
- Dario Cantu
- Department of Plant Sciences, University of California Davis, One Shields Ave, Davis, CA, USA
|
18
|
Matsui T, Leung D, Miyashita H, Maksakova IA, Miyachi H, Kimura H, Tachibana M, Lorincz MC, Shinkai Y. Proviral silencing in embryonic stem cells requires the histone methyltransferase ESET. Nature 2010; 464:927-31. [PMID: 20164836 DOI: 10.1038/nature08858] [Citation(s) in RCA: 597] [Impact Index Per Article: 39.8] [Received: 08/10/2009] [Accepted: 01/16/2010] [Indexed: 12/13/2022]
Abstract
Endogenous retroviruses (ERVs), retrovirus-like elements with long terminal repeats, are widely dispersed in the euchromatic compartment in mammalian cells, comprising approximately 10% of the mouse genome. These parasitic elements are responsible for >10% of spontaneous mutations. Whereas DNA methylation has an important role in proviral silencing in somatic and germ-lineage cells, an additional DNA-methylation-independent pathway also functions in embryonal carcinoma and embryonic stem (ES) cells to inhibit transcription of the exogenous gammaretrovirus murine leukaemia virus (MLV). Notably, a recent genome-wide study revealed that ERVs are also marked by histone H3 lysine 9 trimethylation (H3K9me3) and H4K20me3 in ES cells but not in mouse embryonic fibroblasts. However, the role that these marks have in proviral silencing remains unexplored. Here we show that the H3K9 methyltransferase ESET (also called SETDB1 or KMT1E) and the Krüppel-associated box (KRAB)-associated protein 1 (KAP1, also called TRIM28) are required for H3K9me3 and silencing of endogenous and introduced retroviruses specifically in mouse ES cells. Furthermore, whereas ESET enzymatic activity is crucial for HP1 binding and efficient proviral silencing, the H4K20 methyltransferases Suv420h1 and Suv420h2 are dispensable for silencing. Notably, in DNA methyltransferase triple knockout (Dnmt1(-/-)Dnmt3a(-/-)Dnmt3b(-/-)) mouse ES cells, ESET and KAP1 binding and ESET-mediated H3K9me3 are maintained and ERVs are minimally derepressed. We propose that a DNA-methylation-independent pathway involving KAP1 and ESET/ESET-mediated H3K9me3 is required for proviral silencing during the period early in embryogenesis when DNA methylation is dynamically reprogrammed.
Affiliation(s)
- Toshiyuki Matsui
- Experimental Research Center for Infectious Diseases, Institute for Virus Research, Kyoto University, 53 Shogoin, Kawara-cho, Sakyo-ku, Kyoto 606-8507, Japan
|
19
|
Pink CJ, Hurst LD. Timing of replication is a determinant of neutral substitution rates but does not explain slow Y chromosome evolution in rodents. Mol Biol Evol 2009; 27:1077-86. [PMID: 20026481 DOI: 10.1093/molbev/msp314] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
Mutation rates, assayed as substitution rates of putatively neutral sites, are highly variable around mammalian genomes: There is heterogeneity between genes, between autosomes, and between X, Y, and autosomes. The differences between X, Y, and autosomes are typically assumed to reflect the greater number of cell divisions in the male germ-line. Such an effect can neither account for within-autosome differences nor does it predict the differences between X, Y, and autosome observed in rodents. It has recently been proposed that in primates, the time during S-phase when a gene is replicated is an important determinant of neutral rates of evolution. Here we ask 1) whether we can replicate this result in rodents, 2) whether different autosomes replicate on average at different times, and 3) whether this might explain differences in their substitution rates. Finally we ask 4) whether X, Y, and autosome replicate at different times and 5) whether any difference might explain why the number of replication events alone cannot explain their substitution rates. We find that, as in primates, autosomal intronic rates of evolution increase significantly during S-phase. Different autosomes do have different average replication times, and together with rearrangement, this is a significant predictor of between-autosome differences in substitution rate. Although we find that autosomal, X-, and Y-linked genes replicate at different times, it is paradoxical that the Y-linked genes replicate latest, and replicate more often, but are not especially fast evolving. These results support the hypothesis that replication timing is an important source of substitution rate heterogeneity.
Affiliation(s)
- Catherine J Pink
- Department of Biology and Biochemistry, University of Bath, Somerset, United Kingdom
|
20
|
Mazin AL. Suicidal function of DNA methylation in age-related genome disintegration. Ageing Res Rev 2009; 8:314-27. [PMID: 19464391 DOI: 10.1016/j.arr.2009.04.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 01/19/2009] [Revised: 04/17/2009] [Accepted: 04/20/2009] [Indexed: 10/20/2022]
Abstract
This article is dedicated to the 60th anniversary of 5-methylcytosine discovery in DNA. Cytosine methylation can affect genetic and epigenetic processes, works as a part of the genome-defense system and has mutagenic activity; however, the biological functions of this enzymatic modification are not well understood. This review will put forward the hypothesis that the host-defense role of DNA methylation in silencing and mutational destruction of retroviruses and other intragenomic parasites was extended during evolution to most host genes that have to be inactivated in differentiated somatic cells, where it acquired a new function in age-related self-destruction of the genome. The proposed model considers DNA methylation as the generator of 5mC>T transitions that induce 40-70% of all spontaneous somatic mutations of the multiple classes at CpG and CpNpG sites and flanking nucleotides in the p53, FIX, hprt, gpt human genes and some transgenes. The accumulation of 5mC-dependent mutations explains: global changes in the structure of the vertebrate genome throughout evolution; the loss of most 5mC from the DNA of various species over their lifespan and the Hayflick limit of normal cells; the polymorphism of methylation sites, including asymmetric mCpNpN sites; cyclical changes of methylation and demethylation in genes. The suicidal function of methylation may be a special genetic mechanism for increasing DNA damage and the programmed genome disintegration responsible for cell apoptosis and organism aging and death.
|
21
|
|
22
|
Jung YC, Hong SJ, Kim YH, Kim SJ, Kang SJ, Choi SW, Rhyu MG. Chromosomal losses are associated with hypomethylation of the gene-control regions in the stomach with a low number of active genes. J Korean Med Sci 2008; 23:1068-89. [PMID: 19119454 PMCID: PMC2612760 DOI: 10.3346/jkms.2008.23.6.1068] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 11/06/2007] [Accepted: 04/01/2008] [Indexed: 11/20/2022] Open
Abstract
Transitional-CpG methylation between unmethylated promoters and nearby methylated retroelements plays a role in the establishment of tissue-specific transcription. This study examined whether chromosomal losses reducing the active genes in cancers can change transitional-CpG methylation and the transcription activity in a cancer-type-dependent manner. The transitional-CpG sites at the CpG-island margins of nine genes and the non-island-CpG sites round the transcription start sites of six genes lacking CpG islands were examined by methylation-specific polymerase chain reaction (PCR) analysis. The number of active genes in normal and cancerous tissues of the stomach, colon, breast, and nasopharynx were analyzed using the public data in silico. The CpG-island margins and non-island CpG sites tended to be hypermethylated and hypomethylated in all cancer types, respectively. The CpG-island margins were hypermethylated and a low number of genes were active in the normal stomach compared with other normal tissues. In gastric cancers, the CpG-island margins and non-island-CpG sites were hypomethylated in association with high-level chromosomal losses, and the number of active genes increased. Colon, breast, and nasopharyngeal cancers showed no significant association between the chromosomal losses and methylation changes. These findings suggest that chromosomal losses in gastric cancers are associated with the hypomethylation of the gene-control regions and the increased number of active genes.
Affiliation(s)
- Yu-Chae Jung
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Seung-Jin Hong
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Young-Ho Kim
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Sung-Ja Kim
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Seok-Jin Kang
- Department of Clinical Pathology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Sang-Wook Choi
- Department of Internal Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Mun-Gan Rhyu
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
|
23
|
Does selection against transcriptional interference shape retroelement-free regions in mammalian genomes? PLoS One 2008; 3:e3760. [PMID: 19018283 PMCID: PMC2582637 DOI: 10.1371/journal.pone.0003760] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 08/04/2008] [Accepted: 10/31/2008] [Indexed: 11/29/2022] Open
Abstract
Background Eukaryotic genomes are scattered with retroelements that proliferate through retrotransposition. Although retroelements make up around 40 percent of the human genome, large regions are found to be completely devoid of retroelements. This has been hypothesised to be a result of genomic regions being intolerant to insertions of retroelements. The inadvertent transcriptional activity of retroelements may affect neighbouring genes, which in turn could be detrimental to an organism. We speculate that such retroelement transcription, or transcriptional interference, is a contributing factor in generating and maintaining retroelement-free regions in the human genome. Methodology/Principal Findings Based on the known transcriptional properties of retroelements, we expect long interspersed elements (LINEs) to be able to display a high degree of transcriptional interference. In contrast, we expect short interspersed elements (SINEs) to display very low levels of transcriptional interference. We find that genomic regions devoid of long interspersed elements (LINEs) are enriched for protein-coding genes, but that this is not the case for regions devoid of short interspersed elements (SINEs). This is expected if genes are subject to selection against transcriptional interference. We do not find microRNAs to be associated with genomic regions devoid of either SINEs or LINEs. We further observe an increased relative activity of genes overlapping LINE-free regions during early embryogenesis, where activity of LINEs has been identified previously. Conclusions/Significance Our observations are consistent with the notion that selection against transcriptional interference has contributed to the maintenance and/or generation of retroelement-free regions in the human genome.
|
24
|
Peifer M, Karro JE, von Grünberg HH. Is there an acceleration of the CpG transition rate during the mammalian radiation? Bioinformatics 2008; 24:2157-64. [PMID: 18662928 PMCID: PMC2553435 DOI: 10.1093/bioinformatics/btn391] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 04/11/2008] [Revised: 07/27/2008] [Accepted: 07/27/2008] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION In this article we build a model of the CpG dinucleotide substitution rate and use it to challenge the claim that this rate underwent a sudden mammalian-specific increase approximately 90 million years ago. The evidence supporting this hypothesis comes from the application of a model of neutral substitution rates able to account for elevated CpG dinucleotide substitution rates. With the initial goal of improving that model's accuracy, we introduced a modification enabling us to account for boundary effects arising from the truncation of the Markov field, as well as improving the optimization procedure required for estimating the substitution rates. RESULTS When using this modified method to reproduce the supporting analysis, the evidence of the rate shift vanished. Our analysis suggests that the CpG-specific rate has been constant over the relevant time period and that the asserted acceleration of the CpG rate is likely an artifact of the original model.
Affiliation(s)
- M Peifer
- Institute of Chemistry, Karl-Franzens University Graz, Graz, Austria.
|
25
|
Mekmaysy CS, Petraccone L, Garbett NC, Ragazzon PA, Gray R, Trent JO, Chaires JB. Effect of O6-methylguanine on the stability of G-quadruplex DNA. J Am Chem Soc 2008; 130:6710-1. [PMID: 18447358 DOI: 10.1021/ja801976h] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.2] [Indexed: 11/28/2022]
Abstract
The effects of substitution of O6-methylguanine on the structure and stability of a human telomere quadruplex were studied by circular dichroism, thermal denaturation, analytical ultracentrifugation, and molecular dynamics simulations. The results show that, while quadruplex structures can form containing the modified base, they are much less stable than the normal unmodified structure. The extent of destabilization is critically dependent on the exact position of the modified base within the quadruplex structure.
Affiliation(s)
- Chongkham S Mekmaysy
- James Graham Brown Cancer Center and University of Louisville, 529 South Jackson Street, Louisville, Kentucky 40202, USA
|
26
|
Kang MY, Lee BB, Ji YI, Jung EH, Chun HK, Song SY, Park SE, Park J, Kim DH. Association of interindividual differences in p14ARF promoter methylation with single nucleotide polymorphism in primary colorectal cancer. Cancer 2008; 112:1699-707. [PMID: 18327804 DOI: 10.1002/cncr.23335] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 01/25/2023]
Abstract
BACKGROUND CpG island hypermethylation has been reported at the promoter region of many tumor suppressor genes in colorectal cancers. However, there are significant interindividual differences in the degree of DNA methylation in colorectal cancers. The objective of the current study was to understand whether single nucleotide polymorphisms (SNPs) around the promoter of a gene are implicated in the interindividual differences of CpG island hypermethylation. METHODS Promoter methylation of the p14(ARF) gene and messenger RNA (mRNA) expression levels of p14(ARF), DNA methyltransferase 1 (DNMT1), and DNMT3b were investigated by using methylation-specific polymerase chain reaction (PCR) analysis (MSP) and quantitative real-time PCR analysis in fresh tissues from 188 patients with colorectal cancer. SNPs around the p14(ARF) promoter were genotyped in DNA from peripheral blood lymphocytes in 300 healthy individuals and in 188 patients with colorectal cancer by using matrix-assisted laser desorption/ionization mass spectrometry. RESULTS p14(ARF) methylation was present in 61 of 188 colorectal cancers (32%). Fourteen SNPs among the 20 candidate SNPs were identified as monomorphic in the Korean population studied. Two individual SNPs (-4256 thymine to cytosine [T→C] and -1477 guanine to adenine [G→A]), which were in strong linkage disequilibrium (|D'| = 0.99; correlation coefficient [r²] = 0.95), were associated significantly with p14(ARF) methylation. Patients who had the CC variant at the -4256 locus or the AA variant at the -1477 locus had 2.42 times (95% confidence interval [95% CI], 1.07-5.46; P = .03) and 2.47 times (95% CI, 1.09-5.56; P = .03) greater risk of p14(ARF) methylation than patients who had the TT or GG homozygote, respectively, after adjusting for mRNA levels of DNMTs. Four major haplotypes were identified within a block (-4256 T→C, -3631 T→C, -1477 G→A, and +20,188 T→C). p14(ARF) promoter methylation also was associated significantly with the CCAT haplotype (odds ratio [OR], 8.31; 95% CI, 2.43-28.41; P = .0007) and the CTAC haplotype (OR, 9.71; 95% CI, 1.09-86.24; P = .04). CONCLUSIONS The current results suggested that SNPs around the p14(ARF) promoter region may be responsible for the interindividual susceptibility to p14(ARF) promoter methylation among individuals with colorectal cancer.
Affiliation(s)
- Mi Yeon Kang
- Center for Genome Research, Samsung Biomedical Research Institute, Sungkyunkwan University School of Medicine, Kangnam-Ku, Seoul, South Korea
|
27
|
Elango N, Kim SH, NISC Comparative Sequencing Program, Vigoda E, Yi SV. Mutations of different molecular origins exhibit contrasting patterns of regional substitution rate variation. PLoS Comput Biol 2008; 4:e1000015. [PMID: 18463707 PMCID: PMC2265638 DOI: 10.1371/journal.pcbi.1000015] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Received: 09/12/2007] [Accepted: 01/30/2008] [Indexed: 11/19/2022] Open
Abstract
Transitions at CpG dinucleotides, referred to as “CpG substitutions”, are a major mutational input into vertebrate genomes and a leading cause of human genetic disease. The prevalence of CpG substitutions is due to their mutational origin, which is dependent on DNA methylation. In comparison, other single nucleotide substitutions (for example those occurring at GpC dinucleotides) mainly arise from errors during DNA replication. Here we analyzed high quality BAC-based data from human, chimpanzee, and baboon to investigate regional variation of CpG substitution rates. We show that CpG substitutions occur approximately 15 times more frequently than other single nucleotide substitutions in primate genomes, and that they exhibit substantial regional variation. Patterns of CpG rate variation are consistent with differences in methylation level and susceptibility to subsequent deamination. In particular, we propose a “distance-decaying” hypothesis, positing that due to the molecular mechanism of a CpG substitution, rates are correlated with the stability of double-stranded DNA surrounding each CpG dinucleotide, and the effect of local DNA stability may decrease with distance from the CpG dinucleotide. Consistent with our “distance-decaying” hypothesis, rates of CpG substitution are strongly (negatively) correlated with regional G+C content. The influence of G+C content decays as the distance from the target CpG site increases. We estimate that the influence of local G+C content extends up to 1,500∼2,000 bps centered on each CpG site. We also show that the distance-decaying relationship persisted when we controlled for the effect of long-range homogeneity of nucleotide composition. GpC sites, in contrast, do not exhibit such “distance-decaying” relationship. Our results highlight an example of the distinctive properties of methylation-dependent substitutions versus substitutions mostly arising from errors during DNA replication. Furthermore, the negative relationship between G+C content and CpG rates may provide an explanation for the observation that GC-rich SINEs show lower CpG rates than other repetitive elements.
Mutations are raw materials of evolution. Earlier studies have shown that mutations occur at different frequencies in different genomic regions. By investigating the patterns and causes of such “regional” variation of mutations, we can better understand the mechanisms of underlying mutagenesis. In the human and other mammalian genomes, the most common type of mutation is caused by DNA methylation, which targets cytosines followed by guanine (CpG dinucleotides). Methylated cytosines are then subject to spontaneous deamination, which will cause a C to T (or G to A) transition (CpG substitution). Because this mutational process is unique to CpG substitutions, we reasoned that they might show different patterns of variability from other substitutions. Using high quality genomic sequences from primates and by separately analyzing variability of CpG substitutions and other substitutions, we demonstrate that CpG substitutions occur approximately 15 times more frequently than other substitutions, and show a distinctive pattern of regional variability. Particularly, we propose and provide evidence that because the deamination step requires temporary strand separation, G+C composition near 1,500–2,000 bps each direction from a target CpG affects the probability of a CpG substitution. Incorporating the difference in CpG and other substitutions discovered in this study will help build more realistic evolutionary models.
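The abstract above separates CpG substitutions (a C to T transition at a C followed by G, or the mirrored G to A at a G preceded by C) from all other single nucleotide substitutions. A minimal Python sketch of that bookkeeping follows. It is an illustration only, not the authors' pipeline: the function name `classify_substitutions` is hypothetical, and it assumes two gap-free, equal-length aligned sequences in which the first can be treated as the ancestral state.

```python
from collections import Counter


def classify_substitutions(ancestral: str, derived: str) -> Counter:
    """Count CpG-context transitions versus all other substitutions.

    Hypothetical helper: both inputs are assumed to be gap-free, aligned
    sequences of equal length, with `ancestral` taken as the older state.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    anc, der = ancestral.upper(), derived.upper()
    counts = Counter()
    for i, (a, d) in enumerate(zip(anc, der)):
        if a == d:
            continue  # no substitution at this site
        # C->T where the ancestral C is followed by G (forward strand CpG).
        cpg_forward = a == "C" and d == "T" and anc[i + 1 : i + 2] == "G"
        # G->A where the ancestral G is preceded by C (C->T on the reverse strand).
        cpg_reverse = a == "G" and d == "A" and i > 0 and anc[i - 1] == "C"
        if cpg_forward or cpg_reverse:
            counts["cpg_transition"] += 1
        else:
            counts["other"] += 1
    return counts
```

Dividing each count by the number of available sites of that class (CpG sites versus all other sites) would give per-site rates that can be compared, which is the kind of ratio the abstract reports; the sketch stops at raw counts because the site-counting convention is not stated in the abstract.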
Affiliation(s)
- Navin Elango
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Seong-Ho Kim
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- NISC Comparative Sequencing Program
- Genome Technology Branch and NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Eric Vigoda
- College of Computing, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Soojin V. Yi
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
|
28
|
Life-history traits drive the evolutionary rates of mammalian coding and noncoding genomic elements. Proc Natl Acad Sci U S A 2007; 104:20443-8. [PMID: 18077382 DOI: 10.1073/pnas.0705658104] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.4] [Indexed: 01/19/2023] Open
Abstract
A comprehensive phylogenetic framework is indispensable for investigating the evolution of genomic features in mammals as a whole, and particularly in humans. Using the ENCODE sequence data, we estimated mammalian neutral evolutionary rates and selective pressures acting on conserved coding and noncoding elements. We show that neutral evolutionary rates can be explained by the generation time (GT) hypothesis. Accordingly, primates (especially humans), having longer GTs than other mammals, display slower rates of neutral evolution. The evolution of constrained elements, particularly of nonsynonymous sites, is in agreement with the expectations of the nearly neutral theory of molecular evolution. We show that rates of nonsynonymous substitutions (dN) depend on the population size of a species. The results are robust to the exclusion of hypermutable CpG prone sites. The average rate of evolution in conserved noncoding sequences (CNCs) is 1.7 times higher than in nonsynonymous sites. Despite this, CNCs evolve at similar or even lower rates than nonsynonymous sites in the majority of basal branches of the eutherian tree. This observation could be the result of an overall gradual or, alternatively, lineage-specific relaxation of CNCs. The latter hypothesis was supported by the finding that 3 of the 20 longest CNCs displayed significant relaxation of individual branches. This observation may explain why the evolution of CNCs fits the expectations of the nearly neutral theory less well than the evolution of nonsynonymous sites.
29
Kim TM, Chung YJ, Rhyu MG, Jung MH. Germline methylation patterns inferred from local nucleotide frequency of repetitive sequences in the human genome. Mamm Genome 2007; 18:277-85. [PMID: 17514347 DOI: 10.1007/s00335-007-9016-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 02/01/2007] [Accepted: 03/12/2007] [Indexed: 12/31/2022]
Abstract
Given the genomic abundance and susceptibility to DNA methylation, interspersed repetitive sequences in the human genome can be exploited as valuable resources in genome-wide methylation studies. To learn about the relationships between DNA methylation and repeat sequences, we performed a global measurement of CpG dinucleotide frequencies for interspersed repetitive sequences and inferred germline methylation patterns in the human genome. Although extensive CpG depletion was observed for most repeat sequences, those in the proximity to CpG islands have been relatively removed from germline methylation being the potential source of germline activation. We also investigated the CpG depletion patterns of Alu pairs to see whether they might play an active role in germline methylation. Two kinds of Alu pairs, direct or inverted pairs classified according to the orientation, showed contrast CpG depletion patterns with respect to separating distance of Alus, i.e., as two Alu elements are more closely spaced in a pair, a higher extent of CpG depletion was observed in inverted orientation and vice versa for directly repetitive Alu pairs. This suggests that specific organization of repetitive sequences, such as inverted Alu pairs, might play a role in triggering DNA methylation consistent with a homology-dependent methylation hypothesis.
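As a rough illustration of how CpG depletion in a repeat sequence can be quantified, the sketch below computes an observed/expected CpG ratio. The abstract does not state which statistic the authors used; the ratio here (CpG count times sequence length, divided by the product of the C and G counts) is a commonly used convention and is an assumption, as is the function name `cpg_obs_exp`.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one sequence.

    Assumed definition (not taken from the abstract):
        (number of CpG dinucleotides * length) / (number of C * number of G)
    Values well below 1 indicate CpG depletion.
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    n_cpg = s.count("CG")
    if n_c == 0 or n_g == 0:
        return 0.0  # ratio undefined without both bases; report 0 by convention
    return n_cpg * len(s) / (n_c * n_g)


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(cpg_obs_exp("CCGG"))
    print(cpg_obs_exp("CATGCATG"))
```

A ratio near 1 means CpG occurs about as often as base composition predicts, while heavily methylated germline sequence is expected to drift toward lower values as CpG sites mutate.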
Affiliation(s)
- Tae-Min Kim
- Division of Metabolic Disease, Center for Biomedical Science, National Institute of Health, Nokbun-dong 5, Eunpyung-gu, Seoul 122-701, Korea
30
Boulesteix M, Simard F, Antonio-Nkondjio C, Awono-Ambene HP, Fontenille D, Biémont C. Insertion polymorphism of transposable elements and population structure of Anopheles gambiae M and S molecular forms in Cameroon. Mol Ecol 2007; 16:441-52. [PMID: 17217356 DOI: 10.1111/j.1365-294x.2006.03150.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 01/23/2023]
Abstract
The insertion polymorphism of five transposable element (TE) families was studied by Southern blots in several populations of the M and S molecular forms of the mosquito Anopheles gambiae sensu stricto from southern Cameroon. We showed that the mean TE insertion site number and the within-population insertion site polymorphism globally differed between the M and S molecular forms. The comparison of the TE insertion profiles of the populations revealed a significant differentiation between these two molecular forms (0.163 < Phi(ST) < 0.371). We cloned several insertions of a non-LTR retrotransposon (Aara8) that were fixed in one form and absent in the other one. The only insertion that could be clearly located on a chromosome arm mapped to cytological division 6 of chromosome X, confirming the importance of this region in the ongoing speciation between the M and S molecular forms.
Affiliation(s)
- M Boulesteix
- Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, CNRS, Université Claude Bernard Lyon1, 69622 Villeurbanne Cedex, France
31
Wood AJ, Roberts RG, Monk D, Moore GE, Schulz R, Oakey RJ. A screen for retrotransposed imprinted genes reveals an association between X chromosome homology and maternal germ-line methylation. PLoS Genet 2007; 3:e20. [PMID: 17291163 PMCID: PMC1796624 DOI: 10.1371/journal.pgen.0030020] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.3] [Received: 10/20/2006] [Accepted: 12/18/2006] [Indexed: 11/24/2022] Open
Abstract
Imprinted genes undergo epigenetic modifications during gametogenesis, which lead to transcriptional silencing of either the maternally or the paternally derived allele in the subsequent generation. Previous work has suggested an association between imprinting and the products of retrotransposition, but the nature of this link is not well defined. In the mouse, three imprinted genes have been described that originated by retrotransposition and overlap CpG islands which undergo methylation during oogenesis. Nap1l5, U2af1-rs1, and Inpp5f_v2 are likely to encode proteins and share two additional genetic properties: they are located within introns of host transcripts and are derived from parental genes on the X chromosome. Using these sequence features alone, we identified Mcts2, a novel candidate imprinted retrogene on mouse Chromosome 2. Mcts2 has been validated as imprinted by demonstrating that it is paternally expressed and undergoes promoter methylation during oogenesis. The orthologous human retrogenes NAP1L5, INPP5F_V2, and MCTS2 are also shown to be paternally expressed, thus delineating novel imprinted loci on human Chromosomes 4, 10, and 20. The striking correlation between imprinting and X chromosome provenance suggests that retrotransposed elements with homology to the X chromosome can be selectively targeted for methylation during mammalian oogenesis.
Affiliation(s)
- Andrew J Wood
- Department of Medical and Molecular Genetics, King's College London, London, United Kingdom
- Roland G Roberts
- Department of Medical and Molecular Genetics, King's College London, London, United Kingdom
- David Monk
- Unit of Clinical and Molecular Genetics, Institute of Child Health, London, United Kingdom
- Gudrun E Moore
- Unit of Clinical and Molecular Genetics, Institute of Child Health, London, United Kingdom
- Reiner Schulz
- Department of Medical and Molecular Genetics, King's College London, London, United Kingdom
- Rebecca J Oakey
- Department of Medical and Molecular Genetics, King's College London, London, United Kingdom
32
Gaffney DJ, Keightley PD. Genomic selective constraints in murid noncoding DNA. PLoS Genet 2006; 2:e204. [PMID: 17166057 PMCID: PMC1657059 DOI: 10.1371/journal.pgen.0020204] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Received: 06/28/2006] [Accepted: 10/18/2006] [Indexed: 02/04/2023] Open
Abstract
Recent work has suggested that there are many more selectively constrained, functional noncoding than coding sites in mammalian genomes. However, little is known about how selective constraint varies amongst different classes of noncoding DNA. We estimated the magnitude of selective constraint on a large dataset of mouse-rat gene orthologs and their surrounding noncoding DNA. Our analysis indicates that there are more than three times as many selectively constrained, nonrepetitive sites within noncoding DNA as in coding DNA in murids. The majority of these constrained noncoding sites appear to be located within intergenic regions, at distances greater than 5 kilobases from known genes. Our study also shows that in murids, intron length and mean intronic selective constraint are negatively correlated with intron ordinal number. Our results therefore suggest that functional intronic sites tend to accumulate toward the 5′ end of murid genes. Our analysis also reveals that mean number of selectively constrained noncoding sites varies substantially with the function of the adjacent gene. We find that, among others, developmental and neuronal genes are associated with the greatest numbers of putatively functional noncoding sites compared with genes involved in electron transport and a variety of metabolic processes. Combining our estimates of the total number of constrained coding and noncoding bases we calculate that over twice as many deleterious mutations have occurred in intergenic regions as in known genic sequence and that the total genomic deleterious point mutation rate is 0.91 per diploid genome, per generation. This estimated rate is over twice as large as a previous estimate in murids.
Most DNA can typically be divided into two categories: regions that encode the instructions for the assembly of a protein molecule (protein-coding genes) and those that do not (noncoding). Although mammalian genomes are primarily noncoding, relatively little is known about how much of this is functional, where such regions are found in the genome, and what functions they are likely to perform. In this study, the authors investigated the quantity and location of functional noncoding DNA in mice and rats. They estimate that functional noncoding DNA is at least three times as common as coding DNA in rodents, and the majority is located large distances from known protein-coding genes. Putatively functional intronic DNA tends to be clustered towards the gene 5′ end, suggesting that much intronic sequence is instrumental in regulating gene expression. This study also finds that genes involved in development and the nervous system are typically associated with much higher quantities of functional noncoding DNA, suggesting that these genes require more finely tuned control of their expression. One implication of this study is the finding that disease-causing mutations have occurred more frequently in noncoding regions and may have affected gene expression, rather than protein structure.
Affiliation(s)
- Daniel J Gaffney
- Institute of Evolutionary Biology, Ashworth Laboratories, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
33
Kim SH, Elango N, Warden C, Vigoda E, Yi SV. Heterogeneous genomic molecular clocks in primates. PLoS Genet 2006; 2:e163. [PMID: 17029560 PMCID: PMC1592237 DOI: 10.1371/journal.pgen.0020163] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.7] [Received: 06/01/2006] [Accepted: 08/10/2006] [Indexed: 12/22/2022] Open
Abstract
Using data from primates, we show that molecular clocks in sites that have been part of a CpG dinucleotide in recent past (CpG sites) and non-CpG sites are of markedly different nature, reflecting differences in their molecular origins. Notably, single nucleotide substitutions at non-CpG sites show clear generation-time dependency, indicating that most of these substitutions occur by errors during DNA replication. On the other hand, substitutions at CpG sites occur relatively constantly over time, as expected from their primary origin due to methylation. Therefore, molecular clocks are heterogeneous even within a genome. Furthermore, we propose that varying frequencies of CpG dinucleotides in different genomic regions may have contributed significantly to conflicting earlier results on rate constancy of mammalian molecular clock. Our conclusion that different regions of genomes follow different molecular clocks should be considered when inferring divergence times using molecular data and in phylogenetic analysis.
Affiliation(s)
- Seong-Ho Kim
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Navin Elango
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Charles Warden
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Eric Vigoda
- College of Computing, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Soojin V Yi
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
34
Kim YH, Hong SJ, Jung YC, Kim SJ, Seo EJ, Choi SW, Rhyu MG. The 5'-end transitional CpGs between the CpG islands and retroelements are hypomethylated in association with loss of heterozygosity in gastric cancers. BMC Cancer 2006; 6:180. [PMID: 16827945 PMCID: PMC1552088 DOI: 10.1186/1471-2407-6-180] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 02/27/2006] [Accepted: 07/10/2006] [Indexed: 01/28/2023] Open
Abstract
Background: A loss of heterozygosity (LOH) represents a unilateral chromosomal loss that reduces the dose of highly repetitive Alu, L1, and LTR retroelements. The aim of this study was to determine if the LOH events can affect the spread of retroelement methylation in the 5'-end transitional area between the CpG islands and their nearest retroelements.
Methods: The 5'-transitional area of all human genes (22,297) was measured according to the nearest retroelements to the transcription start sites. For 50 gastric cancer specimens, the level of LOH events on eight cancer-associated chromosomes was estimated using the microsatellite markers, and the 5'-transitional CpGs of 20 selected genes were examined by methylation analysis using the bisulfite-modified DNA.
Results: The extent of the transitional area was significantly shorter with the nearest Alu elements than with the nearest L1 and LTR elements, as well as in the extragenic regions containing a higher density of retroelements than in the intragenic regions. The CpG islands neighbouring a high density of Alu elements were consistently hypomethylated in both normal and tumor tissues. The 5'-transitional methylated CpG sites bordered by a low density of Alu elements or the L1 and LTR elements were hypomethylated more frequently in the high-level LOH cases than in the low-level LOH cases.
Conclusion: The 5'-transitional methylated CpG sites not completely protected by the Alu elements were hypomethylated in association with LOH events in gastric cancers. This suggests that an irreversible unbalanced decrease in the genomic dose reduces the spread of L1 methylation in the 5'-end regions of genes.
Affiliation(s)
- Young-Ho Kim
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Seung-Jin Hong
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Yu-Chae Jung
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Sung-Ja Kim
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Eun-Joo Seo
- Department of Clinical Pathology, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Sang-Wook Choi
- Department of Internal Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Mun-Gan Rhyu
- Department of Microbiology, College of Medicine, The Catholic University of Korea, Seoul, Korea
35
Duret L. The GC content of primates and rodents genomes is not at equilibrium: a reply to Antezana. J Mol Evol 2006; 62:803-6. [PMID: 16752218 DOI: 10.1007/s00239-005-0228-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 09/29/2005] [Accepted: 01/04/2006] [Indexed: 10/24/2022]
36
Kouidou S, Malousi A, Maglaveras N. Methylation and repeats in silent and nonsense mutations of p53. Mutat Res 2006; 599:167-77. [PMID: 16620878 DOI: 10.1016/j.mrfmmm.2006.03.002] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 11/29/2005] [Revised: 02/22/2006] [Accepted: 03/01/2006] [Indexed: 12/16/2022]
Abstract
All exonic CG sequences in p53 are methylated; this epigenetic modification is correlated with frequent G:C-->A:T transitions in p53. Recent reports reveal the presence in p53 of non-CG methylation in CC and CCC sequences, complementary to sites of selective guanosine adduct formation (GG and GGG), and the association of genetic instability with methylation at repetitive sequences. We presently investigated the distribution of methylation sites and repetitive elements in silent and nonsense p53 mutations (2051) among the IARC's TP53 somatic mutation database for exons 5-8. Silent mutations are nonrandom, but mostly involve G:C-->A:T transitions (62%); in particular C-->T mutations (39% of all silent mutations) are mostly correlated with CC and CCC sequences, while G-->A mutations with GG sequences. Sequence analysis of all non-G:C-->A:T silent mutations reveals the frequent formation of new methylation sites (CG), new CCC and GGG sequences in the resulting sequence, refinement of symmetry elements at interrupted microsatellite-like sequences and formation of small repeats (55.3%). The G:C-->A:T silent mutations characterize cancers associated with cigarette smoking (e.g. bladder or lung and bronchus cancer versus colorectal cancer); on the contrary, non-G:C-->A:T silent mutations have similar frequencies in most cancers. Nonsense mutations in exons 5-8, all resulting in mutants lacking amino acids 307-393, which are crucial for p53 activity, were also analyzed. The frequency of nonsense mutations is higher at methylated sites or repeats 1-2 nucleotides removed from methylation sites. Frameshift mutations are also more frequent at repeated sequences. The frequent G:C-->A:T silent mutations could indicate that CC and CCC sequences of exons 5-8 are occasionally targets of non-CpG methylation of cytosine. This process of de novo methylation in the presence of microsatellite-like sequences and small repeats might influence the genetic stability of a variety of genes.
Affiliation(s)
- Sofia Kouidou
- Laboratory of Biological Chemistry, School of Medicine, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece.
37
Webster MT, Axelsson E, Ellegren H. Strong Regional Biases in Nucleotide Substitution in the Chicken Genome. Mol Biol Evol 2006; 23:1203-16. [PMID: 16551647 DOI: 10.1093/molbev/msk008] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.2] [Indexed: 11/13/2022] Open
Abstract
Interspersed repeats have emerged as a valuable tool for studying neutral patterns of molecular evolution. Here we analyze variation in the rate and pattern of nucleotide substitution across all autosomes in the chicken genome by comparing the present-day CR1 repeat sequences with their ancestral copies and reconstructing nucleotide substitutions with a maximum likelihood model. The results shed light on the origin and evolution of large-scale heterogeneity in GC content found in the genomes of birds and mammals--the isochore structure. In contrast to mammals, where GC content is becoming homogenized, heterogeneity in GC content is being reinforced in the chicken genome. This is also supported by patterns of substitution inferred from alignments of introns in chicken, turkey, and quail. Analysis of individual substitution frequencies is consistent with the biased gene conversion (BGC) model of isochore evolution, and it is likely that patterns of evolution in the chicken genome closely resemble those in the ancestral amniote genome, when it is inferred that isochores originated. Microchromosomes and distal regions of macrochromosomes are found to have elevated substitution rates and a more GC-biased pattern of nucleotide substitution. This can largely be accounted for by a strong correlation between GC content and the rate and pattern of substitution. The results suggest that an interaction between increased mutability at CpG motifs and fixation biases due to BGC could explain increased levels of divergence in GC-rich regions.
Affiliation(s)
- Matthew T Webster
- Department of Evolution, Genomics and Systematics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden.
38
Kang MI, Rhyu MG, Kim YH, Jung YC, Hong SJ, Cho CS, Kim HS. The length of CpG islands is associated with the distribution of Alu and L1 retroelements. Genomics 2006; 87:580-90. [PMID: 16488573 DOI: 10.1016/j.ygeno.2006.01.002] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Received: 09/12/2005] [Revised: 01/10/2006] [Accepted: 01/10/2006] [Indexed: 11/20/2022]
Abstract
Alu and L1 retroelements have been suggested to initiate the spread of CpG methylation. In this study, the spread of CpG methylation was estimated based on the distance between the CpG islands and the nearest retroelements. All human genes (23,116) were examined and the correlations between the length of the CpG islands and the distance and density of the confronting retroelements were examined using nonoverlapping 5-kb windows. There was a linear relationship between the length of the CpG islands and the density of the Alu elements and an inverse relationship between the CpG islands and the L1 elements located more distantly, suggesting a suppressive effect of the Alu's on the spread of L1 methylation. Methylation analysis of the transitional CpG sites between the CpG islands and the nearest retroelements upstream of 16 genes was then carried out using DNA preparations from 11 different human tissues. Methylation-variable transitional CpGs were observed for the selected genes and the different tissues.
Affiliation(s)
- Moo-Il Kang
- Department of Internal Medicine, The Catholic University of Korea, 505 Banpo-dong, Socho-gu, Seoul 137-701, Korea
39
Sironi M, Menozzi G, Comi GP, Cereda M, Cagliani R, Bresolin N, Pozzoli U. Gene function and expression level influence the insertion/fixation dynamics of distinct transposon families in mammalian introns. Genome Biol 2006; 7:R120. [PMID: 17181857 PMCID: PMC1794433 DOI: 10.1186/gb-2006-7-12-r120] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 07/31/2006] [Revised: 10/25/2006] [Accepted: 12/20/2006] [Indexed: 02/06/2023] Open
Abstract
Background: Transposable elements (TEs) represent more than 45% of the human and mouse genomes. Both parasitic and mutualistic features have been shown to apply to the host-TE relationship but a comprehensive scenario of the forces driving TE fixation within mammalian genes is still missing.
Results: We show that intronic multispecies conserved sequences (MCSs) have been affecting TE integration frequency over time. We verify that a selective economizing pressure has been acting on TEs to decrease their frequency in highly expressed genes. After correcting for GC content, MCS density and intron size, we identified TE-enriched and TE-depleted gene categories. In addition to developmental regulators and transcription factors, TE-depleted regions encompass loci that might require subtle regulation of transcript levels or precise activation timing, such as growth factors, cytokines, hormones, and genes involved in the immune response. The latter, despite having reduced frequencies of most TE types, are significantly enriched in mammalian-wide interspersed repeats (MIRs). Analysis of orthologous genes indicated that MIR over-representation also occurs in dog and opossum immune response genes, suggesting, given the partially independent origin of MIR sequences in eutheria and metatheria, the evolutionary conservation of a specific function for MIRs located in these loci. Consistently, the core MIR sequence is over-represented in defense response genes compared to the background intronic frequency.
Conclusion: Our data indicate that gene function, expression level, and sequence conservation influence TE insertion/fixation in mammalian introns. Moreover, we provide the first report showing that a specific TE family is evolutionarily associated with a gene function category.
Affiliation(s)
- Manuela Sironi
- Scientific Institute IRCCS E Medea, Bioinformatic Lab, Via don L Monza, 23842 Bosisio Parini (LC), Italy
- Giorgia Menozzi
- Scientific Institute IRCCS E Medea, Bioinformatic Lab, Via don L Monza, 23842 Bosisio Parini (LC), Italy
- Giacomo P Comi
- Dino Ferrari Centre, Department of Neurological Sciences, University of Milan, IRCCS Ospedale Maggiore Policlinico, Mangiagalli and Regina Elena Foundation, 20100 Milan, Italy
- Matteo Cereda
- Scientific Institute IRCCS E Medea, Bioinformatic Lab, Via don L Monza, 23842 Bosisio Parini (LC), Italy
- Rachele Cagliani
- Scientific Institute IRCCS E Medea, Bioinformatic Lab, Via don L Monza, 23842 Bosisio Parini (LC), Italy
- Nereo Bresolin
- Scientific Institute IRCCS E Medea, Bioinformatic Lab, Via don L Monza, 23842 Bosisio Parini (LC), Italy
- Dino Ferrari Centre, Department of Neurological Sciences, University of Milan, IRCCS Ospedale Maggiore Policlinico, Mangiagalli and Regina Elena Foundation, 20100 Milan, Italy
- Uberto Pozzoli
- Scientific Institute IRCCS E Medea, Bioinformatic Lab, Via don L Monza, 23842 Bosisio Parini (LC), Italy
40
Taylor J, Tyekucheva S, Zody M, Chiaromonte F, Makova KD. Strong and weak male mutation bias at different sites in the primate genomes: insights from the human-chimpanzee comparison. Mol Biol Evol 2005; 23:565-73. [PMID: 16280537 DOI: 10.1093/molbev/msj060] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Indexed: 11/14/2022] Open
Abstract
Male mutation bias is a higher mutation rate in males than in females thought to result from the greater number of germ line cell divisions in males. If errors in DNA replication cause most mutations, then the magnitude of male mutation bias, measured as the male-to-female mutation rate ratio (alpha), should reflect the relative excess of male versus female germ line cell divisions. Evolutionary rates averaged among all sites in a sequence and compared between mammalian sex chromosomes were shown to be indeed higher in males than in females. However, it is presently unknown whether individual classes of substitutions exhibit such bias. To address this issue, we investigated male mutation bias separately at non-CpG and CpG sites using human-chimpanzee whole-genome alignments. We observed strong male mutation bias at non-CpG sites: alpha in the X-autosome comparison was approximately 6-7, which was similar to the male-to-female ratio in the number of germ line cell divisions. In contrast, mutations at CpG sites exhibited weak male mutation bias: alpha in the X-autosome comparison was only approximately 2-3. This is consistent with the methylation-induced and replication-independent mechanism of CpG transitions, which constitute the majority of mutations at CpG sites. Interestingly, our study also indicated weak male mutation bias for transversions at CpG sites, implying a spontaneous mechanism largely not associated with replication. Male mutation bias was equally strong at CpG and non-CpG sites located within unmethylated "CpG islands," suggesting the replication-dependent origin of these mutations. Thus, we found that the strength of male mutation bias is nonuniform in the primate genomes. Importantly, we discovered that male mutation bias depends on the proportion of CpG sites in the loci compared. This might explain the differences in the magnitude of primate male mutation bias observed among studies.
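To make the X-autosome comparison concrete, the sketch below converts between alpha and the X-to-autosome substitution-rate ratio. The abstract does not give the formula. The relation used here is the standard one that follows from the X chromosome spending one third of its time in males and autosomes spending one half, and applying it to this study is an assumption. Function names are hypothetical, and effects such as ancestral polymorphism or selection are ignored.

```python
def x_autosome_ratio(alpha: float) -> float:
    """Expected X/autosome rate ratio for a given male-to-female ratio alpha.

    X rate        ~ (1/3) * male + (2/3) * female
    autosome rate ~ (1/2) * male + (1/2) * female
    With male = alpha * female this gives 2 * (2 + alpha) / (3 * (1 + alpha)).
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return 2.0 * (2.0 + alpha) / (3.0 * (1.0 + alpha))


def alpha_from_x_autosome(ratio: float) -> float:
    """Invert the relation above: alpha = (4 - 3R) / (3R - 2).

    Only ratios in (2/3, 4/3] correspond to a non-negative, finite alpha.
    """
    if not (2.0 / 3.0 < ratio <= 4.0 / 3.0):
        raise ValueError("ratio must lie in (2/3, 4/3]")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


if __name__ == "__main__":
    # Illustrative values only: alpha of 6 and 2 bracket the ranges
    # quoted in the abstract for non-CpG and CpG sites.
    for a in (6.0, 2.0):
        r = x_autosome_ratio(a)
        print(a, r, alpha_from_x_autosome(r))
```

Because the ratio approaches 2/3 as alpha grows, large values of alpha are poorly resolved: small errors in the measured X/autosome ratio translate into large changes in the inferred alpha.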
Affiliation(s)
- James Taylor
- Department of Computer Science and Engineering, Penn State University, USA