1
Lei H, Tao K. Somatic mutations in colorectal cancer are associated with the epigenetic modifications. J Cell Mol Med 2020; 24:11828-11836. [PMID: 32865336 PMCID: PMC7579689 DOI: 10.1111/jcmm.15799] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/15/2020] [Revised: 07/22/2020] [Accepted: 08/09/2020] [Indexed: 01/13/2023] Open
Abstract
Colorectal cancer (CRC) mostly arises from progressive accumulation of somatic mutations within cells. The most commonly mutated genes, such as TP53, APC and KRAS, can promote survival and proliferation of cancer cells. Although the molecular alterations and landscape of some specific mutations in CRC are well known, the presence of a somatic mutation signature related to genomic regions and epigenetic markers remains unclear. To find the signatures from a random distribution of somatic mutations in CRCs, we carried out enrichment analysis in different genomic regions and identified peaks of epigenetic markers. We validated that the mutation frequency in miRNA is dramatically higher than in flanking genomic regions. Moreover, we observed that somatic mutations in CRC and colon cancer cell lines are significantly enriched in CTCF binding sites. We also found these mutations are enriched for H3K27me3 in both normal sigmoid colon and colon cancer cell lines. Taken together, our findings suggest that there are some common somatic mutation signatures that provide new directions to study CRC.
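The enrichment analysis mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which statistical test the authors used, so the one-sided binomial test below, the uniform-background null model, and all function names are assumptions made for illustration only.

```python
from math import comb

def count_in_regions(positions, regions):
    """Count mutation positions that fall inside any half-open [start, end) region."""
    return sum(any(start <= p < end for start, end in regions) for p in positions)

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def enrichment(positions, regions, genome_length):
    """Compare observed mutations in regions with a uniform-background expectation.

    Assumes the regions do not overlap (hypothetical simplification).
    Returns (observed, expected, fold_enrichment, one_sided_p_value).
    """
    covered = sum(end - start for start, end in regions)
    p_null = covered / genome_length          # chance a random mutation lands in a region
    observed = count_in_regions(positions, regions)
    expected = len(positions) * p_null
    fold = observed / expected if expected else float("inf")
    p_value = binomial_upper_tail(observed, len(positions), p_null)
    return observed, expected, fold, p_value
```

With hypothetical binding-site intervals covering 10% of a toy genome, six of ten mutations landing inside them would give a six-fold enrichment and a small one-sided p-value. A real analysis would also need to account for regional mutation-rate variation and multiple testing.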
Affiliation(s)
- Hongwei Lei
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Kaixiong Tao
- Department of Gastrointestinal Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
2
Samuel B, Dinka H. In silico analysis of the promoter region of olfactory receptors in cattle (Bos indicus) to understand its gene regulation. Nucleosides Nucleotides Nucleic Acids 2020; 39:853-865. [PMID: 32028828 DOI: 10.1080/15257770.2020.1711524] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]
Abstract
Identifying transcription start sites (TSSs) and promoter regions is the first step toward understanding the mechanisms that regulate gene expression and their association with genetic variation in these regions. This analysis aimed to identify TSSs, determine the promoter regions, identify common candidate motifs and transcription factors (TFs), and search for CpG islands (CGIs) in the promoter regions of cattle olfactory receptor (OR) genes. In the analysis, TSSs of cattle olfactory genes were first identified. The locations for 60% of the TSSs were below -500 bp relative to the start codon, and five candidate motifs (MOR1, MOR2, MOR3, MOR4, and MOR5) were identified that are shared by at least 50% of the cattle OR promoter input sequences from both strands. Among the five candidate motifs, MOR4 was revealed as the common promoter motif for 85.71% of cattle OR genes; it serves as a binding site for TFs involved in regulating the expression of these genes. MOR4 was also compared with registered motifs in publicly available databases, using the TOMTOM web application, to see whether it is similar to known regulatory motifs for TFs. This comparison revealed that MOR4 may serve as the binding site mainly for the zinc finger (ZNF) TF gene family to regulate expression of cattle OR genes. Further gene ontology analysis for MOR4 demonstrated that ORs belong to the G-protein-coupled receptor superfamily and that MOR4 tends to be located near genes involved in the detection of chemical stimulus involved in sensory perception and in innate immune responses such as cytokine-mediated signaling. In silico digestion of cattle OR sequences was performed using the restriction enzyme MspI. CGIs were found in the OR10K1 and OR2L13 genes. In the present analysis, the scarcity of CGIs observed might suggest that expression of these genes is regulated in a tissue-specific manner.
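The in silico MspI digestion step can be sketched as follows. MspI recognizes CCGG and cuts after the first C (C^CGG). The function name, the treatment of the input as a linear top-strand sequence, and the example sequence are illustrative assumptions, not details taken from the paper.

```python
def digest(sequence, site="CCGG", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the number of bases of the site kept on the left fragment
    (1 for MspI, which cuts C^CGG). Returns the list of fragments in order.
    """
    seq = sequence.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments
```

For a toy sequence such as `AACCGGTTCCGGA`, this yields the fragments `AAC`, `CGGTTC` and `CGGA`. Because every MspI site contains a CpG, closely spaced cut sites (short fragments) point to CpG-dense stretches, which is one way such a digest can hint at candidate CGIs.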
Affiliation(s)
- Behailu Samuel
- Department of Applied Biology, School of Applied Natural Sciences, Adama Science and Technology University, Adama, Ethiopia; Department of Animal Science, Faculty of Agriculture, Salale University, Fitche, Ethiopia
- Hunduma Dinka
- Department of Applied Biology, School of Applied Natural Sciences, Adama Science and Technology University, Adama, Ethiopia
3
Aliaga B, Bulla I, Mouahid G, Duval D, Grunau C. Universality of the DNA methylation codes in Eucaryotes. Sci Rep 2019; 9:173. [PMID: 30655579 PMCID: PMC6336885 DOI: 10.1038/s41598-018-37407-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 01/09/2018] [Accepted: 10/24/2018] [Indexed: 12/26/2022] Open
Abstract
Genetics and epigenetics are tightly linked heritable information classes. The question arises whether epigenetics provides just a set of environment-dependent instructions, or whether it is an integral part of an inheritance system. We argued that in the latter case the epigenetic code should share the universality of the genetic code. We focused on DNA methylation. Since availability of DNA methylation data is biased towards model organisms, we developed a method that uses kernel density estimations of CpG observed/expected ratios to infer DNA methylation types in any genome. We show here that our method allows for robust prediction of mosaic and full gene body methylation with a PPV of 1 and 0.87, respectively. We used this prediction to complement experimental data, and applied hierarchical clustering to identify methylation types in ~150 eucaryotic species covering different body plans, reproduction types and living conditions. Our analysis indicates that there are only four gene body methylation types. These types do not follow phylogeny (i.e. phylogenetically distant clades can have identical methylation types) but they are consistent within clades. We conclude that the gene body DNA methylation codes have universality similar to the universality of the genetic code and should consequently be considered as part of the inheritance system.
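The CpG observed/expected ratio that this method builds on can be sketched in a few lines. The normalization used here (CpG count times sequence length, divided by the product of the C and G counts) is one widely used definition and is an assumption; the paper may use a slightly different length correction.

```python
def cpg_oe(sequence):
    """CpG observed/expected ratio for one sequence (e.g. one gene body).

    Uses the common normalization  (#CpG * N) / (#C * #G),  where N is the
    sequence length. Returns None when the ratio is undefined.
    """
    seq = sequence.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return None
    return cpg_count * n / (c_count * g_count)
```

Computing this ratio for every gene in a genome gives the empirical distribution whose shape (one mode or several, and where they sit) is then used to infer the methylation type. Ratios well below 1 are usually read as a sign of historical CpG depletion in methylated sequence.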
Affiliation(s)
- Benoît Aliaga
- University Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, University Montpellier, F-66860, Perpignan, France
- Ingo Bulla
- University Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, University Montpellier, F-66860, Perpignan, France
- Institute for Mathematics and Informatics, University of Greifswald, Greifswald, Germany
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Gabriel Mouahid
- University Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, University Montpellier, F-66860, Perpignan, France
- David Duval
- University Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, University Montpellier, F-66860, Perpignan, France
- Christoph Grunau
- University Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, University Montpellier, F-66860, Perpignan, France.
4
van de Lagemaat LN, Flenley M, Lynch MD, Garrick D, Tomlinson SR, Kranc KR, Vernimmen D. CpG binding protein (CFP1) occupies open chromatin regions of active genes, including enhancers and non-CpG islands. Epigenetics Chromatin 2018; 11:59. [PMID: 30292235 PMCID: PMC6173865 DOI: 10.1186/s13072-018-0230-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/13/2018] [Accepted: 09/28/2018] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The mechanism by which protein complexes interact to regulate the deposition of post-translational modifications of histones remains poorly understood. This is particularly important at regulatory regions, such as CpG islands (CGIs), which are known to recruit Trithorax (TrxG) and Polycomb group proteins. The CxxC zinc finger protein 1 (CFP1, also known as CGBP) is a subunit of the TrxG SET1 protein complex, a major catalyst of trimethylation of H3K4 (H3K4me3). RESULTS Here, we used ChIP followed by high-throughput sequencing (ChIP-seq) to analyse genomic occupancy of CFP1 in two human haematopoietic cell types. We demonstrate that CFP1 occupies CGIs associated with active transcription start sites (TSSs), and is mutually exclusive with H3K27 trimethylation (H3K27me3), a marker of polycomb repressive complex 2. Strikingly, rather than being restricted to active CGI TSSs, CFP1 also occupies a substantial fraction of active non-CGI TSSs and enhancers of transcribed genes. However, relative to other TrxG subunits, CFP1 was specialised to TSSs. Finally, we found enrichment of CpG-containing DNA motifs in CFP1 peaks at CGI promoters. CONCLUSIONS We found that CFP1 is not solely recruited to CpG islands as it was originally defined, but also other regions including non-CpG island promoters and enhancers.
Affiliation(s)
- Louie N. van de Lagemaat
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- Maria Flenley
- MRC Molecular Haematology Unit, Weatherall Institute for Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS UK
- Magnus D. Lynch
- MRC Molecular Haematology Unit, Weatherall Institute for Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, OX3 9DS UK
- Centre for Stem Cells and Regenerative Medicine, 28th Floor Guy’s Tower, Great Maze Pond, London, SE1 9RT UK
- St John’s Institute of Dermatology, Great Maze Pond, London, SE1 9RT UK
- David Garrick
- INSERM, UMRS-1126, Institut Universitaire d’Hématologie, Université Paris Diderot, 75010 Paris, France
- Simon R. Tomlinson
- MRC Centre for Regenerative Medicine, University of Edinburgh, 5 Little France Drive, Edinburgh, EH16 4UU UK
- Kamil R. Kranc
- MRC Centre for Regenerative Medicine, University of Edinburgh, 5 Little France Drive, Edinburgh, EH16 4UU UK
- Laboratory of Haematopoietic Stem Cell & Leukaemia Biology, Centre for Haemato-Oncology, Barts Cancer Institute, Queen Mary University of London, Charterhouse Square, London, EC1M 6BQ UK
- Douglas Vernimmen
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
5
Jeong H, Wu X, Smith B, Yi SV. Genomic Landscape of Methylation Islands in Hymenopteran Insects. Genome Biol Evol 2018; 10:2766-2776. [PMID: 30239702 PMCID: PMC6195173 DOI: 10.1093/gbe/evy203] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Accepted: 09/13/2018] [Indexed: 01/31/2023] Open
Abstract
Recent genome-wide DNA methylation analyses of insect genomes accentuate an intriguing contrast compared with those in mammals. In mammals, most CpGs are heavily methylated, with the exceptions of clusters of hypomethylated sites referred to as CpG islands. In contrast, DNA methylation in insects is localized to a small number of CpG sites. Here, we refer to clusters of methylated CpGs as “methylation islands (MIs),” and investigate their characteristics in seven hymenopteran insects with high-quality bisulfite sequencing data. Methylation islands were primarily located within gene bodies. They were significantly overrepresented in exon–intron boundaries, indicating their potential roles in splicing. Methylated CpGs within MIs exhibited stronger evolutionary conservation compared with those outside of MIs. Additionally, genes harboring MIs exhibited higher and more stable levels of gene expression compared with those that do not harbor MIs. The effects of MIs on evolutionary conservation and gene expression are independent and stronger than the effect of DNA methylation alone. These results indicate that MIs may be useful to gain additional insights into understanding the role of DNA methylation in gene expression and evolutionary conservation in invertebrate genomes.
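The idea of a methylation island as a cluster of methylated CpGs can be illustrated with a simple gap-based grouping. The abstract does not give the authors' operational definition, so the clustering rule, the `max_gap` and `min_sites` thresholds, and the function name below are all hypothetical.

```python
def find_methylation_islands(methylated_positions, max_gap=200, min_sites=3):
    """Group methylated CpG coordinates into clusters.

    Consecutive sites no more than max_gap bp apart join the same cluster;
    clusters with fewer than min_sites sites are discarded.
    Both thresholds are illustrative assumptions, not values from the paper.
    Returns a list of (first_position, last_position, site_count) tuples.
    """
    islands = []
    current = []
    for pos in sorted(methylated_positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                islands.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        islands.append((current[0], current[-1], len(current)))
    return islands
```

The resulting intervals could then be intersected with gene annotations, for example to ask how often an island spans an exon-intron boundary.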
Affiliation(s)
- Hyeonsoo Jeong
- School of Biological Sciences, Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia
- Xin Wu
- School of Biological Sciences, Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia
- Brandon Smith
- School of Biological Sciences, Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia
- Soojin V Yi
- School of Biological Sciences, Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia
6
Bulla I, Aliaga B, Lacal V, Bulla J, Grunau C, Chaparro C. Notos - a galaxy tool to analyze CpN observed expected ratios for inferring DNA methylation types. BMC Bioinformatics 2018; 19:105. [PMID: 29587630 PMCID: PMC5870242 DOI: 10.1186/s12859-018-2115-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/01/2017] [Accepted: 03/13/2018] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND DNA methylation patterns store epigenetic information in the vast majority of eukaryotic species. The relatively high costs and technical challenges associated with the detection of DNA methylation, however, have created a bias in the number of methylation studies towards model organisms. Consequently, it remains challenging to infer kingdom-wide general rules about the functions and evolutionary conservation of DNA methylation. Methylated cytosine is often found in specific CpN dinucleotides, and the frequency distributions of, for instance, CpG observed/expected (CpG o/e) ratios have been used to infer DNA methylation types based on higher mutability of methylated CpG. RESULTS Predominantly model-based approaches essentially founded on mixtures of Gaussian distributions are currently used to investigate questions related to the number and position of modes of CpG o/e ratios. These approaches require the selection of an appropriate criterion for determining the best model and will fail if empirical distributions are complex or even merely moderately skewed. We use a kernel density estimation (KDE) based technique for robust and precise characterization of complex CpN o/e distributions without a priori assumptions about the underlying distributions. CONCLUSIONS We show that KDE delivers robust descriptions of CpN o/e distributions. For straightforward processing, we have developed a Galaxy tool, called Notos and available at the ToolShed, that calculates these ratios of input FASTA files and fits a density to their empirical distribution. Based on the estimated density, the number and shape of the modes of the distribution are determined, providing a rationale for the prediction of the number and the types of different methylation classes. Notos is written in R and Perl.
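The KDE-based mode detection described in this abstract can be sketched generically. This is not the Notos implementation (which the abstract says is written in R and Perl); the Gaussian kernel, the normal-reference default bandwidth, the grid search for local maxima, and all names are assumptions chosen to keep the example self-contained.

```python
from math import exp, pi, sqrt
from statistics import stdev

def gaussian_kde(values, bandwidth):
    """Return a function estimating the density of `values` with a Gaussian kernel."""
    norm = 1.0 / (len(values) * bandwidth * sqrt(2 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density

def count_modes(values, bandwidth=None, grid_size=200):
    """Count local maxima of the estimated density on an evenly spaced grid."""
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumed default, not Notos's choice.
        bandwidth = 1.06 * stdev(values) * len(values) ** (-1 / 5)
    density = gaussian_kde(values, bandwidth)
    lo = min(values) - 3 * bandwidth
    hi = max(values) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    ys = [density(lo + i * step) for i in range(grid_size)]
    return sum(1 for i in range(1, grid_size - 1) if ys[i - 1] < ys[i] > ys[i + 1])
```

Applied to a list of per-gene CpG o/e ratios, one mode would suggest a single methylation class and two or more modes a mixture of classes. The result depends on the bandwidth, which is why a principled bandwidth choice matters in practice.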
Affiliation(s)
- Ingo Bulla
- Institut für Mathematik und Informatik, Universität Greifswald, Walther-Rathenau-Str. 47, Greifswald, 17487 Germany
- Theoretical Biology and Biophysics, Group T-6, Los Alamos National Laboratory, New Mexico, Los Alamos USA
- Benoît Aliaga
- Univ. Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, Univ. Montpellier, 58 Avenue Paul Alduy, Perpignan, 66860 France
- Virginia Lacal
- Department of Mathematics, University of Bergen, P.O. Box 7803, Bergen, 5020 Norway
- Jan Bulla
- Department of Mathematics, University of Bergen, P.O. Box 7803, Bergen, 5020 Norway
- Christoph Grunau
- Univ. Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, Univ. Montpellier, 58 Avenue Paul Alduy, Perpignan, 66860 France
- Cristian Chaparro
- Univ. Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, Univ. Montpellier, 58 Avenue Paul Alduy, Perpignan, 66860 France
7
Dahlhaus R. Of Men and Mice: Modeling the Fragile X Syndrome. Front Mol Neurosci 2018; 11:41. [PMID: 29599705 PMCID: PMC5862809 DOI: 10.3389/fnmol.2018.00041] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 11/23/2017] [Accepted: 01/31/2018] [Indexed: 12/26/2022] Open
Abstract
The Fragile X Syndrome (FXS) is one of the most common forms of inherited intellectual disability in all human societies. Caused by the transcriptional silencing of a single gene, the fragile X mental retardation gene FMR1, FXS is characterized by a variety of symptoms, which range from mental disabilities to autism and epilepsy. More than 20 years ago, a first animal model was described, the Fmr1 knock-out mouse. Several other models have been developed since then, including conditional knock-out mice, knock-out rats, a zebrafish and a Drosophila model. Using these model systems, various targets for potential pharmaceutical treatments have been identified and many treatments have been shown to be effective in preclinical studies. However, all attempts to turn these findings into a therapy for patients have failed thus far. In this review, I will discuss underlying difficulties and address potential alternatives for our future research.
Affiliation(s)
- Regina Dahlhaus
- Institute for Biochemistry, Emil-Fischer Centre, University of Erlangen-Nürnberg, Erlangen, Germany
8
Dinka H, Le MT. Analysis of Pig Vomeronasal Receptor Type 1 (V1R) Promoter Region Reveals a Common Promoter Motif but Poor CpG Islands. Anim Biotechnol 2017; 29:293-300. [PMID: 29120694 DOI: 10.1080/10495398.2017.1383915] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/18/2023]
Abstract
Promoters are generally located immediately upstream of a transcription start site (TSS) and contain a variety of regulatory elements, such as transcription factor (TF) binding sites and CpG islands (CGIs), that participate in the regulation of gene expression. Here, an analysis of the promoter region of pig vomeronasal receptor type 1 (V1R) genes is described. In the analysis, TSSs for pig V1R genes were first identified, and five motifs (MV1, MV2, MV3, MV4, and MV5) were found that are shared by at least 50% of the pig V1R promoter input sequences from both strands. Among the five motifs, MV2 was identified as a common promoter motif shared by all (100%) pig V1R promoters. To characterize MV2 further and gain deeper biological insight, the TOMTOM web application was used: MV2 was compared against known motif databases (such as JASPAR) to see whether it is similar to a known regulatory motif (transcription factor binding site). This comparison revealed that MV2 serves as the binding site mainly for the BetaBetaAlpha-zinc finger (BTB-ZF) transcription factor gene family to regulate expression of pig V1R genes. Moreover, pig V1R promoters were shown to be CpG poor, suggesting that their expression is regulated in a tissue-specific manner.
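The "shared by at least 50% of promoter input sequences from both strands" criterion can be illustrated with a simplified scan. Motif discovery tools normally work with position weight matrices; the exact-string match, the function names, and the example motif used in the test are hypothetical simplifications for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def contains_motif(promoter, motif):
    """True if the motif occurs on either strand of the promoter sequence."""
    promoter = promoter.upper()
    motif = motif.upper()
    # A match to the reverse complement on the given strand is equivalent
    # to a match to the motif itself on the opposite strand.
    return motif in promoter or reverse_complement(motif) in promoter

def shared_fraction(promoters, motif):
    """Fraction of promoter sequences that contain the motif on either strand."""
    hits = sum(contains_motif(p, motif) for p in promoters)
    return hits / len(promoters)
```

A motif would pass the 50% criterion when `shared_fraction(promoters, motif) >= 0.5`, and a motif found in every promoter, as reported for MV2, corresponds to a fraction of 1.0.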
Affiliation(s)
- Hunduma Dinka
- Department of Applied Biology, School of Applied Natural Sciences, Adama Science and Technology University, Adama, Ethiopia; Department of Animal Biotechnology, Konkuk University, Seoul, South Korea
- Minh Thong Le
- Department of Animal Biotechnology, Konkuk University, Seoul, South Korea
9
Maldonado LL, Assis J, Araújo FMG, Salim ACM, Macchiaroli N, Cucher M, Camicia F, Fox A, Rosenzvit M, Oliveira G, Kamenetzky L. The Echinococcus canadensis (G7) genome: a key knowledge of parasitic platyhelminth human diseases. BMC Genomics 2017; 18:204. [PMID: 28241794 PMCID: PMC5327563 DOI: 10.1186/s12864-017-3574-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 05/21/2016] [Accepted: 02/09/2017] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The parasite Echinococcus canadensis (G7) (phylum Platyhelminthes, class Cestoda) is one of the causative agents of echinococcosis. Echinococcosis is a worldwide chronic zoonosis affecting humans as well as domestic and wild mammals, which has been reported as a prioritized neglected disease by the World Health Organisation. No genomic data, comparative genomic analyses or efficient therapeutic and diagnostic tools are available for this severe disease. The information presented in this study will help to understand the peculiar biological characters and to design species-specific control tools. RESULTS We sequenced, assembled and annotated the 115-Mb genome of E. canadensis (G7). Comparative genomic analyses using whole genome data of three Echinococcus species not only confirmed the status of E. canadensis (G7) as a separate species but also demonstrated a high nucleotide sequences divergence in relation to E. granulosus (G1). The E. canadensis (G7) genome contains 11,449 genes with a core set of 881 orthologs shared among five cestode species. Comparative genomics revealed that there are more single nucleotide polymorphisms (SNPs) between E. canadensis (G7) and E. granulosus (G1) than between E. canadensis (G7) and E. multilocularis. This result was unexpected since E. canadensis (G7) and E. granulosus (G1) were considered to belong to the species complex E. granulosus sensu lato. We described SNPs in known drug targets and metabolism genes in the E. canadensis (G7) genome. Regarding gene regulation, we analysed three particular features: CpG island distribution along the three Echinococcus genomes, DNA methylation system and small RNA pathway. The results suggest the occurrence of yet unknown gene regulation mechanisms in Echinococcus. CONCLUSIONS This is the first work that addresses Echinococcus comparative genomics. The resources presented here will promote the study of mechanisms of parasite development as well as new tools for drug discovery. 
The availability of a high-quality genome assembly is critical for fully exploring the biology of a pathogenic organism. The E. canadensis (G7) genome presented in this study provides a unique opportunity to address the genetic diversity among the genus Echinococcus and its particular developmental features. At present, there is no unequivocal taxonomic classification of Echinococcus species; however, the genome-wide SNPs analysis performed here revealed the phylogenetic distance among these three Echinococcus species. Additional cestode genomes need to be sequenced to be able to resolve their phylogeny.
Affiliation(s)
- Lucas L. Maldonado
- IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
- Juliana Assis
- Genomics and Computational Biology Group, René Rachou Research Center, Oswaldo Cruz Foundation, Belo Horizonte, Brazil
- Flávio M. Gomes Araújo
- Genomics and Computational Biology Group, René Rachou Research Center, Oswaldo Cruz Foundation, Belo Horizonte, Brazil
- Anna C. M. Salim
- Genomics and Computational Biology Group, René Rachou Research Center, Oswaldo Cruz Foundation, Belo Horizonte, Brazil
- Natalia Macchiaroli
- IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
- Marcela Cucher
- IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
- Federico Camicia
- IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
- Adolfo Fox
- IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
- Mara Rosenzvit
- IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
- Guilherme Oliveira
- Genomics and Computational Biology Group, René Rachou Research Center, Oswaldo Cruz Foundation, Belo Horizonte, Brazil
- Instituto Tecnológico Vale, Belém, Brazil
- Laura Kamenetzky
- IMPaM, CONICET, Facultad de Medicina, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
10
Brazel AJ, Vernimmen D. The complexity of epigenetic diseases. J Pathol 2015; 238:333-44. [PMID: 26419725 PMCID: PMC4982038 DOI: 10.1002/path.4647] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 08/10/2015] [Revised: 09/10/2015] [Accepted: 09/21/2015] [Indexed: 12/29/2022]
Abstract
Over the past 30 years, a plethora of pathogenic mutations affecting enhancer regions and epigenetic regulators have been identified. Coupled with more recent genome‐wide association studies (GWAS) and epigenome‐wide association studies (EWAS) implicating major roles for regulatory mutations in disease, it is clear that epigenetic mechanisms represent important biomarkers for disease development and perhaps even therapeutic targets. Here, we discuss the diversity of disease‐causing mutations in enhancers and epigenetic regulators, with a particular focus on cancer. © 2015 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.
Affiliation(s)
- Ailbhe Jane Brazel
- The Roslin Institute, Developmental Biology Division, University of Edinburgh, Easter Bush, Midlothian, UK
- Douglas Vernimmen
- The Roslin Institute, Developmental Biology Division, University of Edinburgh, Easter Bush, Midlothian, UK
11
Tsiagkas G, Nikolaou C, Almirantis Y. Orphan and gene related CpG Islands follow power-law-like distributions in several genomes: evidence of function-related and taxonomy-related modes of distribution. Comput Biol Chem 2014; 53 Pt A:84-96. [PMID: 25242375 DOI: 10.1016/j.compbiolchem.2014.08.013] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
CpG Islands (CGIs) are compositionally defined short genomic stretches, which have been studied in the human, mouse, chicken and later in several other genomes. Initially, they were assigned the role of transcriptional regulation of protein-coding genes, especially the house-keeping ones, while more recently evidence has been found that they are involved in several other functions as well, which might include regulation of the expression of RNA genes, DNA replication, etc. Here, an investigation of their distributional characteristics in a variety of genomes is undertaken for both whole CGI populations and CGI subsets that lie away from known genes (gene-unrelated or "orphan" CGIs). In both cases, power-law-like linearity in double logarithmic scale is found. An evolutionary model, initially put forward to explain a similar pattern found in gene populations, is implemented. It includes segmental duplication events and eliminations of most of the duplicated CGIs, while a moderate rate of non-duplicated CGI eliminations is also applied in some cases. Simulations reproduce all the main features of the observed inter-CGI chromosomal size distributions. Our results on power-law-like linearity found in orphan CGI populations suggest that the observed distributional pattern is independent of the analogous pattern that protein coding segments were reported to follow. The power-law-like patterns in the genomic distributions of CGIs described herein are found to be compatible with several other features of the composition, abundance or functional role of CGIs reported in the current literature across several genomes, on the basis of the proposed evolutionary model.
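The check for power-law-like linearity in double logarithmic scale can be sketched as follows. The use of a cumulative size distribution N(>= S), the ordinary least-squares fit, and all function names are assumptions for illustration; the abstract does not specify the exact fitting procedure.

```python
from math import log10

def inter_island_distances(islands):
    """Gaps between consecutive (start, end) islands on one chromosome."""
    ordered = sorted(islands)
    return [b[0] - a[1] for a, b in zip(ordered, ordered[1:]) if b[0] > a[1]]

def cumulative_size_distribution(sizes):
    """Return sorted (S, N(>=S)) pairs: how many distances are at least S long."""
    ordered = sorted(sizes)
    n = len(ordered)
    counts = {}
    for i, size in enumerate(ordered):
        counts.setdefault(size, n - i)  # first occurrence gives the count of values >= size
    return sorted(counts.items())

def loglog_slope(points):
    """Least-squares slope of log10(y) against log10(x) for (x, y) pairs."""
    xs = [log10(x) for x, _ in points]
    ys = [log10(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

If the points from `cumulative_size_distribution` fall close to a straight line in log-log space, the slope from `loglog_slope` estimates the (negative) exponent of the power-law-like regime. In practice the fit is usually restricted to the range of sizes where linearity actually holds.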
Affiliation(s)
- Giannis Tsiagkas
- Institute of Biosciences and Applications, National Center for Scientific Research "Demokritos", 15310 Athens, Greece
- Christoforos Nikolaou
- Computational Genomics Group, Department of Biology, University of Crete, 71409 Heraklion, Greece
- Yannis Almirantis
- Institute of Biosciences and Applications, National Center for Scientific Research "Demokritos", 15310 Athens, Greece.
12
Liao BY, Chang A. Accumulation of CTCF-binding sites drives expression divergence between tandemly duplicated genes in humans. BMC Genomics 2014; 15 Suppl 1:S8. [PMID: 24564680 PMCID: PMC4046690 DOI: 10.1186/1471-2164-15-s1-s8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND During eukaryotic genome evolution, tandem gene duplication is the most frequent event giving rise to clustered gene families. However, how expression divergence between tandemly duplicated genes has emerged and been maintained remains unclear. In particular, it is unknown if epigenetic regulators have been involved in the process. RESULTS We demonstrate that CCCTC-binding factor (CTCF), the master epigenetic regulator and the only known insulator protein in humans, has played a predominant role in generating divergence in both expression profiles and expression levels between adjacent paralogs in the human genome. This phenomenon was not observed for non-paralogous adjacent genes. After tandem duplication events, CTCF-binding sites gradually accumulate between paralogs. This trend was more prominent for genes involved in particular functions. CONCLUSIONS The accumulation of CTCF-binding sites drives expression divergence of tandemly duplicated genes. This process is likely targeted by natural selection. Our study reveals the importance of CTCF to the evolution of animal diversity and complexity.
13
Abstract
The myelodysplastic syndrome (MDS) is a clonal hematologic disorder that frequently evolves to acute myeloid leukemia (AML). Its pathogenesis remains unclear, but mutations in epigenetic modifiers are common and the disease often responds to DNA methylation inhibitors. We analyzed DNA methylation in the bone marrow and spleen in two mouse models of MDS/AML, the NUP98-HOXD13 (NHD13) mouse and the RUNX1 mutant mouse model. Methylation array analysis showed an average of 512/3445 (14.9%) genes hypermethylated in NHD13 MDS, and 331 (9.6%) genes hypermethylated in RUNX1 MDS. Thirty-two percent of genes in common between the two models (2/3 NHD13 mice and 2/3 RUNX1 mice) were also hypermethylated in at least two of 19 human MDS samples. Detailed analysis of 41 genes in mice showed progressive drift in DNA methylation from young to old normal bone marrow and spleen; to MDS, where we detected accelerated age-related methylation; and finally to AML, which markedly extends DNA methylation abnormalities. Most of these genes showed similar patterns in human MDS and AML. Repeat element hypomethylation was rare in MDS but marked the transition to AML in some cases. Our data show consistency in patterns of aberrant DNA methylation in human and mouse MDS and suggest that epigenetically, MDS displays an accelerated aging phenotype.
|
14
|
Feuerbach L, Halachev K, Assenov Y, Müller F, Bock C, Lengauer T. Analyzing epigenome data in context of genome evolution and human diseases. Methods Mol Biol 2012; 856:431-67. [PMID: 22399470 DOI: 10.1007/978-1-61779-585-5_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/09/2023]
Abstract
This chapter describes bioinformatic tools for analyzing epigenome differences between species and in diseased versus normal cells. We illustrate the interplay of several Web-based tools in a case study of CpG island evolution between human and mouse. Starting from a list of orthologous genes, we use the Galaxy Web service to obtain gene coordinates for both species. These data are further analyzed in EpiGRAPH, a Web-based tool that identifies statistically significant epigenetic differences between genome region sets. Finally, we outline how the use of the statistical programming language R enables deeper insights into the epigenetics of human diseases, which are difficult to obtain without writing custom scripts. In summary, our tutorial describes how Web-based tools provide an easy entry into epigenome data analysis while also highlighting the benefits of learning a scripting language in order to unlock the vast potential of public epigenome datasets.
|
15
|
Heller G, Weinzierl M, Noll C, Babinsky V, Ziegler B, Altenberger C, Minichsdorfer C, Lang G, Döme B, End-Pfützenreuter A, Arns BM, Grin Y, Klepetko W, Zielinski CC, Zöchbauer-Müller S. Genome-Wide miRNA Expression Profiling Identifies miR-9-3 and miR-193a as Targets for DNA Methylation in Non–Small Cell Lung Cancers. Clin Cancer Res 2012; 18:1619-29. [DOI: 10.1158/1078-0432.ccr-11-2450] [Citation(s) in RCA: 134] [Impact Index Per Article: 11.2] [Indexed: 11/16/2022]
|
16
|
Chang AYF, Liao BY. DNA methylation rebalances gene dosage after mammalian gene duplications. Mol Biol Evol 2011; 29:133-44. [PMID: 21821837 DOI: 10.1093/molbev/msr174] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Indexed: 12/22/2022] Open
Abstract
Although gene duplication plays a major role in organismal evolution, it may also lead to gene dosage imbalance, thereby having an immediate adverse effect on an organism's fitness. Investigating the evolution of the expression patterns of genes that duplicated after the divergence of rodents and primates, we confirm that adaptive evolution has been involved in dosage rebalance after gene duplication. To understand mechanisms underlying this process, we examined 1) microRNA (miRNA)-mediated gene regulation, 2) cis-regulatory sequence modifications, and 3) DNA methylation. Neither miRNA-mediated regulation nor cis-regulatory changes was found to be associated with expression reduction of duplicate genes. By contrast, duplicate genes, especially lowly expressed copies, were heavily methylated in the upstream region. However, for duplicate genes encoding proteins that are members of macromolecular complexes, heavy methylation in the genic region was not consistently observed. This result held after controlling potential confounding factors, such as enrichment in functional categories. Our results suggest that during mammalian evolution, DNA methylation plays a dominant role in dosage rebalance after gene duplication by inhibiting transcription initiation of duplicate genes.
Affiliation(s)
- Andrew Ying-Fei Chang
- Division of Biostatistics & Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Taiwan, Republic of China
|
17
|
Li M, Chen SS. The tendency to recreate ancestral CG dinucleotides in the human genome. BMC Evol Biol 2011; 11:3. [PMID: 21208429 PMCID: PMC3025853 DOI: 10.1186/1471-2148-11-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 07/01/2010] [Accepted: 01/05/2011] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND CG dinucleotides are known to be deficient in the human genome, due to a high mutation rate from 5-methylated CG to TG and its complementary pair CA. Meanwhile, many cellular functions rely on these CG dinucleotides, such as gene expression controlled by cytosine methylation status. Thus, CG dinucleotides that provide essential functional substrates should be retained in genomes. How these two conflicting processes regarding the fate of CG dinucleotides (i.e., a high mutation rate destroying CG dinucleotides vs. functional processes that require their preservation) are balanced remains an unsolved question. RESULTS By analyzing the mutation and frequency spectrum of newly derived alleles in the human genome, a tendency towards generating more CGs was observed, which was mainly contributed by an excess number of mutations from CA/TG to CG. Simultaneously, we found a fixation preference for CGs derived from TG/CA rather than CGs generated by other dinucleotides. These tendencies were observed both in intergenic and genic regions. An analysis of Integrated Extended Haplotype Homozygosity provided no evidence of selection for newly derived CGs. CONCLUSIONS Ancestral CG dinucleotides that were subsequently lost by mutation tend to be recreated in the human genome, as indicated by a biased mutation and fixation pattern favoring new CGs that derived from TG/CA.
Affiliation(s)
- Mingkun Li
- CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes of Biological Sciences, Chinese Academy of Sciences, 200000 Shanghai, PR China.
|
18
|
Hutter B, Paulsen M, Helms V. Identifying CpG islands by different computational techniques. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2010; 13:153-64. [PMID: 19196100 DOI: 10.1089/omi.2008.0046] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]
Abstract
CpG islands (CGIs) are generally regarded as important epigenetic regulatory elements due to their association with promoter regions. However, identification of functional CGIs is hampered by repetitive elements and species-specific particularities. Here, we compared the performance of different CGI detection programs on genomic sequences of human and mouse genes. Although mouse CGIs are shorter and G+C poorer than their human counterparts, the different tools tested in our study reliably identify CGIs in promoter regions in both species. Our study confirms that substantially fewer murine than human CGIs coincide with repetitive elements and indicates that such CGIs are subject to accelerated cytosine deamination. In addition, CpG depletion appears to anticorrelate with the epigenetic features of functional regulatory CGIs. Taking into account different deamination rates in unmethylated CGIs versus those in methylated CGIs might support the detection of functional CGIs in other species for which there is little epigenetic information available.
Affiliation(s)
- Barbara Hutter
- Lehrstuhl für Computational Biology, Universität des Saarlandes, Saarbrücken, Germany
|
19
|
Comparative analysis of CpG islands in four fish genomes. Comp Funct Genomics 2010:565631. [PMID: 18483567 PMCID: PMC2375969 DOI: 10.1155/2008/565631] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 02/11/2008] [Accepted: 04/01/2008] [Indexed: 11/17/2022] Open
Abstract
There has been much interest in CpG islands (CGIs), clusters of CpG dinucleotides in GC-rich regions, because they are considered gene markers and involved in gene regulation. To date, there has been no genome-wide analysis of CGIs in the fish genome. We first evaluated the performance of three popular CGI identification algorithms in four fish genomes (tetraodon, stickleback, medaka, and zebrafish). Our results suggest that Takai and Jones' (2002) algorithm is most suitable for comparative analysis of CGIs in the fish genome. Then, we performed a systematic analysis of CGIs in the four fish genomes using Takai and Jones' algorithm, compared to other vertebrate genomes. We found that both the number of CGIs and the CGI density vary greatly among these genomes. Remarkably, each fish genome presents a distinct distribution of CGI density with some genomic factors (e.g., chromosome size and chromosome GC content). These findings are helpful for understanding evolution of fish genomes and the features of fish CGIs.
|
20
|
Abstract
DNA methylation as part of the epigenetic gene-silencing complex is a universally occurring change in lung cancer. Numerous studies have investigated methylation of specific genes in primary tumors, in serum or plasma samples, and in specimens from the aerodigestive tract epithelium of lung cancer patients. In most studies, single genes or small numbers of genes were analyzed. Moreover, it has been observed that methylation of certain genes can already be detected in samples from the upper aerodigestive tract epithelium of cancer-free heavy smokers. These findings indicate that methylation of certain genes may be a useful biomarker for prognosis, disease recurrence, early detection, and lung cancer risk assessment. So far, several genes have been identified whose methylation seems to be associated with a worse prognosis. In addition, it has been shown that a panel of markers may be relevant to predict disease recurrence after surgery. In contrast to the analysis of single or small numbers of genes, methods for genome-wide detection of methylation have been developed recently. These approaches rely either on pharmacological re-activation of methylated genes followed by expression microarray analysis or on microarray analysis of sodium bisulfite-treated or affinity-enriched methylated DNA sequences. With currently available methods for the simultaneous detection of methylation, up to 28,000 CpG islands can be analyzed. Overall, we are just at the beginning of translating these findings into the clinic, and there is hope that future patients will benefit from these results.
|
21
|
Maegawa S, Hinkal G, Kim HS, Shen L, Zhang L, Zhang J, Zhang N, Liang S, Donehower LA, Issa JPJ. Widespread and tissue specific age-related DNA methylation changes in mice. Genome Res 2010; 20:332-40. [PMID: 20107151 DOI: 10.1101/gr.096826.109] [Citation(s) in RCA: 379] [Impact Index Per Article: 27.1] [Indexed: 12/20/2022]
Abstract
Aberrant methylation of promoter CpG islands in cancer is associated with silencing of tumor-suppressor genes, and age-dependent hypermethylation in normal appearing mucosa may be a risk factor for human colon cancer. It is not known whether this age-related DNA methylation phenomenon is specific to human tissues. We performed comprehensive DNA methylation profiling of promoter regions in aging mouse intestine using methylated CpG island amplification in combination with microarray analysis. By comparing C57BL/6 mice at 3-mo-old versus 35-mo-old for 3627 detectable autosomal genes, we found 774 (21%) that showed increased methylation and 466 (13%) that showed decreased methylation. We used pyrosequencing to quantitatively validate the microarray data and confirmed linear age-related methylation changes for all 12 genomic regions examined. We then examined 11 changed genomic loci for age-related methylation in other tissues. Of these, three of 11 showed similar changes in lung, seven of 11 changed in liver, and six of 11 changed in spleen, though to a lower degree than the changes seen in colon. There was partial conservation between age-related hypermethylation in human and mouse intestines, and Polycomb targets in embryonic stem cells were enriched among the hypermethylated genes. Our findings demonstrate a surprisingly high rate of hyper- and hypomethylation as a function of age in normal mouse small intestine tissues and a strong tissue-specificity to the process. We conclude that epigenetic deregulation is a common feature of aging in mammals.
Affiliation(s)
- Shinji Maegawa
- Department of Leukemia, The University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030, USA
|
22
|
Han L, Zhao Z. Contrast features of CpG islands in the promoter and other regions in the dog genome. Genomics 2009; 94:117-24. [PMID: 19409480 PMCID: PMC2729786 DOI: 10.1016/j.ygeno.2009.04.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 11/13/2008] [Revised: 04/21/2009] [Accepted: 04/23/2009] [Indexed: 10/20/2022]
Abstract
The recent release of the domestic dog genome provides us with an ideal opportunity to investigate dog-specific genomic features. In this study, we performed a systematic analysis of CpG islands (CGIs), which are often considered gene markers, in the dog genome. Relative to the human and mouse genomes, the dog genome has a remarkably large number of CGIs and high CGI density, which is contributed by its noncoding sequences. Surprisingly, the dog genome has fewer CGIs associated with the promoter regions of genes than the human or the mouse. Further examination of functional features of dog-human-mouse homologous genes suggests that the dog might have undergone a faster erosion rate of promoter-associated CGIs than the human or mouse. Some genetic or genomic factors such as local recombination rate and karyotype may be related to the unique dog CGI features.
Affiliation(s)
- Leng Han
- Department of Psychiatry and Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Graduate School, Chinese Academy of Sciences, Beijing 100039, China
- Zhongming Zhao
- Department of Psychiatry and Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Department of Human and Molecular Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA 23284, USA
|
23
|
CpG islands: algorithms and applications in methylation studies. Biochem Biophys Res Commun 2009; 382:643-5. [PMID: 19302978 DOI: 10.1016/j.bbrc.2009.03.076] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Received: 03/06/2009] [Accepted: 03/12/2009] [Indexed: 02/04/2023]
Abstract
Methylation occurs frequently at 5'-cytosine of the CpG dinucleotides in vertebrate genomes; however, this epigenetic feature is rarely observed in CpG islands (CGIs) or CpG clusters in the promoter regions of genes. Aberrant methylation of the promoter-associated CGIs might influence gene expression and cause carcinogenesis. Because of this functional importance, multiple algorithms are available for identifying CGIs in a genome or a sequence. They can be categorized into traditional algorithms (e.g., Gardiner-Garden and Frommer (1987), Takai and Jones (2002), and CpGProD (2002)) or statistical property based algorithms (CpGcluster (2006) and CG cluster (2007)). We reviewed the features of these algorithms and evaluated their performance on identifying functional CGIs using genome-wide methylation data. Moreover, identification of CGIs is an initial step in many recent studies for predicting methylation status as well as in the design of methylation detection platforms. We reviewed the benchmarks and features used in these studies.
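As a rough illustration of how the traditional, composition-based algorithms named in this abstract score a candidate region, the sketch below computes GC content and the observed/expected CpG ratio for a sequence and checks them against two threshold sets. The thresholds are the commonly cited values for Gardiner-Garden and Frommer (length of at least 200 bp, GC content of 50%, Obs/Exp of 0.6) and Takai and Jones (500 bp, 55%, 0.65); they are not stated in the abstract itself and are an assumption here, as are the names `CgiCriteria`, `gc_content`, `cpg_obs_exp`, and `meets_criteria`. The original tools also scan a genome with sliding windows, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CgiCriteria:
    """Thresholds for a composition-based CpG island definition (assumed values)."""
    min_length: int      # minimum region length in bp
    min_gc: float        # minimum GC fraction (0..1)
    min_obs_exp: float   # minimum observed/expected CpG ratio


# Commonly cited thresholds; assumptions, not taken from the abstract above.
GARDINER_GARDEN_FROMMER = CgiCriteria(min_length=200, min_gc=0.50, min_obs_exp=0.60)
TAKAI_JONES = CgiCriteria(min_length=500, min_gc=0.55, min_obs_exp=0.65)


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("C") + seq.count("G")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def meets_criteria(seq: str, crit: CgiCriteria) -> bool:
    """True if the whole sequence satisfies all three thresholds."""
    return (
        len(seq) >= crit.min_length
        and gc_content(seq) >= crit.min_gc
        and cpg_obs_exp(seq) >= crit.min_obs_exp
    )
```

For example, `meets_criteria(region, TAKAI_JONES)` applies the stricter definition to a single candidate `region`; the shorter length cutoff of the earlier definition is one reason it tends to report more and smaller islands.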
|
24
|
Han L, Su B, Li WH, Zhao Z. CpG island density and its correlations with genomic features in mammalian genomes. Genome Biol 2008; 9:R79. [PMID: 18477403 PMCID: PMC2441465 DOI: 10.1186/gb-2008-9-5-r79] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.1] [Received: 04/07/2008] [Revised: 04/08/2008] [Accepted: 05/13/2008] [Indexed: 11/25/2022] Open
Abstract
A systematic analysis of CpG islands in ten mammalian genomes suggests that an increase in chromosome number elevates GC content and prevents loss of CpG islands. Background CpG islands, which are clusters of CpG dinucleotides in GC-rich regions, are considered gene markers and represent an important feature of mammalian genomes. Previous studies of CpG islands have largely been on specific loci or within one genome. To date, there seems to be no comparative analysis of CpG islands and their density at the DNA sequence level among mammalian genomes and of their correlations with other genome features. Results In this study, we performed a systematic analysis of CpG islands in ten mammalian genomes. We found that both the number of CpG islands and their density vary greatly among genomes, though many of these genomes encode similar numbers of genes. We observed significant correlations between CpG island density and genomic features such as number of chromosomes, chromosome size, and recombination rate. We also observed a trend of higher CpG island density in telomeric regions. Furthermore, we evaluated the performance of three computational algorithms for CpG island identifications. Finally, we compared our observations in mammals to other non-mammal vertebrates. Conclusion Our study revealed that CpG islands vary greatly among mammalian genomes. Some factors such as recombination rate and chromosome size might have influenced the evolution of CpG islands in the course of mammalian evolution. Our results suggest a scenario in which an increase in chromosome number increases the rate of recombination, which in turn elevates GC content to help prevent loss of CpG islands and maintain their density. These findings should be useful for studying mammalian genomes, the role of CpG islands in gene function, and molecular evolution.
Affiliation(s)
- Leng Han
- Department of Psychiatry, Virginia Commonwealth University, Richmond, VA 23298, USA.
|
25
|
DNA sequence and structural properties as predictors of human and mouse promoters. Gene 2007; 410:165-76. [PMID: 18234453 PMCID: PMC2672154 DOI: 10.1016/j.gene.2007.12.011] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Received: 05/23/2007] [Revised: 11/30/2007] [Accepted: 12/05/2007] [Indexed: 11/21/2022]
Abstract
Promoters play a central role in gene regulation, yet our power to discriminate them from non-promoter sequences in higher eukaryotes is mainly restricted to those associated with CpG islands. Here, we examined in silico the promoters of 30,954 human and 18,083 mouse transcripts in the DBTSS database, to assess the impact of particular sequence and structural features (propeller twist, bendability and nucleosome positioning preference) on promoter classification and prediction. Our analysis showed that a stricter-than-traditional definition of CpG islands captures low and high CpG count promoter classes more accurately than the traditional one. We observed that both human and mouse promoter sequences are flexible with the exception of the TATA box and TSS, which are rigid regions irrespective of association with a CpG island. Therefore varying levels of structural flexibility in promoters may affect their accessibility to proteins, and hence their specificity. For all features investigated, averaged values across core promoters discriminated CpG island associated promoters from background, whereas the same did not hold for promoters without a CpG island. However, local changes around -34 to -23 (expected position of TATA box) and the TSS were informative in discriminating promoters (both classes) from non-promoter sequences. Additionally, we investigated ATG deserts and observed that they occur in all promoter sets except those with a TATA-box and without a CpG island in human. Interestingly, all mouse promoter sets showed ATG codon depletion irrespective of the presence of a TATA-box, possibly reflecting a weaker contribution to TSS specificity in mouse.
|
26
|
Sakamoto H, Suzuki M, Abe T, Hosoyama T, Himeno E, Tanaka S, Greally JM, Hattori N, Yagi S, Shiota K. Cell type-specific methylation profiles occurring disproportionately in CpG-less regions that delineate developmental similarity. Genes Cells 2007; 12:1123-32. [PMID: 17903172 DOI: 10.1111/j.1365-2443.2007.01120.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Indexed: 02/05/2023]
Abstract
Our previous studies using restriction landmark genomic scanning (RLGS) defined tissue- or cell-specific DNA methylation profiles. It remains to be determined whether the DNA sequence compositions in the genomic contexts of the NotI loci tested by RLGS influence their tendency to change with differentiation. We carried out 3834 methylation measurements consisting of 213 NotI loci in the mouse genome in 18 different tissues and cell types, using quantitative real-time PCR based on a Virtual image rlgs database. Loci were categorized as CpG islands or other, and as unique or repetitive sequences, each category being associated with a variety of methylation categories. Strikingly, the tissue-dependently and differentially methylated regions (T-DMRs) were disproportionately distributed in the non-CpG island loci. These loci were located not only in 5'-upstream regions of genes but also in intronic and non-genic regions. Hierarchical clustering of the methylation profiles could be used to define developmental similarity and cellular phenotypes. The results show that distinctive tissue- and cell type-specific methylation profiles by RLGS occur mostly at NotI sites located at non-CpG island sequences, which delineate developmental similarity of different cell types. The finding indicates the power of NotI methylation profiles in evaluating the relatedness of different cell types.
Affiliation(s)
- Hideki Sakamoto
- Cellular Biochemistry, Animal Resource Sciences/Veterinary Medical Sciences, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan
|
27
|
Glass JL, Thompson RF, Khulan B, Figueroa ME, Olivier EN, Oakley EJ, Van Zant G, Bouhassira EE, Melnick A, Golden A, Fazzari MJ, Greally JM. CG dinucleotide clustering is a species-specific property of the genome. Nucleic Acids Res 2007; 35:6798-807. [PMID: 17932072 PMCID: PMC2175314 DOI: 10.1093/nar/gkm489] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.9] [Indexed: 11/15/2022] Open
Abstract
Cytosines at cytosine-guanine (CG) dinucleotides are the near-exclusive target of DNA methyltransferases in mammalian genomes. Spontaneous deamination of methylcytosine to thymine makes methylated cytosines unusually susceptible to mutation and consequent depletion. The loci where CG dinucleotides remain relatively enriched, presumably due to their unmethylated status during the germ cell cycle, have been referred to as CpG islands. Currently, CpG islands are solely defined by base compositional criteria, allowing annotation of any sequenced genome. Using a novel bioinformatic approach, we show that CG clusters can be identified as an inherent property of genomic sequence without imposing a base compositional a priori assumption. We also show that the CG clusters co-localize in the human genome with hypomethylated loci and annotated transcription start sites to a greater extent than annotations produced by prior CpG island definitions. Moreover, this new approach allows CG clusters to be identified in a species-specific manner, revealing a degree of orthologous conservation that is not revealed by current base compositional approaches. Finally, our approach is able to identify methylating genomes (such as Takifugu rubripes) that lack CG clustering entirely, in which it is inappropriate to annotate CpG islands or CG clusters.
Affiliation(s)
- Jacob L. Glass
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Reid F. Thompson
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Batbayar Khulan
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Maria E. Figueroa
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Emmanuel N. Olivier
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Erin J. Oakley
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Gary Van Zant
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Eric E. Bouhassira
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Ari Melnick
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Aaron Golden
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Melissa J. Fazzari
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - John M. Greally
- Department of Molecular Genetics, Department of Developmental and Molecular Biology and Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA, Division of Hematology/Oncology, University of Kentucky, Markey Cancer Center, 800 Rose Street, Lexington KY 40536, USA, Department of Medicine (Hematology), Albert Einstein College of Medicine, Bronx, NY 10461, USA, Department of Information Technology, National University of Ireland Galway, Newcastle Road, Galway, Republic of Ireland and Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- *To whom correspondence should be addressed. +1 718 430 2875; +1 718 824 3153
|
28
|
Chupov VS, Punina EO, Machs EM, Rodionov AV. Nucleotide composition and CpG and CpNpG content of ITS1, ITS2, and the 5.8S rRNA in representatives of the phylogenetic branches melanthiales-liliales and melanthiales-asparagales (Angiospermae, Monocotyledones) reflect the specifics of their evolution. Mol Biol 2007. [DOI: 10.1134/s002689330705007x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
|
29
|
Jiang C, Han L, Su B, Li WH, Zhao Z. Features and Trend of Loss of Promoter-Associated CpG Islands in the Human and Mouse Genomes. Mol Biol Evol 2007; 24:1991-2000. [PMID: 17591602 DOI: 10.1093/molbev/msm128] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Indexed: 12/28/2022] Open
Abstract
CpG islands (CGIs) are often considered as gene markers, but the number of CGIs varies among mammalian genomes that have similar numbers of genes. In this study, we investigated the distribution of CGIs in the promoter regions of 3,197 human-mouse orthologous gene pairs and found that the mouse genome has notably fewer CGIs in the promoter regions and less pronounced CGI characteristics than does the human genome. We further inferred the ancestral state of CGIs using the dog genome as a reference and examined the nucleotide substitution pattern and the mutational direction in the conserved regions of human and mouse CGIs. The results reveal many losses of CGIs in both genomes, but the loss rate in the mouse lineage is two to four times the rate in the human lineage. We found an intriguing feature of CGI loss, namely that the loss of a CGI usually starts from erosion at both edges and gradually moves towards the center. We found functional bias in the genes that have lost promoter-associated CGIs in the human or mouse lineage. Finally, our analysis indicates that the association of CGIs with housekeeping genes is not as strong as previously estimated. Our study provides a detailed view of the evolution of promoter-associated CGIs in the human and mouse genomes and our findings are helpful for understanding the evolution of mammalian genomes and the role of CGIs in gene function.
Affiliation(s)
- Cizhong Jiang
- Department of Psychiatry and Center for the Study of Biological Complexity, Virginia Commonwealth, USA
|
30
|
Smith B, Fang H, Pan Y, Walker PR, Famili AF, Sikorska M. Evolution of motif variants and positional bias of the cyclic-AMP response element. BMC Evol Biol 2007; 7 Suppl 1:S15. [PMID: 17288573 PMCID: PMC1796609 DOI: 10.1186/1471-2148-7-s1-s15] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022] Open
Abstract
Background: Transcription factors regulate gene expression by interacting with their specific DNA binding sites. Some transcription factors, particularly those involved in transcription initiation, always bind close to transcription start sites (TSS). Others have no such preference and are functional on sites even tens of thousands of base pairs (bp) away from the TSS. The Cyclic-AMP response element (CRE) binding protein (CREB) binds preferentially to a palindromic sequence (TGACGTCA), known as the canonical CRE, and also to other CRE variants. CREB can activate transcription at CREs thousands of bp away from the TSS, but in mammals CREs are found far more frequently within 1 to 150 bp upstream of the TSS than in any other region. This property is termed positional bias. The strength of CREB binding to DNA is dependent on the sequence of the CRE motif. The central CpG dinucleotide in the canonical CRE (TGACGTCA) is critical for strong binding of CREB dimers. Methylation of the cytosine in the CpG can inhibit binding of CREB. Deamination of the methylated cytosines causes a C to T transition, resulting in a functional, but lower affinity CRE variant, TGATGTCA. Results: We performed genome-wide surveys of CREs in a number of species (from worm to human) and showed that only vertebrates exhibited a CRE positional bias. We performed pair-wise comparisons of human CREs with orthologous sequences in mouse, rat and dog genomes and found that canonical and TGATGTCA variant CREs are highly conserved in mammals. However, when orthologous sequences differ, canonical CREs in human are most frequently TGATGTCA in the other species and vice-versa. We have identified 207 human CREs showing such differences. Conclusion: Our data suggest that the positional bias of CREs likely evolved after the separation of urochordata and vertebrata. Although many canonical CREs are conserved among mammals, there are a number of orthologous genes that have canonical CREs in one species but the TGATGTCA variant in another. These differences are likely due to deamination of the methylated cytosines in the CpG and may contribute to differential transcriptional regulation among orthologous genes.
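The deamination step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function name `deaminate_cpg` is hypothetical, and the second variant it produces (CG to CA, modelling a C to T change on the opposite strand) is an assumption added for illustration, since the abstract names only TGATGTCA.

```python
CANONICAL_CRE = "TGACGTCA"  # canonical CRE, as given in the abstract


def deaminate_cpg(motif: str) -> set[str]:
    """Return motif variants produced by a C->T transition at each CpG.

    Assumption: a change on the forward strand turns CG into TG, and a
    change on the opposite strand is read on the forward strand as CG -> CA.
    """
    variants = set()
    for i in range(len(motif) - 1):
        if motif[i:i + 2] == "CG":
            variants.add(motif[:i] + "TG" + motif[i + 2:])
            variants.add(motif[:i] + "CA" + motif[i + 2:])
    return variants


variants = deaminate_cpg(CANONICAL_CRE)
```

Here `variants` contains TGATGTCA, the lower-affinity variant named in the abstract, together with TGACATCA, which is its reverse complement.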
Affiliation(s)
- Brandon Smith
- Neurogenomics Group, Institute for Biological Sciences, National Research Council of Canada, Ottawa, Ontario, Canada
| | - Hung Fang
- Glycosyltransferases and Neuroglycomics Group, Institute for Biological Sciences, National Research Council of Canada, Ottawa, Ontario, Canada
| | - Youlian Pan
- Integrated Reasoning Group, Institute for Information Technology, National Research Council of Canada, Ottawa, Ontario, Canada
| | - P Roy Walker
- Neurogenomics Group, Institute for Biological Sciences, National Research Council of Canada, Ottawa, Ontario, Canada
| | - A Fazel Famili
- Integrated Reasoning Group, Institute for Information Technology, National Research Council of Canada, Ottawa, Ontario, Canada
| | - Marianna Sikorska
- Neurogenesis and Brain Repair Group, Institute for Biological Sciences, National Research Council of Canada, Ottawa, Ontario, Canada
|
31
|
Li W, Miramontes P. Large-scale oscillation of structure-related DNA sequence features in human chromosome 21. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2006; 74:021912. [PMID: 17025477 DOI: 10.1103/physreve.74.021912] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 03/24/2006] [Indexed: 05/12/2023]
Abstract
Human chromosome 21 is the only chromosome in the human genome that exhibits oscillation of the (G+C) content with a cycle length of hundreds of kilobases (kb) (500 kb near the right telomere). We aim at establishing the existence of a similar periodicity in structure-related sequence features in order to relate this (G+C)% oscillation to other biological phenomena. The following quantities are shown to oscillate with the same 500 kb periodicity in human chromosome 21: binding energy calculated by two sets of dinucleotide-based thermodynamic parameters, AA/TT and AAA/TTT bi- and tri-nucleotide density, 5'-TA-3' dinucleotide density, and signal for 10- or 11-base periodicity of AA/TT or AAA/TTT. These intrinsic quantities are related to structural features of the double helix of DNA molecules, such as base-pair binding, untwisting or unwinding, stiffness, and a putative tendency for nucleosome formation.
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore LIJ Health System, 350 Community Drive, Manhasset, New York 11030, USA.
|
32
|
Das R, Dimitrova N, Xuan Z, Rollins RA, Haghighi F, Edwards JR, Ju J, Bestor TH, Zhang MQ. Computational prediction of methylation status in human genomic sequences. Proc Natl Acad Sci U S A 2006; 103:10713-6. [PMID: 16818882 PMCID: PMC1502297 DOI: 10.1073/pnas.0602949103] [Citation(s) in RCA: 134] [Impact Index Per Article: 7.4] [Indexed: 01/21/2023] Open
Abstract
Epigenetic effects in mammals depend largely on heritable genomic methylation patterns. We describe a computational pattern recognition method that is used to predict the methylation landscape of human brain DNA. This method can be applied both to CpG islands and to non-CpG island regions. It computes the methylation propensity for an 800-bp region centered on a CpG dinucleotide based on specific sequence features within the region. We tested several classifiers for classification performance, including K-means clustering, linear discriminant analysis, logistic regression, and support vector machine. The best performing classifier used the support vector machine approach. Our program (called hdfinder) presently has a prediction accuracy of 86%, as validated with CpG regions for which methylation status has been experimentally determined. Using hdfinder, we have depicted the entire genomic methylation patterns for all 22 human autosomes.
Affiliation(s)
- Rajdeep Das
- *Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
| | | | - Zhenyu Xuan
- *Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
| | - Robert A. Rollins
- Department of Genetics and Development, College of Physicians and Surgeons of Columbia University, New York, NY 10032; and
| | | | - John R. Edwards
- Columbia Genome Center and
- Department of Chemical Engineering, Columbia University, New York, NY 10032
| | - Jingyue Ju
- Columbia Genome Center and
- Department of Chemical Engineering, Columbia University, New York, NY 10032
| | - Timothy H. Bestor
- Department of Genetics and Development, College of Physicians and Surgeons of Columbia University, New York, NY 10032; and
| | - Michael Q. Zhang
- *Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
- To whom correspondence should be addressed. E-mail:
|
33
|
Hughes JR, Cheng JF, Ventress N, Prabhakar S, Clark K, Anguita E, De Gobbi M, de Jong P, Rubin E, Higgs DR. Annotation of cis-regulatory elements by identification, subclassification, and functional assessment of multispecies conserved sequences. Proc Natl Acad Sci U S A 2005; 102:9830-5. [PMID: 15998734 PMCID: PMC1174996 DOI: 10.1073/pnas.0503401102] [Citation(s) in RCA: 110] [Impact Index Per Article: 5.8] [Indexed: 01/25/2023] Open
Abstract
An important step toward improving the annotation of the human genome is to identify cis-acting regulatory elements from primary DNA sequence. One approach is to compare sequences from multiple, divergent species. This approach distinguishes multispecies conserved sequences (MCS) in noncoding regions from more rapidly evolving neutral DNA. Here, we have analyzed a region of approximately 238 kb containing the human alpha globin cluster that was sequenced and/or annotated across the syntenic region in 22 species spanning 500 million years of evolution. Using a variety of bioinformatic approaches and correlating the results with many aspects of chromosome structure and function in this region, we were able to identify and evaluate the importance of 24 individual MCSs. This approach sensitively and accurately identified previously characterized regulatory elements but also discovered unidentified promoters, exons, splicing, and transcriptional regulatory elements. Together, these studies demonstrate an integrated approach by which to identify, subclassify, and predict the potential importance of MCSs.
Affiliation(s)
- Jim R Hughes
- Medical Research Council Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford OX3 9DS, United Kingdom
|
34
|
Abstract
The genomes from three mammals (human, mouse, and rat), two worms, and several yeasts have been sequenced, and more genomes will be completed in the near future for comparison with those of the major model organisms. Scientists have used various methods to align and compare the sequenced genomes to address critical issues in genome function and evolution. This review covers some of the major new insights about gene content, gene regulation, and the fraction of mammalian genomes that are under purifying selection and presumed functional. We review the evolutionary processes that shape genomes, with particular attention to variation in rates within genomes and along different lineages. Internet resources for accessing and analyzing the treasure trove of sequence alignments and annotations are reviewed, and we discuss critical problems to address in new bioinformatic developments in comparative genomics.
Affiliation(s)
- Webb Miller
- The Center for Comparative Genomics and Bioinformatics, The Huck Institutes of Life Sciences, Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA.
|
35
|
Li W, Bernaola-Galván P, Haghighi F, Grosse I. Applications of recursive segmentation to the analysis of DNA sequences. COMPUTERS & CHEMISTRY 2002; 26:491-510. [PMID: 12144178 DOI: 10.1016/s0097-8485(02)00010-4] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.9] [Indexed: 11/23/2022]
Abstract
Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G + C)/weak(A + T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions.
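The partitioning idea in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the published procedure: it scores each candidate cut by the Jensen-Shannon divergence between the symbol compositions of the two sides, and it stops recursing at a fixed divergence `threshold` instead of a statistical stopping criterion. The names `best_split` and `segment` and the default threshold of 0.3 are hypothetical.

```python
import math
from collections import Counter


def entropy(counts: Counter, n: int) -> float:
    """Shannon entropy (bits) of a symbol composition with n symbols."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c)


def best_split(seq: str):
    """Return (index, divergence) of the cut maximizing the divergence
    between the left and right compositions; index is None if no cut helps."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left = Counter()
    best_i, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps positive counts only
        d = (h_total
             - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


def segment(seq: str, threshold: float = 0.3, offset: int = 0) -> list[int]:
    """Recursively cut seq; return sorted cut positions in the original sequence."""
    i, d = best_split(seq)
    if i is None or d < threshold:
        return []
    return (segment(seq[:i], threshold, offset)
            + [offset + i]
            + segment(seq[i:], threshold, offset + i))


borders = segment("AAAACCCC")
```

The same functions work on any converted sequence mentioned in the abstract, for example a binary strong/weak string in which G and C are recoded as `S` and A and T as `W`.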
Affiliation(s)
- Wentian Li
- Center for Genomics and Human Genetics, North Shore-LIJ Research Institute, Manhasset, NY 11030, USA.
|
36
|
Ponger L, Duret L, Mouchiroud D. Determinants of CpG islands: expression in early embryo and isochore structure. Genome Res 2001; 11:1854-60. [PMID: 11691850 PMCID: PMC311164 DOI: 10.1101/gr.174501] [Citation(s) in RCA: 87] [Impact Index Per Article: 3.8] [Indexed: 01/14/2023]
Abstract
In an attempt to understand the origin of CpG islands (CGIs) in mammalian genomes, we have studied their location and structure according to the expression pattern of genes and to the G + C content of isochores in which they are embedded. We show that CGIs located over the transcription start site (named start CGIs) are very different structurally from the others (named no-start CGIs): (1) 61.6% of the no-start CGIs are due to repeated sequences (79% are due to Alus), whereas only 5.6% of the start CGIs are due to such repeats; (2) start CGIs are longer and display a higher CpG o/e ratio and G + C level than no-start CGIs. The frequency of tissue-specific genes associated to a start CGI varies according to the genomic G + C content, from 25% in G + C-poor isochores to 64% in G + C-rich isochores. Conversely, the frequency of housekeeping genes associated to a start CGI (90%) is independent of the isochore context. Interestingly, the structure of start CGIs is very similar for tissue-specific and housekeeping genes. Moreover, 93% of genes expressed in early embryo are found to exhibit a CpG island over their transcription start point. These observations are consistent with the hypothesis that the occurrence of these CGIs is the consequence of gene expression at this stage, when the methylation pattern is installed.
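The two quantities compared in this abstract, the CpG observed/expected (o/e) ratio and the G + C level, can be computed as in the minimal Python sketch below. The abstract does not state which formula was used; the sketch assumes the widely used definition o/e = (number of CpG × sequence length) / (number of C × number of G). The function name `cpg_stats` is hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (G+C fraction, CpG observed/expected ratio) for a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed definition: o/e = (#CpG * N) / (#C * #G); 0.0 if C or G is absent
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


gc_fraction, obs_exp = cpg_stats("CGCGCGCG")
```

For the toy sequence above, the G + C fraction is 1.0 and the o/e ratio is 2.0, since 4 CpGs × 8 bases divided by 4 C × 4 G gives 2.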
Affiliation(s)
- L Ponger
- Laboratoire de Biométrie et Biologie Evolutive, Unité Mixte de Recherche Centre National de la Recherche Scientifique 5558-Université Claude Bernard, 69622 Villeurbanne Cedex, France.
|
37
|
Galgóczy P, Rosenthal A, Platzer M. Human-mouse comparative sequence analysis of the NEMO gene reveals an alternative promoter within the neighboring G6PD gene. Gene 2001; 271:93-8. [PMID: 11410370 DOI: 10.1016/s0378-1119(01)00492-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
NEMO (NFkappaB essential modulator) is a non-catalytic subunit of the cytokine-dependent IkappaB kinase complex that is involved in activation of the transcription factor NFkappaB. The human NEMO gene maps to Xq28 and is arranged head to head with the proximal G6PD gene. Mutations in NEMO have recently been associated with Incontinentia Pigmenti (Smahi et al., Nature 405 (2000) 466), an X-linked dominant disorder. Three alternative transcripts with different non-coding 5' exons (1a, 1b and 1c) of NEMO have been described. In order to identify regulatory elements that control alternative transcription we have established the complete genomic sequence of the murine orthologs Nemo and G6pdx. Sequence comparison suggests the presence of two alternative promoters for NEMO/Nemo. First, a CpG island is shared by both genes driving expression of the NEMO/Nemo transcripts containing exons 1b and 1c in one direction and the housekeeping gene G6PD/G6pdx in the opposite direction. In contrast to human, an additional variant of exon 1c, named 1c+, was identified in several tissues of the mouse. This larger exon utilizes an alternative donor site located 1594 bp within intron 1c. The putative second promoter for NEMO/Nemo transcripts starting with exon 1a is unidirectional, and not associated with a CpG island. Surprisingly, this promoter is located in the second intron of G6PD/G6pdx. It shows very low basal activity and may be involved in stress/time- and/or tissue-dependent expression of NEMO. To our knowledge, an overlapping gene order similar to the G6PD/NEMO complex has not been described before.
Affiliation(s)
- P Galgóczy
- Institut für Molekulare Biotechnologie, Abt. Genomanalyse, Beutenbergstrasse 11, 07745, Jena, Germany
|
38
|
Iida K, Akashi H. A test of translational selection at 'silent' sites in the human genome: base composition comparisons in alternatively spliced genes. Gene 2000; 261:93-105. [PMID: 11164041 DOI: 10.1016/s0378-1119(00)00482-0] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.3] [Indexed: 11/21/2022]
Abstract
Natural selection appears to discriminate among synonymous codons to enhance translational efficiency in a wide range of prokaryotes and eukaryotes. Codon bias is strongly related to gene expression levels in these species. In addition, between-gene variation in silent DNA divergence is inversely correlated with codon bias. However, in mammals, between-gene comparisons are complicated by distinctive nucleotide-content bias (isochores) throughout the genome. In this study, we attempted to identify translational selection by analyzing the DNA sequences of alternatively spliced genes in humans and in Drosophila melanogaster. Among codons in an alternatively spliced gene, those in constitutively expressed exons are translated more often than those in alternatively spliced exons. Thus, translational selection should act more strongly to bias codon usage and reduce silent divergence in constitutive than in alternative exons. By controlling for regional forces affecting base-composition evolution, this within-gene comparison makes it possible to detect codon selection at synonymous sites in mammals. We found that GC-ending codons are more abundant in constitutive than alternatively spliced exons in both Drosophila and humans. Contrary to our expectation, however, silent DNA divergence between mammalian species is higher in constitutive than in alternative exons.
Affiliation(s)
- K Iida
- Institute of Molecular Evolutionary Genetics, Department of Biology, 208 Mueller Laboratory, The Pennsylvania State University, University Park, PA 16802, USA
|
39
|
Abstract
The compositional evolution of vertebrate genomes is characterized: (i) by one predominant conservative mode, in which nucleotide changes occur, but the base composition of DNA sequences in general, and of coding sequences in particular, does not change; and (ii) by three different shifting or transitional modes, in which nucleotide changes are accompanied by changes in the base composition of sequences. Investigations on these evolutionary modes have shed new light on a central problem in molecular evolution, namely the role played by natural selection in modulating the mutational input. This review will present first the intragenomic shifts, the 'major shifts' and the 'minor shift', and then the 'whole-genome', or 'horizontal', shift. In each case, the shifts were preceded and followed by a conservative mode of evolution. This review expands on a previous one [Bernardi, Gene 241 (2000) 3-17], and summarizes the evidence that the changes of the compositional patterns of the genome and their maintenance are controlled by Darwinian natural selection.
Affiliation(s)
- G Bernardi
- Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Napoli 80121, Italy.
|
40
|
Douady C, Carels N, Clay O, Catzeflis F, Bernardi G. Diversity and phylogenetic implications of CsCl profiles from rodent DNAs. Mol Phylogenet Evol 2000; 17:219-30. [PMID: 11083936 DOI: 10.1006/mpev.2000.0838] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
Abstract
Buoyant density profiles of high-molecular-weight DNAs sedimented in CsCl gradients, i.e., compositional distributions of 50- to 100-kb genomic fragments, have revealed a clear difference between the murids so far studied and most other mammals, including other rodents. Sequence analyses have revealed other, related, compositional differences between murids and nonmurids. In the present study, we obtained CsCl profiles of 17 rodent species representing 13 families. The modal buoyant densities obtained for rodents span the full range of values observed in other eutherians. More remarkably, the skewness (asymmetry, mean - modal buoyant density) of the rodent profiles extends to values well below those of other eutherians. Scatterplots of these and related CsCl profile parameters show groups of rodent families that agree largely with established rodent taxonomy, in particular with the monophyly of the Geomyoidea superfamily and the position of the Dipodidae family within the Myomorpha. In contrast, while confirming and extending previously reported differences between the profiles of Myomorpha and those of other rodents, the CsCl data question a traditional hypothesis positing Gliridae within Myomorpha, as does the recently sequenced mitochondrial genome of dormouse. Analysis of CsCl profiles is presented here as a rapid, robust method for exploring rodent and other vertebrate systematics.
Affiliation(s)
- C Douady
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Tour 43, 2 Place Jussieu, Paris, F-75005, France
|
41
|
der Maur AA, Belser T, Wang Y, Günes C, Lichtlen P, Georgiev O, Schaffner W. Characterization of the mouse gene for the heavy metal-responsive transcription factor MTF-1. Cell Stress Chaperones 2000; 5:196-206. [PMID: 11005378 PMCID: PMC312886 DOI: 10.1379/1466-1268(2000)005<0196:cotmgf>2.0.co;2] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Received: 12/07/1999] [Revised: 02/22/2000] [Accepted: 02/23/2000] [Indexed: 11/24/2022] Open
Abstract
MTF-1 is a zinc finger transcription factor that mediates the cellular response to heavy metal stress; its targeted disruption in the mouse leads to liver decay and embryonic lethality at day E14. Recently, we have sequenced the entire MTF-1 gene in the compact genome of the pufferfish Fugu rubripes. Here we have defined the promoter sequences of human and mouse MTF-1 and the genomic structure of the mouse MTF-1 locus. The transcription unit of MTF-1 spans 42 kb (compared to 8.5 kb in Fugu) and is located downstream of the gene for a phosphatase (INPP5P) in mouse, human, and fish. In all of these species, the MTF promoter region has the features of a CpG island. In both mouse and human, the 5' untranslated region harbors conserved short reading frames of unknown function. RNA mapping experiments revealed that in these two species, MTF-1 mRNA is transcribed from a cluster of multiple initiation sites from a TATA-less promoter without metal-responsive elements. Transcription from endogenous and transfected MTF-1 promoters was not affected by heavy metal load or other stressors, in support of the notion that MTF-1 activity is regulated at the posttranscriptional level. Tissue Northern blots normalized for poly A+ RNA indicate that MTF-1 is expressed at similar levels in all tissues, except in the testes, that contain more than 10-fold higher mRNA levels.
Affiliation(s)
- Adrian Auf der Maur
- Institute of Molecular Biology, University of Zürich, Winterthurer St. 190, CH-8057 Zürich, Switzerland
| | - Tanja Belser
- Institute of Molecular Biology, University of Zürich, Winterthurer St. 190, CH-8057 Zürich, Switzerland
| | - Ying Wang
- Institute of Molecular Biology, University of Zürich, Winterthurer St. 190, CH-8057 Zürich, Switzerland
| | - Cagatay Günes
- Institute of Molecular Biology, University of Zürich, Winterthurer St. 190, CH-8057 Zürich, Switzerland
| | - Peter Lichtlen
- Institute of Molecular Biology, University of Zürich, Winterthurer St. 190, CH-8057 Zürich, Switzerland
| | - Oleg Georgiev
- Institute of Molecular Biology, University of Zürich, Winterthurer St. 190, CH-8057 Zürich, Switzerland
| | - Walter Schaffner
- Institute of Molecular Biology, University of Zürich, Winterthurer St. 190, CH-8057 Zürich, Switzerland
- Correspondence to: W. Schaffner, Tel: +41-1-635-3151; Fax: +41-1-635-6811.
|
42
|
Stuart GR, Glickman BW. Through a glass, darkly: reflections of mutation from lacI transgenic mice. Genetics 2000; 155:1359-67. [PMID: 10880494 PMCID: PMC1461138 DOI: 10.1093/genetics/155.3.1359] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022] Open
Abstract
The study of mutational frequency (Mf) and specificity in aging Big Blue lacI transgenic mice provides a unique opportunity to determine mutation rates (MR) in vivo in different tissues. We found that MR are not static, but rather, vary with the age or developmental stage of the tissue. Although Mf increase more rapidly early in life, MR are actually lower in younger animals than in older animals. For example, we estimate that the changes in Mf are 4.9 × 10⁻⁸ and 1.1 × 10⁻⁸ mutations/base pair/month in the livers of younger mice (<1.5 months old) and older mice (≥1.5 months old), respectively (a 4-fold decrease), and that the MR are 3.9 × 10⁻⁹ and 1.3 × 10⁻⁷ mutations/base pair/cell division, respectively (approximately 30-fold increase). These data also permit an estimate of the MR of GC → AT transitions occurring at 5'-CpG-3' (CpG) dinucleotide sequences. Subsequently, the contribution of these transitions to age-related demethylation of genomic DNA can be evaluated. Finally, to better understand the origin of observed Mf, we consider the contribution of various factors, including DNA damage and repair, by constructing a descriptive mutational model. We then apply this model to estimate the efficiency of repair of deaminated 5-methylcytosine nucleosides occurring at CpG dinucleotide sequences, as well as the influence of the Msh2(-/-) DNA repair defect on overall DNA repair efficiency in Big Blue mice. We conclude that even slight changes in DNA repair efficiency could lead to significant increases in mutation frequencies, potentially contributing significantly to human pathogenesis, including cancer.
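The fold changes quoted in this abstract follow directly from the four rates it reports, as the short Python check below shows. The variable names and the helper `fold_change` are hypothetical; only the four numeric values come from the abstract.

```python
def fold_change(a: float, b: float) -> float:
    """Ratio a / b, used here to compare two rates."""
    return a / b


# Change in mutational frequency (mutations/base pair/month), liver
mf_young, mf_old = 4.9e-8, 1.1e-8
# Mutation rate (mutations/base pair/cell division), liver
mr_young, mr_old = 3.9e-9, 1.3e-7

mf_decrease = fold_change(mf_young, mf_old)  # about 4.5, reported as 4-fold
mr_increase = fold_change(mr_old, mr_young)  # about 33, reported as ~30-fold
```

The two ratios point in opposite directions: the monthly change in frequency falls with age while the per-division rate rises, which is the contrast the abstract draws.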
Affiliation(s)
- G R Stuart
- Centre for Environmental Health and the Department of Biology, University of Victoria, Victoria, British Columbia V8W 3N5, Canada.
|
43
|
Abstract
The nuclear genomes of vertebrates are mosaics of isochores, very long stretches (>>300 kb) of DNA that are homogeneous in base composition and are compositionally correlated with the coding sequences that they embed. Isochores can be partitioned in a small number of families that cover a range of GC levels (GC is the molar ratio of guanine+cytosine in DNA), which is narrow in cold-blooded vertebrates, but broad in warm-blooded vertebrates. This difference is essentially due to the fact that the GC-richest 10-15% of the genomes of the ancestors of mammals and birds underwent two independent compositional transitions characterized by strong increases in GC levels. The similarity of isochore patterns across mammalian orders, on the one hand, and across avian orders, on the other, indicates that these higher GC levels were then maintained, at least since the appearance of ancestors of warm-blooded vertebrates. After a brief review of our current knowledge on the organization of the vertebrate genome, evidence will be presented here in favor of the idea that the generation and maintenance of the GC-richest isochores in the genomes of warm-blooded vertebrates were due to natural selection.
Affiliation(s)
- G Bernardi
- Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Napoli, Italy.
|
44
|
Auf der Maur A, Belser T, Elgar G, Georgiev O, Schaffner W. Characterization of the transcription factor MTF-1 from the Japanese pufferfish (Fugu rubripes) reveals evolutionary conservation of heavy metal stress response. Biol Chem 1999; 380:175-85. [PMID: 10195425 DOI: 10.1515/bc.1999.026] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Indexed: 01/22/2023]
Abstract
The pufferfish Fugu rubripes was recently introduced as a new model organism for genomic studies, since it contains a full set of vertebrate genes but only 13% as much DNA as a mammal. Fugu genes tend to be smaller and densely spaced due to shortening of introns and intergenic spacers. We isolated the Fugu gene for the metal-responsive transcription factor MTF-1 (MTF1), a mediator of heavy metal regulation and oxidative stress response previously characterized in mammals. Most of the cDNA sequence was also determined. The 780 amino acid MTF-1 protein of Fugu is very similar to that of mouse and human, with 90% amino acid identity in the DNA binding zinc finger domain and 57% overall identity. Expression of the pufferfish cDNA in mammalian cells shows that Fugu MTF-1 has the same DNA binding specificity as its mammalian counterpart and also induces transcription in response to zinc and cadmium. The protein-coding part of the Fugu MTF-1 gene spans 6.4 kb and consists of 11 exons. The upstream region and first exon constitute a CpG island. The distance between the stop codon and the polyadenylation motifs is >2 kb, suggesting a very long 3' untranslated mRNA region, followed by another CpG island which may represent the promoter of the next gene downstream. Part of the MTF-1 genomic structure was also determined in the mouse, and some striking similarities were found: for example, the upstream adjacent gene in both species is INPP5P, encoding a phosphatase. The mouse MTF-1 promoter is also embedded in a CpG island, which however shares no sequence similarity with that of Fugu. The Fugu CpG island is shorter than that of the mouse and has no elevated [G+C] content; these and other data indicate that CpG islands of fish may represent a primordial stage of CpG island evolution.
Affiliation(s)
- A Auf der Maur
- Institut für Molekularbiologie der Universität Zürich, Switzerland
|
45
|
Abstract
Codon usage in mammals is mainly determined by the spatial arrangement of genomic G + C-content, i.e., the isochore structure. Ancestral G + C-content at third codon positions of 27 nuclear protein-coding genes of eutherian mammals was estimated by maximum-likelihood analysis on the basis of a nonhomogeneous DNA substitution model, accounting for variable base compositions among present-day sequences. Data consistently supported a human-like ancestral pattern, i.e., highly variable G + C-content among genes. The mouse genomic structure, with a narrower G + C-content distribution, would be a derived state. The circumstances of isochore evolution are discussed with respect to this result. A possible relationship between G + C-content homogenization in murid genomes and high mutation rate is proposed, consistent with the negative selection hypothesis for isochore maintenance in mammals.
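The quantity at the center of this abstract, G + C-content at third codon positions, can be illustrated for a present-day coding sequence with the short Python sketch below. It only computes the observed value for one sequence; it does not attempt the maximum-likelihood reconstruction of ancestral values described by the authors. The function name `gc3` and the handling of trailing bases are assumptions made for this example.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame CDS.

    Assumes the sequence starts at codon position 1; any trailing
    bases that do not complete a codon are ignored.
    """
    s = cds.upper()
    usable = len(s) - len(s) % 3        # length covered by complete codons
    third = s[2:usable:3]               # every third base, starting at index 2
    if not third:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third) / len(third)


if __name__ == "__main__":
    # Codons ATG, GCC, AAA have third positions G, C, A.
    print(gc3("ATGGCCAAA"))
```

Computing this value for each gene in a set and comparing the spread of values across genes is one simple way to see what "highly variable" versus "narrow" G + C-content distributions mean in practice.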
Affiliation(s)
- N Galtier
- Centre National de la Recherche Scientifique, Unité Mixte de Recherche 5558, Biométrie, Génétique et Biologie des Populations, Université Claude Bernard Lyon 1, 69622 Villeurbanne Cedex, France
|
46
|
McQueen HA, Siriaco G, Bird AP. Chicken microchromosomes are hyperacetylated, early replicating, and gene rich. Genome Res 1998; 8:621-30. [PMID: 9647637 PMCID: PMC310741 DOI: 10.1101/gr.8.6.621] [Citation(s) in RCA: 63] [Impact Index Per Article: 2.4] [Indexed: 11/24/2022]
Abstract
The chicken karyotype consists of 39 chromosomes of which 33 are classed as microchromosomes (MICs). MICs contain about one third of genomic DNA. The majority of mapped chicken genes are assigned to macrochromosomes (MACs), but a recent study indicated that CpG islands (CGIs), which are associated with most vertebrate genes, map predominantly to MICs. The present work establishes that chicken genes are concentrated on MICs by several criteria. Acetylated (lysine 5) histone H4, which is strongly correlated with the presence of genes, is highly enriched on MICs by immunocytochemistry. In addition, detailed analysis of chicken cosmids shows that CGI-like fragments are approximately six times denser on MICs than on MACs. Published mapping of randomly chosen genes by fluorescent in situ hybridization (FISH) also shows a significant excess of microchromosomal assignments. Finally, the finding that MICs replicate during the first half of S phase is also compatible with the suggestion that MICs represent gene-rich DNA. We use the cosmid data to predict that approximately 75% of chicken genes are located on microchromosomes. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AJ001643 and AJ001644.]
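The figures quoted in this abstract are mutually consistent, which can be checked with a small back-of-the-envelope calculation in Python. The sketch assumes that the density of CGI-like fragments is a proxy for gene density, that microchromosomes hold one third of the genomic DNA, and that the remaining two thirds is macrochromosomal. The abstract does not spell out the authors' actual calculation, so this is only a plausibility check, and the function name is hypothetical.

```python
def fraction_on_mics(density_ratio: float, mic_dna_fraction: float) -> float:
    """Estimate the share of genes on microchromosomes (MICs).

    density_ratio    -- gene density on MICs relative to MACs (assumed 6)
    mic_dna_fraction -- fraction of genomic DNA in MICs (assumed 1/3)
    """
    mic_genes = density_ratio * mic_dna_fraction   # relative gene count on MICs
    mac_genes = 1.0 - mic_dna_fraction             # relative gene count on MACs
    return mic_genes / (mic_genes + mac_genes)


if __name__ == "__main__":
    # Six-fold density on one third of the DNA gives about 0.75.
    print(round(fraction_on_mics(6, 1 / 3), 2))
```

With a six-fold density on one third of the DNA, MICs contribute 2 relative units of genes against 2/3 for MACs, which gives roughly three quarters and agrees with the approximately 75% stated in the abstract.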
Affiliation(s)
- H A McQueen
- Institute of Cell and Molecular Biology, University of Edinburgh, Darwin Building, King's Buildings, Edinburgh EH9 3JR, UK.
|
47
|
Mouse Hypoxia-Inducible Factor-1α Is Encoded by Two Different mRNA Isoforms: Expression From a Tissue-Specific and a Housekeeping-Type Promoter. Blood 1998. [DOI: 10.1182/blood.v91.9.3471.3471_3471_3480] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022] Open
Abstract
Hypoxic induction of erythropoietin (Epo) and other oxygen-dependent genes is mediated by the hypoxia-inducible factor-1 (HIF-1), a heterodimeric transactivator consisting of an α and a β subunit. We previously found that the mouse gene encoding HIF-1α harbors two alternative first exons (I.1 and I.2), giving rise to two different HIF-1α mRNA isoforms. Here, we show by RNase protection analysis that the exon I.1-derived mRNA isoform is differentially expressed in mouse tissues, being highest in kidney, tongue, stomach, and testis, but undetectable in liver, whereas the exon I.2 mRNA isoform is ubiquitously expressed. Sequence and methylation analysis showed that, in contrast to exon I.1, exon I.2 resides within a region showing typical features of a CpG island, known to be associated with the 5′ end of housekeeping genes. We identified a 232-bp minimal exon I.2 promoter that strongly induced reporter gene expression in mouse L929 fibroblasts and Hepa1 hepatoma cells. In contrast to L929 cells, the exon I.1 promoter was inactive in Hepa1 cells, and hypoxic exposure (1% O2) markedly reduced exon I.2 promoter activity in Hepa1 cells. Prolonged exposure of mice to hypoxia (7.5% O2 for up to 72 hours) also caused a decrease in liver HIF-1α mRNA, whereas aldolase mRNA levels increased. These findings might be related to the relatively low Epo levels in the adult liver.
|
48
|
Mouse Hypoxia-Inducible Factor-1α Is Encoded by Two Different mRNA Isoforms: Expression From a Tissue-Specific and a Housekeeping-Type Promoter. Blood 1998. [DOI: 10.1182/blood.v91.9.3471] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.2] [Indexed: 11/20/2022] Open
Abstract
Hypoxic induction of erythropoietin (Epo) and other oxygen-dependent genes is mediated by the hypoxia-inducible factor-1 (HIF-1), a heterodimeric transactivator consisting of an α and a β subunit. We previously found that the mouse gene encoding HIF-1α harbors two alternative first exons (I.1 and I.2), giving rise to two different HIF-1α mRNA isoforms. Here, we show by RNase protection analysis that the exon I.1-derived mRNA isoform is differentially expressed in mouse tissues, being highest in kidney, tongue, stomach, and testis, but undetectable in liver, whereas the exon I.2 mRNA isoform is ubiquitously expressed. Sequence and methylation analysis showed that, in contrast to exon I.1, exon I.2 resides within a region showing typical features of a CpG island, known to be associated with the 5′ end of housekeeping genes. We identified a 232-bp minimal exon I.2 promoter that strongly induced reporter gene expression in mouse L929 fibroblasts and Hepa1 hepatoma cells. In contrast to L929 cells, the exon I.1 promoter was inactive in Hepa1 cells, and hypoxic exposure (1% O2) markedly reduced exon I.2 promoter activity in Hepa1 cells. Prolonged exposure of mice to hypoxia (7.5% O2 for up to 72 hours) also caused a decrease in liver HIF-1α mRNA, whereas aldolase mRNA levels increased. These findings might be related to the relatively low Epo levels in the adult liver.
|
49
|
Abstract
This review is intended to provide an overview of techniques and a source of reagents for physical mapping of the mouse genome. It focuses on those applications, methods, or resources unique to the mouse and on the generation of comparative physical maps. The reference list is not comprehensive; rather, recent reviews on each topic and selected representative examples are given.
Affiliation(s)
- G E Herman
- Department of Pediatrics, Ohio State University, Columbus, USA
|
50
|
Abstract
To characterize the extent of DNA methylation and its possible biological roles in a wide variety of organisms, we have analyzed gene sequences extracted from the GenBank database. Sequences of both methylated and non-methylated species were used for comparative analysis. The local CpG dinucleotide distribution near the 5' ends of genes as well as the degree of overall CpG suppression/depletion in the entire gene region were examined in all complete gene sequences for each species. We show that the distribution patterns of CpG near the 5' region of genes differ among vertebrates, invertebrates, plants and bacteria. CpG island-like peaks in CpG O/E (observed/expected ratio) were observed not only in methylated species, but also in non-methylated species. In methylated non-vertebrates, overall CpG O/E values were lower, and peaks in the CpG profile of 5' regions were larger than in non-methylated species. We discuss the implications of such biases with respect to DNA methylation.
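The CpG O/E value used throughout this abstract can be sketched in Python as follows. The abstract does not give its exact formula, so this example assumes the commonly used definition, in which the observed CpG count is divided by the count expected from the C and G frequencies: (number of CpG × sequence length) / (number of C × number of G). The function name is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """CpG observed/expected ratio of a DNA sequence.

    Assumed definition: (count of CG * length) / (count of C * count of G).
    Returns 0.0 when the sequence has no C or no G, since no CpG
    is possible in that case.
    """
    s = seq.upper()
    c = s.count("C")
    g = s.count("G")
    if c == 0 or g == 0:
        return 0.0
    cpg = s.count("CG")                 # "CG" cannot overlap with itself
    return cpg * len(s) / (c * g)


if __name__ == "__main__":
    # Short toy sequences; real analyses use windows of a few hundred bases.
    for toy in ("CGCGCGCG", "CCGG", "CCAAGG"):
        print(toy, cpg_obs_exp(toy))
```

Values well below 1 indicate CpG suppression or depletion, while a local peak near the 5' end of a gene corresponds to the CpG island-like signal the abstract describes. Applying the function to sliding windows along a gene would give a profile of the kind discussed.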
Affiliation(s)
- T S Shimizu
- Laboratory for Bioinformatics, Keio University, Fujisawa, Kanagawa, Japan
|