1
Uemura K, Ohyama T. Distinctive physical properties of DNA shared by RNA polymerase II gene promoters and 5'-flanking regions of tRNA genes. J Biochem 2024; 175:395-404. [PMID: 38102732 PMCID: PMC11005993 DOI: 10.1093/jb/mvad111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 10/30/2023] [Accepted: 11/26/2023] [Indexed: 12/17/2023] Open
Abstract
Numerous noncoding (nc)RNAs have been identified. Similar to the transcription of protein-coding (mRNA) genes, long noncoding (lnc)RNA genes and most of micro (mi)RNA genes are transcribed by RNA polymerase II (Pol II). In the transcription of mRNA genes, core promoters play an indispensable role; they support the assembly of the preinitiation complex (PIC). However, the structural and/or physical properties of the core promoters of lncRNA and miRNA genes remain largely unexplored, in contrast with those of mRNA genes. Using the core promoters of human genes, we analyzed the repertoire and population ratios of residing core promoter elements (CPEs) and calculated the following five DNA physical properties (DPPs): duplex DNA free energy, base stacking energy, protein-induced deformability, rigidity and stabilizing energy of Z-DNA. Here, we show that their CPE and DPP profiles are similar to those of mRNA gene promoters. Importantly, the core promoters of these three classes of genes have two highly distinctive sites in their DPP profiles around the TSS and position -27. Similar characteristics in DPPs are also found in the 5'-flanking regions of tRNA genes, indicating their common essential roles in transcription initiation over the kingdom of RNA polymerases.
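The kind of position-wise DNA physical property (DPP) profile described in this abstract can be illustrated with a minimal sketch. It assumes a dinucleotide lookup table and a fixed sliding window; the table `TOY_SCORES`, the window size, and the function names are hypothetical placeholders chosen for illustration, not the parameters or procedure used by the authors.

```python
from itertools import product

# Hypothetical, illustrative dinucleotide scores (NOT published parameters).
# A real analysis would substitute a measured table, e.g. duplex free energy.
TOY_SCORES = {
    a + b: (1.0 if a in "GC" else 0.0) + (1.0 if b in "GC" else 0.0)
    for a, b in product("ACGT", repeat=2)
}


def dinucleotide_values(seq, table):
    """Map every overlapping dinucleotide step of seq to its table value."""
    seq = seq.upper()
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def sliding_profile(seq, table, window=3):
    """Average the dinucleotide values over a sliding window of steps."""
    values = dinucleotide_values(seq, table)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of steps")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def most_distinctive_position(profile):
    """Return the window index that deviates most from the profile mean."""
    mean = sum(profile) / len(profile)
    return max(range(len(profile)), key=lambda i: abs(profile[i] - mean))


if __name__ == "__main__":
    profile = sliding_profile("GCGCTATAGCGC", TOY_SCORES, window=3)
    print(profile)
    print(most_distinctive_position(profile))
```

In practice such profiles would be averaged over many promoters aligned at the TSS, so that shared sites (for example around the TSS or position -27) stand out from the background.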
Affiliation(s)
- Kohei Uemura
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- Takashi Ohyama
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
2
Uemura K, Ohyama T. Physical Peculiarity of Two Sites in Human Promoters: Universality and Diverse Usage in Gene Function. Int J Mol Sci 2024; 25:1487. [PMID: 38338773 PMCID: PMC10855393 DOI: 10.3390/ijms25031487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 01/15/2024] [Accepted: 01/18/2024] [Indexed: 02/12/2024] Open
Abstract
Since the discovery of physical peculiarities around transcription start sites (TSSs) and a site corresponding to the TATA box, research has revealed only the average features of these sites. Unsettled enigmas include the individual genes with these features and whether they relate to gene function. Herein, using 10 physical properties of DNA, including duplex DNA free energy, base stacking energy, protein-induced deformability, and stabilizing energy of Z-DNA, we clarified for the first time that approximately 97% of the promoters of 21,056 human protein-coding genes have distinctive physical properties around the TSS and/or position -27; of these, nearly 65% exhibited such properties at both sites. Furthermore, about 55% of the 21,056 genes had a minimum value of regional duplex DNA free energy within TSS-centered ±300 bp regions. Notably, distinctive physical properties within the promoters and free energies of the surrounding regions separated human protein-coding genes into five groups; each contained specific gene ontology (GO) terms. The group represented by immune response genes differed distinctly from the other four regarding the parameter of the free energies of the surrounding regions. A vital suggestion from this study is that physical-feature-based analyses of genomes may reveal new aspects of the organization and regulation of genes.
Affiliation(s)
- Kohei Uemura
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- Takashi Ohyama
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
3
iEnhancer-RD: Identification of enhancers and their strength using RKPK features and deep neural networks. Anal Biochem 2021; 630:114318. [PMID: 34364858 DOI: 10.1016/j.ab.2021.114318] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/20/2021] [Revised: 07/02/2021] [Accepted: 07/27/2021] [Indexed: 11/20/2022]
Abstract
Enhancers are regulatory elements involved in gene expression. They are parts of DNA that can enhance the transcription rate of a gene. However, the identification of enhancers by biological experimental methods is time-consuming and expensive. Therefore, there is an urgent need for more efficient methods to identify them. In this study, we propose a new feature extraction method, RKPK, which combines three feature methods and uses the recursive feature elimination algorithm for feature selection, and apply a deep neural network as the classifier to construct the iEnhancer-RD calculation method for enhancer identification. It is a two-layer classification architecture in which the first layer (layer I) identifies enhancers from a set of DNA sequences, and the second layer (layer II) divides the identified enhancers into two subgroups, namely strong and weak enhancers. An independent dataset test indicates that the proposed method is significantly better than most existing methods, and attains accuracies of 78.8% and 70.5% in the two layers, respectively. Our iEnhancer-RD architecture is implemented in Python and is available at https://github.com/YangHuan639/iEnhancer-RD.
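The two-layer decision flow described in this abstract can be sketched as follows. This is only an outline of the control flow: the two classifiers here are hypothetical GC-content thresholds standing in for the trained deep neural networks, and none of the names or cutoffs come from the iEnhancer-RD code.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (toy feature, for illustration only)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def toy_layer1(seq):
    """Hypothetical layer I: decide enhancer vs. non-enhancer."""
    return gc_fraction(seq) >= 0.4


def toy_layer2(seq):
    """Hypothetical layer II: decide strong vs. weak among enhancers."""
    return gc_fraction(seq) >= 0.7


def two_layer_predict(seq, is_enhancer, is_strong):
    """Apply layer I first; only predicted enhancers reach layer II."""
    if not is_enhancer(seq):
        return "non-enhancer"
    return "strong enhancer" if is_strong(seq) else "weak enhancer"


if __name__ == "__main__":
    for s in ("ATATATAT", "GCGCATAT", "GCGCGCAT"):
        print(s, two_layer_predict(s, toy_layer1, toy_layer2))
```

Passing the two classifiers as arguments keeps the layering explicit: any trained model exposing a yes/no decision could be substituted for either toy function.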
4
iterb-PPse: Identification of transcriptional terminators in bacterial by incorporating nucleotide properties into PseKNC. PLoS One 2020; 15:e0228479. [PMID: 32413030 PMCID: PMC7228126 DOI: 10.1371/journal.pone.0228479] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/15/2020] [Accepted: 05/01/2020] [Indexed: 11/19/2022] Open
Abstract
A terminator is a DNA sequence that gives the RNA polymerase the transcriptional termination signal. Identifying terminators correctly can optimize genome annotation; more importantly, it has considerable application value in disease diagnosis and therapies. However, accurate prediction methods are deficient and urgently needed. Therefore, we proposed a prediction method, "iterb-PPse", for terminators by incorporating 47 nucleotide properties into PseKNC-Ⅰ and PseKNC-Ⅱ and utilizing Extreme Gradient Boosting to predict terminators based on Escherichia coli and Bacillus subtilis. Combining with the preceding methods, we employed three new feature extraction methods, K-pwm, Base-content and Nucleotidepro, to formulate raw samples. The two-step method was applied to select features. When identifying terminators based on optimized features, we compared five single models as well as 16 ensemble models. As a result, the accuracy of our method on the benchmark dataset reached 99.88%, higher than that of the existing state-of-the-art predictor iTerm-PseKNC in a 100-times five-fold cross-validation test. Its prediction accuracy for two independent datasets reached 94.24% and 99.45%, respectively. For the convenience of users, we developed software on the basis of "iterb-PPse" with the same name. The open software and source code of "iterb-PPse" are available at https://github.com/Sarahyouzi/iterb-PPse.
5
Miura O, Ogake T, Yoneyama H, Kikuchi Y, Ohyama T. A strong structural correlation between short inverted repeat sequences and the polyadenylation signal in yeast and nucleosome exclusion by these inverted repeats. Curr Genet 2018; 65:575-590. [PMID: 30498953 PMCID: PMC6420913 DOI: 10.1007/s00294-018-0907-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/18/2018] [Revised: 11/14/2018] [Accepted: 11/15/2018] [Indexed: 11/22/2022]
Abstract
DNA sequences that read the same from 5′ to 3′ in either strand are called inverted repeat sequences or simply IRs. They are found throughout a wide variety of genomes, from prokaryotes to eukaryotes. Despite extensive research, their in vivo functions, if any, remain unclear. Using Saccharomyces cerevisiae, we performed genome-wide analyses for the distribution, occurrence frequency, sequence characteristics and relevance to chromatin structure, for the IRs that reportedly have a cruciform-forming potential. Here, we provide the first comprehensive map of these IRs in the S. cerevisiae genome. The statistically significant enrichment of the IRs was found in the close vicinity of the DNA positions corresponding to polyadenylation [poly(A)] sites and ~30 to ~60 bp downstream of start codon-coding sites (referred to as ‘start codons’). In the former, ApT- or TpA-rich IRs and A-tract- or T-tract-rich IRs are enriched, while in the latter, different IRs are enriched. Furthermore, we found a strong structural correlation between the former IRs and the poly(A) signal. In the chromatin formed on the gene end regions, the majority of the IRs causes low nucleosome occupancy. The IRs in the region ~30 to ~60 bp downstream of start codons are located in the +1 nucleosomes. In contrast, fewer IRs are present in the adjacent region downstream of start codons. The current study suggests that the IRs play similar roles in Escherichia coli and S. cerevisiae to regulate or complete transcription at the RNA level.
Affiliation(s)
- Osamu Miura
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Toshihiro Ogake
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Hiroki Yoneyama
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Yo Kikuchi
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Takashi Ohyama
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
6
Liu B. iEnhancer-PsedeKNC: Identification of enhancers and their subgroups based on Pseudo degenerate kmer nucleotide composition. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.12.138] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 12/20/2022]
7
Liu B, Fang L, Long R, Lan X, Chou KC. iEnhancer-2L: a two-layer predictor for identifying enhancers and their strength by pseudo k-tuple nucleotide composition. Bioinformatics 2015; 32:362-9. [PMID: 26476782 DOI: 10.1093/bioinformatics/btv604] [Citation(s) in RCA: 274] [Impact Index Per Article: 27.4] [Received: 09/25/2015] [Accepted: 10/12/2015] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Enhancers are short regulatory DNA elements. They can be bound by proteins (activators) to activate transcription of a gene, and hence play a critical role in promoting gene transcription in eukaryotes. With the avalanche of DNA sequences generated in the post-genomic age, it is a challenging task to develop computational methods for timely identifying enhancers from extremely complicated DNA sequences. Although some efforts have been made in this regard, they were limited to identifying only whether a query DNA element is an enhancer or not. According to the distinct levels of biological activities and regulatory effects on target genes, however, enhancers should be further classified into strong and weak ones in strength. RESULTS In view of this, a two-layer predictor called 'iEnhancer-2L' was proposed by formulating DNA elements with the 'pseudo k-tuple nucleotide composition', into which the six DNA local parameters were incorporated. To the best of our knowledge, it is the first computational predictor ever established for identifying not only enhancers, but also their strength. Rigorous cross-validation tests have indicated that iEnhancer-2L holds very high potential to become a useful tool for genome analysis. AVAILABILITY AND IMPLEMENTATION For the convenience of most experimental scientists, a web server for the two-layer predictor was established at http://bioinformatics.hitsz.edu.cn/iEnhancer-2L/, by which users can easily get their desired results without the need to go through the mathematical details. CONTACT bliu@gordonlifescience.org, bliu@insun.hit.edu.cn, xlan@stanford.edu, kcchou@gordonlifescience.org SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bin Liu
- School of Computer Science and Technology, Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China, Computational Biology, Gordon Life Science Institute, Belmont, MA 02478, USA
- Ren Long
- School of Computer Science and Technology
- Xun Lan
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Kuo-Chen Chou
- Computational Biology, Gordon Life Science Institute, Belmont, MA 02478, USA, Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah 21589, Saudi Arabia
8
MicroRNA Promoter Identification in Arabidopsis Using Multiple Histone Markers. BioMed Research International 2015; 2015:861402. [PMID: 26425556 PMCID: PMC4573627 DOI: 10.1155/2015/861402] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 12/11/2014] [Accepted: 03/12/2015] [Indexed: 11/18/2022]
Abstract
A microRNA is a small noncoding RNA molecule, which functions in RNA silencing and posttranscriptional regulation of gene expression. To understand the mechanism of the activation of microRNA genes, the location of promoter regions driving their expression is required to be annotated precisely. Only a fraction of microRNA genes have confirmed transcription start sites (TSSs), which hinders our understanding of the transcription factor binding events. With the development of the next generation sequencing technology, the chromatin states can be inferred precisely by virtue of a combination of specific histone modifications. Using the genome-wide profiles of nine histone markers including H3K4me2, H3K4me3, H3K9Ac, H3K9me2, H3K18Ac, H3K27me1, H3K27me3, H3K36me2, and H3K36me3, we developed a computational strategy to identify the promoter regions of most microRNA genes in Arabidopsis, based upon the assumption that the distribution of histone markers around the TSSs of microRNA genes is similar to the TSSs of protein coding genes. Among 298 miRNA genes, our model identified 42 independent miRNA TSSs and 132 miRNA TSSs, which are located in the promoters of upstream genes. The identification of promoters will provide better understanding of microRNA regulation and can play an important role in the study of diseases at genetic level.
9
Regions of Unusually High Flexibility Occur Frequently in Human Genomic DNA. Biosci Biotechnol Biochem 2014; 77:612-7. [DOI: 10.1271/bbb.120850] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
10
Chen W, Lei TY, Jin DC, Lin H, Chou KC. PseKNC: a flexible web server for generating pseudo K-tuple nucleotide composition. Anal Biochem 2014; 456:53-60. [PMID: 24732113 DOI: 10.1016/j.ab.2014.04.001] [Citation(s) in RCA: 334] [Impact Index Per Article: 30.4] [Received: 12/29/2013] [Revised: 03/20/2014] [Accepted: 04/01/2014] [Indexed: 10/25/2022]
Abstract
The pseudo oligonucleotide composition, or pseudo K-tuple nucleotide composition (PseKNC), can be used to represent a DNA or RNA sequence with a discrete model or vector yet still keep considerable sequence order information, particularly the global or long-range sequence order information, via the physicochemical properties of its constituent oligonucleotides. Therefore, the PseKNC approach may hold very high potential for enhancing the power in dealing with many problems in computational genomics and genome sequence analysis. However, dealing with different DNA or RNA problems may need different kinds of PseKNC. Here, we present a flexible and user-friendly web server for PseKNC (at http://lin.uestc.edu.cn/pseknc/default.aspx) by which users can easily generate many different modes of PseKNC according to their need by selecting various parameters and physicochemical properties. Furthermore, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the current web server to generate their desired PseKNC without the need to follow the complicated mathematical equations, which are presented in this article just for the integrity of PseKNC formulation and its development. It is anticipated that the PseKNC web server will become a very useful tool in computational genomics and genome sequence analysis.
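As a rough illustration of how a PseKNC-style vector combines k-tuple frequencies with sequence-order terms, the sketch below follows one commonly cited type-I formulation: normalized k-mer frequencies followed by `lam` correlation factors computed from dinucleotide physicochemical values, weighted by `w`. The property table `TOY_PROPS` is hypothetical and the code is not the web server's implementation; the exact equations should be checked against the article itself.

```python
from itertools import product

# Hypothetical two-property table per dinucleotide (NOT measured values).
TOY_PROPS = {
    a + b: [
        float(a in "GC") + float(b in "GC"),
        float(a in "AG") - float(b in "AG"),
    ]
    for a, b in product("ACGT", repeat=2)
}


def pseknc(seq, props, k=2, lam=2, w=0.5):
    """Return a (4**k + lam)-dimensional pseudo k-tuple composition vector."""
    seq = seq.upper()
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if lam < 1 or lam >= len(dinucs):
        raise ValueError("lam must be at least 1 and below the dinucleotide count")

    # Part 1: occurrence frequency of every k-tuple.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    n_k = len(seq) - k + 1
    counts = {km: 0 for km in kmers}
    for i in range(n_k):
        counts[seq[i:i + k]] += 1
    freqs = [counts[km] / n_k for km in kmers]

    # Part 2: correlation factors between dinucleotides j steps apart.
    def coupling(d1, d2):
        p1, p2 = props[d1], props[d2]
        return sum((x - y) ** 2 for x, y in zip(p1, p2)) / len(p1)

    thetas = []
    for j in range(1, lam + 1):
        pairs = len(dinucs) - j
        total = sum(coupling(dinucs[i], dinucs[i + j]) for i in range(pairs))
        thetas.append(total / pairs)

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    vec = pseknc("ACGTACGTAC", TOY_PROPS, k=2, lam=2, w=0.5)
    print(len(vec), sum(vec))
```

The first 4^k components capture local composition, while the last `lam` components carry the longer-range order information mentioned in the abstract; `w` controls the balance between the two.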
Affiliation(s)
- Wei Chen
- School of Sciences, and Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000, China; Gordon Life Science Institute, Belmont, MA 02478, USA
- Tian-Yu Lei
- School of Sciences, and Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000, China
- Dian-Chuan Jin
- School of Sciences, and Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000, China
- Hao Lin
- Gordon Life Science Institute, Belmont, MA 02478, USA; Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Kuo-Chen Chou
- School of Sciences, and Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000, China; Gordon Life Science Institute, Belmont, MA 02478, USA; Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah 21589, Saudi Arabia
11
Kumari S, Ware D. Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots. PLoS One 2013; 8:e79011. [PMID: 24205361 PMCID: PMC3812177 DOI: 10.1371/journal.pone.0079011] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/11/2013] [Accepted: 09/18/2013] [Indexed: 01/22/2023] Open
Abstract
Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the computational prediction of CPEs across eight plant genomes to help better understand the transcription initiation complex assembly. The distribution of thirteen known CPEs across four monocots (Brachypodium distachyon, Oryza sativa ssp. japonica, Sorghum bicolor, Zea mays) and four dicots (Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera, Glycine max) reveals the structural organization of the core promoter in relation to the TATA-box as well as with respect to other CPEs. The distribution of known CPE motifs with respect to transcription start site (TSS) exhibited positional conservation within monocots and dicots with slight differences across all eight genomes. Further, a more refined subset of annotated genes based on orthologs of the model monocot (O. sativa ssp. japonica) and dicot (A. thaliana) genomes supported the positional distribution of these thirteen known CPEs. DNA free energy profiles provided evidence that the structural properties of promoter regions are distinctly different from that of the non-regulatory genome sequence. It also showed that monocot core promoters have lower DNA free energy than dicot core promoters. The comparison of monocot and dicot promoter sequences highlights both the similarities and differences in the core promoter architecture irrespective of the species-specific nucleotide bias. This study will be useful for future work related to genome annotation projects and can inspire research efforts aimed to better understand regulatory mechanisms of transcription.
Affiliation(s)
- Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- United States Department of Agriculture-Agriculture Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, New York, United States of America
12
Kimura H, Shimooka Y, Nishikawa JI, Miura O, Sugiyama S, Yamada S, Ohyama T. The genome folding mechanism in yeast. J Biochem 2013; 154:137-47. [DOI: 10.1093/jb/mvt033] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022]
13
Yokoyama KD, Pollock DD. SP transcription factor paralogs and DNA-binding sites coevolve and adaptively converge in mammals and birds. Genome Biol Evol 2013; 4:1102-17. [PMID: 23019068 PMCID: PMC3514965 DOI: 10.1093/gbe/evs085] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022] Open
Abstract
Functional modification of regulatory proteins can affect hundreds of genes throughout the genome, and is therefore thought to be almost universally deleterious. This belief, however, has recently been challenged. A potential example comes from transcription factor SP1, for which statistical evidence indicates that motif preferences were altered in eutherian mammals. Here, we set out to discover possible structural and theoretical explanations, evaluate the role of selection in SP1 evolution, and discover effects on coregulatory proteins. We show that SP1 motif preferences were convergently altered in birds as well as mammals, inducing coevolutionary changes in over 800 regulatory regions. Structural and phylogenic evidence implicates a single causative amino acid replacement at the same SP1 position along both lineages. Furthermore, paralogs SP3 and SP4, which coregulate SP1 target genes through competitive binding to the same sites, have accumulated convergent replacements at the homologous position multiple times during eutherian and bird evolution, presumably to preserve competitive binding. To determine plausibility, we developed and implemented a simple model of transcription factor and binding site coevolution. This model predicts that, in contrast to prevailing beliefs, even small selective benefits per locus can drive concurrent fixation of transcription factor and binding site mutants under a broad range of conditions. Novel binding sites tend to arise de novo, rather than by mutation from ancestral sites, a prediction substantiated by SP1-binding site alignments. Thus, multiple lines of evidence indicate that selection has driven convergent evolution of transcription factors along with their binding sites and coregulatory proteins.
Affiliation(s)
- Ken Daigoro Yokoyama
- Department of Biochemistry and Molecular Genetics, University of Colorado, Denver School of Medicine, USA
14
Shimooka Y, Nishikawa JI, Ohyama T. Most methylation-susceptible DNA sequences in human embryonic stem cells undergo a change in conformation or flexibility upon methylation. Biochemistry 2013; 52:1344-53. [PMID: 23356538 DOI: 10.1021/bi301319y] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/24/2022]
Abstract
DNA methylation in eukaryotes occurs on the cytosine bases in CG, CHG, and CHH (where H indicates non-G nucleotides) contexts and provides an important epigenetic mark in various biological processes. However, the structural and physical properties of methylated DNA are poorly understood. Using nondenaturing polyacrylamide gel electrophoresis, we performed a systematic study of the influence of DNA methylation on the conformation and physical properties of DNA for all CG, CHG, and CHH contexts. In the CG context, methylated multimers of the CG/CG-containing unit fragment migrated in gels slightly faster than their unmethylated counterparts. In the CHG context, both homo- and hemimethylation caused retarded migration of multimers of the CAG/CTG-containing fragment. In the CHH context, methylation caused or enhanced retarded migration of the multimers of CAA/TTG-, CAT/ATG-, CAC/GTG-, CTA/TAG-, or CTT/AAG-containing fragments. These results suggest that methylation increases DNA rigidity in the CG context and introduces distortions into several CHG and CHH sequences. More interestingly, we found that nearly all of the methylation repertoires in the CHG context and 98% of those in the CHH context in human embryonic stem cells were species that undergo conformational changes upon methylation. Similarly, most of the methylation repertoires in the Arabidopsis CHG and CHH contexts were sequences with methylation-induced distortion. We hypothesize that the methylation-induced properties or conformational changes in DNA may facilitate nucleosome formation, which provides the essential mechanism for alterations of chromatin density.
Affiliation(s)
- Yasutoshi Shimooka
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
15
Meysman P, Marchal K, Engelen K. DNA structural properties in the classification of genomic transcription regulation elements. Bioinform Biol Insights 2012; 6:155-68. [PMID: 22837642 PMCID: PMC3399529 DOI: 10.4137/bbi.s9426] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022] Open
Abstract
It has been long known that DNA molecules encode information at various levels. The most basic level comprises the base sequence itself and is primarily important for the encoding of proteins and direct base recognition by DNA-binding proteins. A more elusive level consists of the local structural properties of the DNA molecule wherein the DNA sequence only plays an indirect supportive role. These properties are nevertheless an important factor in a large number of biomolecular processes and can be considered as informative signals for the presence of a variety of genomic features. Several recent studies have unequivocally shown the benefit of relying on such DNA properties for modeling and predicting genomic features as diverse as transcription start sites, transcription factor binding sites, or nucleosome occupancy. This review is meant to provide an overview of the key aspects of these DNA conformational and physicochemical properties. To illustrate their potential added value compared to relying solely on the nucleotide sequence in genomics studies, we discuss their application in research on transcription regulation mechanisms as representative cases.
Affiliation(s)
- Pieter Meysman
- Department of Molecular and Microbial Systems, KULeuven, Kasteelpark Arenberg 20, 3001 Leuven, Belgium
16
Wu Q, Zhou W, Wang J, Yan H. Correlation between the flexibility and periodic dinucleotide patterns in yeast nucleosomal DNA sequences. J Theor Biol 2011; 284:92-8. [DOI: 10.1016/j.jtbi.2011.06.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/09/2010] [Revised: 06/20/2011] [Accepted: 06/21/2011] [Indexed: 12/25/2022]
17
Zeng J, Zhao XY, Cao XQ, Yan H. SCS: signal, context, and structure features for genome-wide human promoter recognition. IEEE/ACM Trans Comput Biol Bioinform 2010; 7:550-562. [PMID: 20671324 DOI: 10.1109/tcbb.2008.95] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 05/29/2023]
Abstract
This paper integrates the signal, context, and structure features for genome-wide human promoter recognition, which is important in improving genome annotation and analyzing transcriptional regulation without experimental supports of ESTs, cDNAs, or mRNAs. First, CpG islands are salient biological signals associated with approximately 50 percent of mammalian promoters. Second, the genomic context of promoters may have biological significance, which is based on n-mers (sequences of n bases long) and their statistics estimated from training samples. Third, sequence-dependent DNA flexibility originates from DNA 3D structures and plays an important role in guiding transcription factors to the target site in promoters. Employing decision trees, we combine above signal, context, and structure features to build a hierarchical promoter recognition system called SCS. Experimental results on controlled data sets and the entire human genome demonstrate that SCS is significantly superior in terms of sensitivity and specificity as compared to other state-of-the-art methods. The SCS promoter recognition system is available online as supplemental materials for academic use and can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TCBB.2008.95.
Affiliation(s)
- Jia Zeng
- School of Computer Science and Technology, Soochow University, Suzhou, China.
|
18
|
Effects of non-B DNA sequences on transgene expression. J Biosci Bioeng 2009; 108:20-3. [PMID: 19577186 DOI: 10.1016/j.jbiosc.2009.02.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 01/14/2009] [Revised: 02/13/2009] [Accepted: 02/16/2009] [Indexed: 11/21/2022]
Abstract
DNA conformation may be an important factor affecting gene transcription. In this study, we examined how DNA sequences with unusual conformations affect transgene expression. A(30) and (CG)(15) sequences that can adopt the B' and Z conformations, respectively, were introduced into a beta-actin promoter. Luciferase plasmids containing the manipulated promoter were transfected into NIH3T3 cells by electroporation and were delivered into mouse livers with a hydrodynamics-based injection. Expression from plasmid with the (CG)(15) sequence was multiple times higher than expression from control plasmid DNA. The A(30) sequence also tended to enhance expression. These results suggest that non-B DNA sequences could improve transgene expression in cells.
|
19
|
Zeng J, Cao XQ, Zhao H, Yan H. Finding human promoter groups based on DNA physical properties. Phys Rev E Stat Nonlin Soft Matter Phys 2009; 80:041917. [PMID: 19905352 DOI: 10.1103/physreve.80.041917] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 06/07/2009] [Revised: 08/24/2009] [Indexed: 05/28/2023]
Abstract
DNA rigidity is an important physical property originating from the DNA three-dimensional structure. Although the general DNA rigidity patterns in human promoters have been investigated, their distinct roles in transcription are largely unknown. In this paper, we discover four highly distinct human promoter groups based on similarity of their rigidity profiles. First, we find that all promoter groups conserve relatively rigid DNAs at the canonical TATA box [a consensus TATA(A/T)A(A/T) sequence] position, which are important physical signals in binding transcription factors. Second, we find that the genes activated by each group of promoters share significant biological functions based on their gene ontology annotations. Finally, we find that these human promoter groups correlate with the tissue-specific gene expression.
Affiliation(s)
- Jia Zeng
- Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong.
|
20
|
Zeng J, Zhu S, Yan H. Towards accurate human promoter recognition: a review of currently used sequence features and classification methods. Brief Bioinform 2009; 10:498-508. [PMID: 19531545 DOI: 10.1093/bib/bbp027] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022] Open
Abstract
This review describes important advances that have been made during the past decade for genome-wide human promoter recognition. Interest in promoter recognition algorithms on a genome-wide scale is worldwide and touches on a number of practical systems that are important in analysis of gene regulation and in genome annotation without experimental support of ESTs, cDNAs or mRNAs. The main focus of this review is on feature extraction and model selection for accurate human promoter recognition, with descriptions of what they are, what has been accomplished, and what remains to be done.
Affiliation(s)
- Jia Zeng
- Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong.
|
21
|
Abstract
This paper discovers consensus physical signals around eukaryotic splice sites, transcription start sites, and replication origin start and end sites on a genome-wide scale based on their DNA flexibility profiles calculated by three different flexibility models. These salient physical signals are localized highly rigid and flexible DNAs, which may play important roles in protein-DNA recognition by the sliding search mechanism. The found physical signals lead us to a detailed hypothetical view of the search process in which a DNA-binding protein first finds a genomic region close to the target site from an arbitrary starting location by three-dimensional (3D) hopping and intersegment transfer mechanisms for long distances, and subsequently uses the one-dimensional (1D) sliding mechanism facilitated by the localized highly rigid DNAs to accurately locate the target flexible binding site within 30 bp (base pair) short distances. Guided by these physical signals, DNA-binding proteins rapidly search the entire genome to recognize a specific target site from the 3D to 1D pathway. Our findings also show that current promoter prediction programs (PPPs) based on DNA physical properties may suffer from lots of false positives because other functional sites such as splice sites and replication origins have similar physical signals as promoters do.
|
22
|
Abstract
Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex.
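As a rough illustration of what a trinucleotide-model flexibility profile can look like in code, the following minimal sketch scores every overlapping trinucleotide and smooths the scores with a sliding window. The function name `flexibility_profile`, the `window` and `default` parameters, and the `TOY_PARAMS` table are all hypothetical; the toy values are placeholders chosen for easy arithmetic and are not the published flexibility parameters used in the paper.

```python
from typing import Dict, List


def flexibility_profile(seq: str, params: Dict[str, float],
                        window: int = 3, default: float = 0.0) -> List[float]:
    """Sliding-window mean of per-trinucleotide scores along a sequence.

    `params` maps trinucleotides to a score; trinucleotides missing from
    the table (for example those containing N) receive `default`.
    """
    seq = seq.upper()
    # One score per overlapping trinucleotide.
    scores = [params.get(seq[i:i + 3], default) for i in range(len(seq) - 2)]
    if window < 1 or len(scores) < window:
        return []
    # Running sum keeps the smoothing linear in sequence length.
    running = sum(scores[:window])
    profile = [running / window]
    for i in range(window, len(scores)):
        running += scores[i] - scores[i - window]
        profile.append(running / window)
    return profile


# Hypothetical placeholder table, NOT real flexibility parameters.
TOY_PARAMS = {"AAA": 1.0, "AAT": 2.0, "ATA": 3.0, "TAT": 4.0, "TAA": 5.0}

profile = flexibility_profile("TATAAAT", TOY_PARAMS, window=2)
```

Averaging such profiles over many aligned sites (for instance, all replication origins) would then give the kind of average profile the abstract refers to; a real analysis would substitute a published parameter table for `TOY_PARAMS`.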
Affiliation(s)
- Xiao-Qin Cao
- School of Creative Media, City University of Hong Kong, Tat Chee Avenue 83, Hong Kong
|
23
|
Kurz M. Compatible solute influence on nucleic acids: many questions but few answers. Saline Systems 2008; 4:6. [PMID: 18522725 PMCID: PMC2430576 DOI: 10.1186/1746-1448-4-6] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 01/21/2008] [Accepted: 06/03/2008] [Indexed: 12/21/2022]
Abstract
Compatible solutes are small organic osmolytes including but not limited to sugars, polyols, amino acids, and their derivatives. They are compatible with cell metabolism even at molar concentrations. A variety of organisms synthesize or take up compatible solutes for adaptation to extreme environments. In addition to their protective action on whole cells, compatible solutes display significant effects on biomolecules in vitro. These include stabilization of native protein and nucleic acid structures. They are used as additives in polymerase chain reactions to increase product yield and specificity, but also in other nucleic acid and protein applications. Interactions of compatible solutes with nucleic acids and protein-nucleic acid complexes are much less understood than the corresponding interactions of compatible solutes with proteins. Although we may begin to understand solute/nucleic acid interactions there are only few answers to the many questions we have. I summarize here the current state of knowledge and discuss possible molecular mechanisms and thermodynamics.
Affiliation(s)
- Matthias Kurz
- Institut für Mikrobiologie & Biotechnologie, Rheinische Friedrich Wilhelms-Universität Bonn, Bonn, Germany.
|
24
|
Abstract
As the number of sequenced genomes increases, the ability to deduce genome function becomes increasingly salient. For many genome sequences, the only annotation that will be available for the foreseeable future will be based on computational predictions and comparisons with functional elements in related species. Here we discuss computational approaches for automated genome-wide annotation of functional elements in mammalian genomes. These include methods for ab initio and comparative gene-structure predictions. Gene features such as intron splice sites, 3' untranslated regions, promoters, and cis-regulatory elements are discussed, as is a novel method for predicting DNaseI hypersensitive sites. Recent methodologies for predicting noncoding RNA genes, including microRNA genes and their targets, are also reviewed.
Affiliation(s)
- Steven J M Jones
- Genome Sciences Centre, British Columbia Cancer Research Center, Vancouver, British Columbia, V5Z 1L3, Canada.
|
25
|
Cao XQ, Zeng J, Yan H. Structural property of regulatory elements in human promoters. Phys Rev E Stat Nonlin Soft Matter Phys 2008; 77:041908. [PMID: 18517657 DOI: 10.1103/physreve.77.041908] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 11/19/2007] [Indexed: 05/26/2023]
Abstract
The capacity of transcription factors to activate gene expression is encoded in the promoter sequences, which are composed of short regulatory motifs that function as transcription factor binding sites (TFBSs) for specific proteins. To the best of our knowledge, the structural property of TFBSs that controls transcription is still poorly understood. Rigidity is one of the important structural properties of DNA, and plays an important role in guiding DNA-binding proteins to the target sites efficiently. After analyzing the rigidity of 2897 TFBSs in 1871 human promoters, we show that TFBSs are generally more flexible than other genomic regions such as exons, introns, 3' untranslated regions, and TFBS-poor promoter regions. Furthermore, we find that the density of TFBSs is consistent with the average rigidity profile of human promoters upstream of the transcription start site, which implies that TFBSs directly influence the promoter structure. We also examine the local rigid regions probably caused by specific TFBSs such as the DNA sequence TATA(A/T)A(A/T) box, which may inhibit nucleosomes and thereby facilitate the access of transcription factors bound nearby. Our results suggest that the structural property of TFBSs accounts for the promoter structure as well as promoter activity.
Affiliation(s)
- Xiao-Qin Cao
- Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue 83, Hong Kong
|
26
|
Abeel T, Saeys Y, Bonnet E, Rouzé P, Van de Peer Y. Generic eukaryotic core promoter prediction using structural features of DNA. Genome Res 2008; 18:310-23. [PMID: 18096745 PMCID: PMC2203629 DOI: 10.1101/gr.6991408] [Citation(s) in RCA: 133] [Impact Index Per Article: 7.8] [Received: 08/03/2007] [Accepted: 11/14/2007] [Indexed: 11/24/2022]
Abstract
Despite many recent efforts, in silico identification of promoter regions is still in its infancy. However, the accurate identification and delineation of promoter regions is important for several reasons, such as improving genome annotation and devising experiments to study and understand transcriptional regulation. Current methods to identify the core region of promoters require large amounts of high-quality training data and often behave like black box models that output predictions that are difficult to interpret. Here, we present a novel approach for predicting promoters in whole-genome sequences by using large-scale structural properties of DNA. Our technique requires no training, is applicable to many eukaryotic genomes, and performs extremely well in comparison with the best available promoter prediction programs. Moreover, it is fast, simple in design, and has no size constraints, and the results are easily interpretable. We compared our approach with 14 current state-of-the-art implementations using human gene and transcription start site data and analyzed the ENCODE region in more detail. We also validated our method on 12 additional eukaryotic genomes, including vertebrates, invertebrates, plants, fungi, and protists.
Affiliation(s)
- Thomas Abeel
- Department of Plant Systems Biology, Flanders Institute for Biotechnology (VIB), 9052 Gent, Belgium
- Department of Molecular Genetics, Ghent University, 9052 Gent, Belgium
- Yvan Saeys
- Department of Plant Systems Biology, Flanders Institute for Biotechnology (VIB), 9052 Gent, Belgium
- Department of Molecular Genetics, Ghent University, 9052 Gent, Belgium
- Eric Bonnet
- Department of Plant Systems Biology, Flanders Institute for Biotechnology (VIB), 9052 Gent, Belgium
- Department of Molecular Genetics, Ghent University, 9052 Gent, Belgium
- Pierre Rouzé
- Department of Plant Systems Biology, Flanders Institute for Biotechnology (VIB), 9052 Gent, Belgium
- Department of Molecular Genetics, Ghent University, 9052 Gent, Belgium
- Laboratoire Associé de l’INRA (France), Ghent University, 9052 Gent, Belgium
- Yves Van de Peer
- Department of Plant Systems Biology, Flanders Institute for Biotechnology (VIB), 9052 Gent, Belgium
- Department of Molecular Genetics, Ghent University, 9052 Gent, Belgium
|
27
|
DNA sequence and structural properties as predictors of human and mouse promoters. Gene 2007; 410:165-76. [PMID: 18234453 PMCID: PMC2672154 DOI: 10.1016/j.gene.2007.12.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 05/23/2007] [Revised: 11/30/2007] [Accepted: 12/05/2007] [Indexed: 11/21/2022]
Abstract
Promoters play a central role in gene regulation, yet our power to discriminate them from non-promoter sequences in higher eukaryotes is mainly restricted to those associated with CpG islands. Here, we examined in silico the promoters of 30,954 human and 18,083 mouse transcripts in the DBTSS database, to assess the impact of particular sequence and structural features (propeller twist, bendability and nucleosome positioning preference) on promoter classification and prediction. Our analysis showed that a stricter-than-traditional definition of CpG islands captures low and high CpG count promoter classes more accurately than the traditional one. We observed that both human and mouse promoter sequences are flexible with the exception of the TATA box and TSS, which are rigid regions irrespective of association with a CpG island. Therefore varying levels of structural flexibility in promoters may affect their accessibility to proteins, and hence their specificity. For all features investigated, averaged values across core promoters discriminated CpG island associated promoters from background, whereas the same did not hold for promoters without a CpG island. However, local changes around -34 to -23 (expected position of TATA box) and the TSS were informative in discriminating promoters (both classes) from non-promoter sequences. Additionally, we investigated ATG deserts and observed that they occur in all promoter sets except those with a TATA-box and without a CpG island in human. Interestingly, all mouse promoter sets showed ATG codon depletion irrespective of the presence of a TATA-box, possibly reflecting a weaker contribution to TSS specificity in mouse.
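The abstract does not state which CpG island thresholds were used, so the sketch below is only a generic illustration of how such a definition is usually evaluated: GC fraction plus the observed/expected CpG ratio, checked against length, GC, and ratio cut-offs. The function names and the default thresholds (200 bp, 0.50, 0.60, a commonly cited traditional choice) are assumptions, as is the stricter example set (500 bp, 0.55, 0.65); neither is taken from the paper.

```python
from typing import Tuple


def cpg_stats(seq: str) -> Tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def is_cpg_island(seq: str, min_len: int = 200,
                  min_gc: float = 0.50, min_obs_exp: float = 0.60) -> bool:
    """Apply length, GC-fraction and obs/exp thresholds (assumed defaults)."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


candidate = "CG" * 100  # 200 bp toy sequence
traditional = is_cpg_island(candidate)
stricter = is_cpg_island(candidate, min_len=500, min_gc=0.55, min_obs_exp=0.65)
```

Here the toy sequence passes the looser thresholds but fails the stricter ones purely on length, which shows how tightening the definition changes which promoters count as CpG-island associated.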
|
28
|
Zhao X, Xuan Z, Zhang MQ. Boosting with stumps for predicting transcription start sites. Genome Biol 2007; 8:R17. [PMID: 17274821 PMCID: PMC1852414 DOI: 10.1186/gb-2007-8-2-r17] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 10/03/2006] [Revised: 12/01/2006] [Accepted: 02/02/2007] [Indexed: 12/05/2022] Open
Abstract
CoreBoost applies a boosting technique to select important features for predicting core promoters with diverse patterns. Promoter prediction is a difficult but important problem in gene finding, and it is critical for elucidating the regulation of gene expression. We introduce a new promoter prediction program, CoreBoost, which applies a boosting technique with stumps to select important small-scale as well as large-scale features. CoreBoost improves greatly on locating transcription start sites. We also demonstrate that by further utilizing some tissue-specific information, better accuracy can be achieved.
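For readers unfamiliar with boosting over stumps, here is a minimal generic sketch using AdaBoost with one-feature threshold classifiers. It is not the CoreBoost implementation, and the abstract does not specify which boosting variant CoreBoost uses; AdaBoost is swapped in here only because it is the simplest to show. All function names, the toy feature matrix, and the labels are hypothetical.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """A stump votes `polarity` when the feature exceeds the threshold."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(w for row, label, w in zip(X, y, weights)
                          if stump_predict(row, feature, threshold, polarity) != label)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds=5):
    """Train a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = fit_stump(X, y, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, threshold, polarity))
        # Up-weight misclassified examples, down-weight correct ones.
        weights = [w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in model)
    return 1 if score >= 0 else -1


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.4, 1.0], [0.6, 7.0], [0.9, 2.0]]
y = [-1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
```

Because each stump uses a single feature, the features chosen across rounds double as a simple form of feature selection, which is the property the abstract highlights.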
Affiliation(s)
- Xiaoyue Zhao
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA
- Zhenyu Xuan
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA
- Michael Q Zhang
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA
|
29
|
Kundu P, Alioua A, Stefani E, Toro L. Regulation of mouse Slo gene expression: multiple promoters, transcription start sites, and genomic action of estrogen. J Biol Chem 2007; 282:27478-27492. [PMID: 17635926 DOI: 10.1074/jbc.m704777200] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Indexed: 01/01/2023] Open
Abstract
The large conductance, voltage- and Ca(2+)-activated K(+) channel plays key roles in diverse body functions influenced by estrogen, including smooth muscle and neural activities. In mouse (m), estrogen up-regulates the transcript levels of its pore-forming alpha-subunit (Slo, KCNMA1), yet the underlying genomic mechanism(s) is (are) unknown. We first mapped the promoters and regulatory motifs within the mSlo 5'-flanking sequence to subsequently identify genomic regions and mechanisms required for estrogen regulation. mSlo gene has at least two TATA-less promoters with distinct potencies that may direct mSlo transcription from multiple transcription start sites. These qualities mark mSlo as a prototype gene with promoter plasticity capable of generating multiple mRNAs and the potential to adapt to organismal needs. mSlo promoters contain multiple estrogen-responsive sequences, e.g. two quasi-perfect estrogen-responsive elements, ERE1 and ERE2, and Sp1 sites. Accordingly, mSlo promoter activity was highly enhanced by estrogen and blocked by estrogen antagonist ICI 182,780. When promoters are embedded in a 4.91-kb backbone, estrogen responsiveness involves a classical genomic mechanism, via ERE1 and ERE2, that may be complemented by Sp factors, particularly Sp1. Simultaneous but not individual ERE1 and ERE2 mutations caused significant loss of estrogen action. ERE2, which is closer to the proximal promoter, up-regulates this promoter via a classical genomic mechanism. ERE2 strategic position together with ERE1 and ERE2 independence and Sp contribution should ensure mSlo estrogen responsiveness. Thus, the mSlo gene seems to have uniquely evolved to warrant estrogen regulation. Estrogen-mediated mSlo genomic regulation has important implications on long term estrogenic effects affecting smooth muscle and neural functions.
Affiliation(s)
- Pallob Kundu
- Department of Anesthesiology, Division of Molecular Medicine, the.
- Enrico Stefani
- Department of Anesthesiology, Division of Molecular Medicine, the; Department of Physiology, UCLA, Los Angeles, California 90095; Cardiovascular Research Laboratories and Brain Research Institute, UCLA, Los Angeles, California 90095
- Ligia Toro
- Department of Anesthesiology, Division of Molecular Medicine, the; Cardiovascular Research Laboratories and Brain Research Institute, UCLA, Los Angeles, California 90095; Department of Molecular and Medical Pharmacology and UCLA, Los Angeles, California 90095
|
30
|
Kausel G, Salazar M, Castro L, Vera T, Romero A, Muller M, Figueroa J. Modular changes of cis-regulatory elements from two functional Pit1 genes in the duplicated genome of Cyprinus carpio. J Cell Biochem 2007; 99:905-21. [PMID: 16724305 DOI: 10.1002/jcb.20987] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
The pituitary-specific transcription factor Pit1 is involved in its own regulation and in a network of transcriptional regulation of hypothalamo-hypophyseal factors including prolactin (PRL) and growth hormone (GH). In the ectotherm teleost Cyprinus carpio, Pit1 plays an important role in regulation of the adaptive response to seasonal environmental changes. Two Pit1 genes exist in carp, a tetraploid vertebrate and transcripts of both genes were detected by RT-PCR analysis. Powerful comparative analyses of the 5'-flanking regions revealed copy specific changes comprising modular functional units in the naturally evolved promoters. These include the precise replacement of four nucleotides around the transcription start site embedded in completely conserved regions extending upstream of the TATA-box, an additional transcription factor binding site in the 5'-UTR of gene-I and, instead, duplication of a 9 bp element in gene-II. Binding of nuclear factors was assessed by electro mobility shift assays using extracts from rat pituitary cells and carp pituitary. Binding was confirmed at one conserved Pit1, one conserved CREB and one consensus MTF1. Interestingly, two functional Pit1 sites and one putative MTF1 binding site are unique to the Pit1 gene-I. In situ hybridization experiments revealed that the expression of gene-I in winter carp was significantly stronger than that of gene-II. Our data suggest that the specific control elements identified in the proximal regulatory region are physiologically relevant for the function of the duplicated Pit1 genes in carp and highlight modular changes in the architecture of two Pit1 genes that evolved for at least 12 MYA in the same organism.
Affiliation(s)
- G Kausel
- Instituto de Bioquímica, Universidad Austral de Chile, Casilla 567, Valdivia, Chile.
|
31
|
Milani P, Marilley M, Rocca-Serra J. TBP binding capacity of the TATA box is associated with specific structural properties: AFM study of the IL-2R alpha gene promoter. Biochimie 2006; 89:528-33. [PMID: 17336441 DOI: 10.1016/j.biochi.2006.12.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 08/02/2006] [Accepted: 12/12/2006] [Indexed: 11/27/2022]
Abstract
DNA is not only a nucleotide sequence which allows the binding of regulators but its intrinsic structural properties such as curvature and flexibility are also viewed as playing an active role in the regulation of transcription. Our combination of computer modelling and AFM imaging allows direct access to DNA curvature and flexibility. We have searched for these DNA structural features involved in transcription regulation within the IL-2Ralpha gene promoter. Investigation of these structural characteristics shows concordant results. First, in the core promoter, the region containing the functional TATA box shows intrinsic curvature associated with a peculiar distribution of flexibility. Both these inherent properties are characteristic of this region as compared with the other parts of the promoter. Second, the proximal promoter exhibits two important regions: a first one flexible and curved, followed by a segment of rigid linear DNA, each localised within one of the two Positive Regulatory Regions PRRI and PRRII respectively. Based on these observations, we propose different roles for DNA curvature and/or flexibility in promoter sequences.
Affiliation(s)
- Pascale Milani
- RGFCP EA 3290, Faculté de Médecine, Université de la Méditerranée, 27, Bvd Jean Moulin, 13385 Marseille cedex 5, France.
|