1
Costantini M, Musto H. The Isochores as a Fundamental Level of Genome Structure and Organization: A General Overview. J Mol Evol 2017; 84:93-103. [PMID: 28243687 DOI: 10.1007/s00239-017-9785-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/21/2016] [Accepted: 02/15/2017] [Indexed: 11/30/2022]
Abstract
The recent availability of a number of fully sequenced genomes (including those of marine organisms) has allowed isochores to be mapped very precisely from DNA sequences, confirming the results obtained before genome sequencing by ultracentrifugation in CsCl. In fact, the analytical profile of human DNA showed that the vertebrate genome is a mosaic of isochores, typically megabase-size DNA segments that belong to a small number of families characterized by different GC levels. In this review, we concentrate on some general features of the compositional organization of genomes from different organisms and their evolution, ranging from vertebrates and invertebrates to unicellular organisms. Since isochores are tightly linked to biological properties such as gene density, replication timing, and recombination, the new level of detail provided by the isochore map has aided the understanding of genome structure, function, and evolution. All the findings reported here confirm the idea that isochores can be considered a "fundamental level of genome structure and organization." We stress that this review does not discuss the origin of isochores, which is still a matter of controversy, but focuses on well-established structural and physiological aspects.
Affiliation(s)
- Maria Costantini
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Napoli, Italy.
- Héctor Musto
- Laboratorio de Organización y Evolución del Genoma, Unidad de Genómica Evolutiva, Facultad de Ciencias, 11400, Montevideo, Uruguay
2
Costantini M. An overview on genome organization of marine organisms. Mar Genomics 2015; 24 Pt 1:3-9. [PMID: 25899406 DOI: 10.1016/j.margen.2015.03.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2015] [Revised: 03/17/2015] [Accepted: 03/17/2015] [Indexed: 11/16/2022]
Abstract
In this review we concentrate on some general genome features of marine organisms and their evolution, ranging from vertebrates and invertebrates to unicellular organisms. Before genome sequencing, ultracentrifugation in CsCl led to high-resolution fractionation of mammalian DNA (without looking at the sequence). The analytical profile of human DNA showed that the vertebrate genome is a mosaic of isochores, typically megabase-size DNA segments that belong to a small number of families characterized by different GC levels. The recent availability of a number of fully sequenced genomes has allowed isochores to be mapped very precisely from DNA sequences. Since isochores are tightly linked to biological properties such as gene density, replication timing and recombination, the new level of detail provided by the isochore map has aided the understanding of genome structure, function and evolution. This led to the current level of knowledge and to further insights.
Affiliation(s)
- Maria Costantini
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy.
3
Costantini M, Alvarez-Valin F, Costantini S, Cammarano R, Bernardi G. Compositional patterns in the genomes of unicellular eukaryotes. BMC Genomics 2013; 14:755. [PMID: 24188247 PMCID: PMC4007698 DOI: 10.1186/1471-2164-14-755] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 11/01/2012] [Accepted: 10/31/2013] [Indexed: 11/29/2022]
Abstract
Background: The genomes of multicellular eukaryotes are compartmentalized in mosaics of isochores, large and fairly homogeneous stretches of DNA that belong to a small number of families characterized by different average GC levels, different gene concentrations (which increase with GC), different chromatin structures, different replication timing in the cell cycle, and other distinct properties. A question raised by these basic results concerns how far back in evolution the compartmentalized organization of eukaryotic genomes arose. Results: In the present work we approached this problem by studying the compositional organization of the genomes of the unicellular eukaryotes for which full sequences are available, the sample used being representative. The average GC levels of the genomes of unicellular eukaryotes cover an extremely wide range (19%-60% GC), and the compositional patterns of individual genomes are extremely different, but all genomes tested show a compositional compartmentalization. Conclusions: The average GC range of the genomes of unicellular eukaryotes is very broad (as broad as that of prokaryotes), and individual compositional patterns cover a very broad range, from very narrow to very complex. Neither feature is surprising for organisms that are very far from each other in terms of both phylogenetic distance and environmental conditions. Most importantly, all genomes tested, a representative sample of all supergroups of unicellular eukaryotes, are compositionally compartmentalized, a major difference from prokaryotes.
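As a rough illustration of what "compositional compartmentalization" refers to, the sketch below computes the GC level of consecutive fixed-size windows along a sequence. It is not the authors' method: the function name `gc_windows`, the window size, and the toy sequence are assumptions made for illustration, and real isochore analyses work on windows many orders of magnitude larger.

```python
def gc_windows(seq, window):
    """Return the GC fraction of each consecutive, non-overlapping window.

    Hypothetical helper for illustration; trailing bases that do not fill
    a whole window are ignored.
    """
    seq = seq.upper()
    levels = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        levels.append(gc / window)
    return levels


# Toy sequence: an AT-only stretch, a GC-only stretch, and a mixed stretch.
toy = "ATATATATAT" + "GCGCGCGCGC" + "ATGCATGCAT"
levels = gc_windows(toy, 10)
```

A genome whose window-level GC values fall into a few distinct, spatially contiguous classes would count as compartmentalized in the sense used above; a uniform profile would not.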
Affiliation(s)
- Maria Costantini
- Laboratory of Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, Naples 80121, Italy.
4
Segal MR, Xiao Y, Huffer FW. Clustering with exclusion zones: genomic applications. Biostatistics 2010; 12:234-46. [PMID: 21051753 DOI: 10.1093/biostatistics/kxq066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Methods for formally evaluating the clustering of events in space or time, notably the scan statistic, have been richly developed and widely applied. In order to utilize the scan statistic and related approaches, it is necessary to know the extent of the spatial or temporal domains wherein the events arise. Implicit in their usage is that these domains have no "holes" (hereafter "exclusion zones"), that is, regions in which events a priori cannot occur. However, in many contexts, this requirement is not met. When the exclusion zones are known, it is straightforward to correct the scan statistic for their occurrence by simply adjusting the extent of the domain. Here, we tackle the more ambitious objective of formally evaluating clustering in the presence of "unknown" exclusion zones. We develop an algorithm for estimating total exclusion zone extent, the quantity needed to correct scan statistic-based inference, using distributional properties of "spacings," and show how bias correction for this estimator can be effected. Performance of the algorithm is assessed via simulation study. We showcase applications to genomic settings for differing marker (event) types (binding sites, housekeeping genes, and microRNAs) wherein exclusion zones can arise through a variety of mechanisms. In several instances, dramatic changes to unadjusted inference that does not accommodate exclusions are evidenced.
Affiliation(s)
- Mark R Segal
- Division of Biostatistics, University of California, San Francisco, CA 94107, USA.
5
Provata A, Katsaloulis P. Hierarchical multifractal representation of symbolic sequences and application to human chromosomes. Phys Rev E Stat Nonlin Soft Matter Phys 2010; 81:026102. [PMID: 20365626 DOI: 10.1103/physreve.81.026102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/23/2009] [Indexed: 05/29/2023]
Abstract
The two-dimensional density correlation matrix is constructed for symbolic sequences using contiguous segments of arbitrary size. The multifractal spectrum obtained from this matrix motif is shown to characterize the correlations in the symbolic sequences. This method is applied to entire human chromosomes, shuffled human chromosomes, reconstructed human genomic sequences and to artificial random sequences. It is shown that all human chromosomes have common characteristics in their multifractal spectrum and deviate substantially from random and uncorrelated sequences of the same size. Small deviations are observed between the longer and the shorter chromosomes, especially for the higher (in absolute values) statistical moments. The correlations are crucial for the form of the multifractal spectrum; surrogate shuffled chromosomes present randomlike spectrum, distinctly different from the actual chromosomes. Analytical approaches based on hierarchical superposition of tensor products show that retaining pair correlations in the sequences leads to a closer representation of the genomic multifractal spectra, especially in the region of negative exponents, due to the underrepresentation of various functional units (such as the cytosine-guanine CG combination and its complementary GC complex). Retaining higher-order correlations in the construction of the tensor products is a way to approach closer the structure of the multifractal spectra of the actual genomic sequences. This hierarchical approach is generic and is applicable to other correlated symbolic sequences.
Affiliation(s)
- A Provata
- Institute of Physical Chemistry, National Center for Scientific Research Demokritos, 15310 Athens, Greece
6
Abstract
Short runs of adenines are a ubiquitous DNA element in regulatory regions of many organisms. When runs of 4–6 adenine base pairs (‘A-tracts’) are repeated with the helical periodicity, they give rise to global curvature of the DNA double helix, which can be macroscopically characterized by anomalously slow migration on polyacrylamide gels. The molecular structure of these DNA tracts is unusual and distinct from that of canonical B-DNA. We review here our current knowledge about the molecular details of A-tract structure and its interaction with the sequences flanking it on either side and with the environment. Various molecular models have been proposed to describe A-tract structure and how it causes global deflection of the DNA helical axis. We review old and recent findings that enable us to amalgamate the various findings into one model that conforms to the experimental data. Sequences containing phased repeats of A-tracts have from the very beginning been synonymous with global intrinsic DNA bending. In this review, we show that very often it is the unique structure of A-tracts that is at the basis of their widespread occurrence in regulatory regions of many organisms. Thus, the biological importance of A-tracts may often reside in their distinct structure rather than in the global curvature that they induce in sequences containing them.
7
Mrázek J, Guo X, Shah A. Simple sequence repeats in prokaryotic genomes. Proc Natl Acad Sci U S A 2007; 104:8472-7. [PMID: 17485665 PMCID: PMC1895974 DOI: 10.1073/pnas.0702412104] [Citation(s) in RCA: 108] [Impact Index Per Article: 6.0] [Received: 12/05/2006] [Indexed: 11/18/2022]
Abstract
Simple sequence repeats (SSRs) in DNA sequences are composed of tandem iterations of short oligonucleotides and may have functional and/or structural properties that distinguish them from general DNA sequences. They are variable in length because of slip-strand mutations and may also affect local structure of the DNA molecule or the encoded proteins. Long SSRs (LSSRs) are common in eukaryotes but rare in most prokaryotes. In pathogens, SSRs can enhance antigenic variance of the pathogen population in a strategy that counteracts the host immune response. We analyze representations of SSRs in >300 prokaryotic genomes and report significant differences among different prokaryotes as well as among different types of SSRs. LSSRs composed of short oligonucleotides (1-4 bp length, designated LSSR(1-4)) are often found in host-adapted pathogens with reduced genomes that are not known to readily survive in a natural environment outside the host. In contrast, LSSRs composed of longer oligonucleotides (5-11 bp length, designated LSSR(5-11)) are found mostly in nonpathogens and opportunistic pathogens with large genomes. Comparisons among SSRs of different lengths suggest that LSSR(1-4) are likely maintained by selection. This is consistent with the established role of some LSSR(1-4) in enhancing antigenic variance. By contrast, abundance of LSSR(5-11) in some genomes may reflect the SSRs' general tendency to expand rather than their specific role in the organisms' physiology. Differences among genomes in terms of SSR representations and their possible interpretations are discussed.
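To make the definition of an SSR concrete (tandem iterations of a short oligonucleotide), here is a minimal sketch that locates such runs with a regular expression. It is not the procedure used in the paper: the function name `find_ssrs`, the `max_unit` and `min_copies` parameters, and the example sequence are all assumptions, and the paper's criteria for calling an SSR "long" are not reproduced here.

```python
import re


def find_ssrs(seq, max_unit=4, min_copies=3):
    """Find tandem repeats of units 1..max_unit bp with >= min_copies copies.

    Returns (start, unit, copies) tuples. The lazy quantifier makes the
    regex prefer the shortest repeating unit, so a run of A's is reported
    with unit "A" rather than "AA". Illustrative only.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Hypothetical example: one dinucleotide run and one mononucleotide run.
example = "TTGACACACACGTAAAAAAG"
ssrs = find_ssrs(example)
```

Raising `max_unit` to 11 would also pick up repeats of the longer 5-11 bp units that the abstract treats as a separate class.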
Affiliation(s)
- Jan Mrázek
- Department of Microbiology, University of Georgia, Athens, GA 30602, USA.
8
Patel MM, Anchordoquy TJ. Ability of spermine to differentiate between DNA sequences--preferential stabilization of A-tracts. Biophys Chem 2006; 122:5-15. [PMID: 16504371 DOI: 10.1016/j.bpc.2006.02.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 01/17/2006] [Revised: 02/07/2006] [Accepted: 02/07/2006] [Indexed: 11/16/2022]
Abstract
The regulatory roles fulfilled by polyamines by governance of chromatin structure are made possible by their strong association with cellular DNA, and hence by their ability to modulate DNA structure and function. Towards this end, it is crucial to understand the manifestation of sequence-dependent polyamine binding at the secondary and tertiary structural levels of DNA. This study utilizes circular dichroism (CD) and isothermal titration calorimetry (ITC) to address this relationship by using 20 bp oligonucleotides with sequences that yield physiologically relevant structures (poly(dA):poly(dT), poly(dAdT):poly(dAdT), poly(dG):poly(dC), and poly(dGdC):poly(dGdC)), as well as poly(dIdC):poly(dIdC). CD studies show that at physiological ionic strength (150 mM NaCl), spermine preferentially stabilizes A-tracts, and increases flexibility of the G-tract oligomer; the latter is also suggested by the larger change in entropy (ΔS) of spermine binding to G-tracts. Given the chromatin destabilizing property of these sequences, these findings suggest a role for spermine in stabilization of non-nucleosomal A-tracts, and a compensating mechanism for incorporation of G-tracts in the chromatin structure. Other implications of these findings in sequence-dependent DNA packaging are discussed.
Affiliation(s)
- Mayank M Patel
- Department of Pharmaceutical Sciences, School of Pharmacy--C238, University of Colorado Health Sciences Center, 4200 E. Ninth Avenue, Denver, CO 80262, USA.
9
Yuan GC, Liu YJ, Dion MF, Slack MD, Wu LF, Altschuler SJ, Rando OJ. Genome-scale identification of nucleosome positions in S. cerevisiae. Science 2005; 309:626-30. [PMID: 15961632 DOI: 10.1126/science.1112178] [Citation(s) in RCA: 891] [Impact Index Per Article: 44.6] [Indexed: 11/02/2022]
Abstract
The positioning of nucleosomes along chromatin has been implicated in the regulation of gene expression in eukaryotic cells, because packaging DNA into nucleosomes affects sequence accessibility. We developed a tiled microarray approach to identify at high resolution the translational positions of 2278 nucleosomes over 482 kilobases of Saccharomyces cerevisiae DNA, including almost all of chromosome III and 223 additional regulatory regions. The majority of the nucleosomes identified were well-positioned. We found a stereotyped chromatin organization at Pol II promoters consisting of a nucleosome-free region approximately 200 base pairs upstream of the start codon flanked on both sides by positioned nucleosomes. The nucleosome-free sequences were evolutionarily conserved and were enriched in poly-deoxyadenosine or poly-deoxythymidine sequences. Most occupied transcription factor binding motifs were devoid of nucleosomes, strongly suggesting that nucleosome positioning is a global determinant of transcription factor access.
Affiliation(s)
- Guo-Cheng Yuan
- Bauer Center for Genomics Research, Harvard University, 7 Divinity Avenue, Cambridge, MA 02138, USA
10
Abstract
Palindromes are symmetrical words of DNA in the sense that they read exactly the same as their reverse complementary sequences. Representing the occurrences of palindromes in a DNA molecule as points on the unit interval, the scan statistics can be used to identify regions of unusually high concentration of palindromes. These regions have been associated with the replication origins on a few herpesviruses in previous studies. However, the use of scan statistics requires the assumption that the points representing the palindromes are independently and uniformly distributed on the unit interval. In this paper, we provide a mathematical basis for this assumption by showing that in randomly generated DNA sequences, the occurrences of palindromes can be approximated by a Poisson process. An easily computable upper bound on the Wasserstein distance between the palindrome process and the Poisson process is obtained. This bound is then used as a guide to choose an optimal palindrome length in the analysis of a collection of 16 herpesvirus genomes. Regions harboring significant palindrome clusters are identified and compared to known locations of replication origins. This analysis brings out a few interesting extensions of the scan statistics that can help formulate an algorithm for more accurate prediction of replication origins.
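The definition of a DNA palindrome given above (a word identical to its own reverse complement) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the paper's algorithm: `palindrome_starts` and `max_count_in_window` are hypothetical helper names, the example sequence is made up, and the window count is only a crude stand-in for the scan statistic, with no significance assessment or Poisson approximation.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def palindrome_starts(seq, length):
    """Start positions of words of the given length that equal their
    own reverse complement."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - length + 1)
        if seq[i:i + length] == reverse_complement(seq[i:i + length])
    ]


def max_count_in_window(positions, width):
    """Largest number of (sorted) positions in any interval [p, p + width)
    that starts at one of the positions."""
    best = 0
    for i, p in enumerate(positions):
        count = sum(1 for q in positions[i:] if q < p + width)
        best = max(best, count)
    return best


# Hypothetical example containing GAATTC and GGATCC, both palindromic.
example = "GAATTCAAAGGATCC"
starts = palindrome_starts(example, 6)
```

In the setting described above, the positions would be rescaled to the unit interval and an unusually large window count would flag a candidate palindrome cluster.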
Affiliation(s)
- Ming-Ying Leung
- Department of Mathematical Sciences, University of Texas at El Paso, El Paso, TX 79968-0514, USA.
11
Anderson JD, Widom J. Poly(dA-dT) promoter elements increase the equilibrium accessibility of nucleosomal DNA target sites. Mol Cell Biol 2001; 21:3830-9. [PMID: 11340174 PMCID: PMC87046 DOI: 10.1128/mcb.21.11.3830-3839.2001] [Citation(s) in RCA: 129] [Impact Index Per Article: 5.4] [Indexed: 11/20/2022]
Abstract
Polypurine tracts are important elements of eukaryotic promoters. They are believed to somehow destabilize chromatin, but the mechanism of their action is not known. We show that incorporating an A(16) element at an end of the nucleosomal DNA and further inward destabilizes histone-DNA interactions by 0.1 ± 0.03 and 0.35 ± 0.04 kcal mol⁻¹, respectively, and is accompanied by 1.5 ± 0.1-fold and 1.7 ± 0.1-fold increases in position-averaged equilibrium accessibility of nucleosomal DNA target sites. These effects are comparable in magnitude to effects of A(16) elements that correlate with transcription in vivo, suggesting that our system may capture most of their physiological role. These results point to two distinct but interrelated models for the mechanism of action of polypurine tract promoter elements in vivo. Given a nucleosome positioned over a promoter region, the presence of a polypurine tract in that nucleosome's DNA decreases the stability of the DNA wrapping, increasing the equilibrium accessibility of other DNA target sites buried inside that nucleosome. Alternatively (if nucleosomes are freely mobile), the presence of a polypurine tract provides a free energy bias for the nucleosome to move to alternative locations, thereby changing the equilibrium accessibilities of other nearby DNA target sites.
Affiliation(s)
- J D Anderson
- Department of Biochemistry, Molecular Biology, and Cell Biology, Illinois 60208, USA
12
Shimizu M, Mori T, Sakurai T, Shindo H. Destabilization of nucleosomes by an unusual DNA conformation adopted by poly(dA)·poly(dT) tracts in vivo. EMBO J 2000; 19:3358-65. [PMID: 10880448 PMCID: PMC313933 DOI: 10.1093/emboj/19.13.3358] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Abstract
Poly(dA)·poly(dT) tracts are common and often found upstream of genes in eukaryotes. It has been suggested that poly(dA)·poly(dT) promotes transcription in vivo by affecting nucleosome formation. On the other hand, in vitro studies show that poly(dA)·poly(dT) can be easily incorporated into nucleosomes. Therefore, the roles of these tracts in nucleosome organization in vivo remain to be established. We have developed an assay system that can evaluate nucleosome formation in yeast cells, and demonstrated that relatively longer tracts such as A(15)TATA(16) and A(34) disrupt an array of positioned nucleosomes, whereas a shorter A(5)TATA(4) tract is incorporated in positioned nucleosomes of yeast minichromosomes. Thus, nucleosomes are destabilized by poly(dA)·poly(dT) in vivo in a length-dependent manner. Furthermore, in vivo UV footprinting revealed that the longer tracts adopt an unusual DNA structure in yeast cells that corresponds to the B' conformation described in vitro. Our results support a mechanism in which a unique poly(dA)·poly(dT) conformation presets chromatin structure to which transcription factors are accessible.
Affiliation(s)
- M Shimizu
- Department of Chemistry, Meisei University, Hino, Tokyo 191-8506 and School of Pharmacy, Tokyo University of Pharmacy and Life Science, Hachioji, Tokyo 192-0392, Japan.
13
Hénaut A, Lisacek F, Nitschké P, Moszer I, Danchin A. Global analysis of genomic texts: the distribution of AGCT tetranucleotides in the Escherichia coli and Bacillus subtilis genomes predicts translational frameshifting and ribosomal hopping in several genes. Electrophoresis 1998; 19:515-27. [PMID: 9588797 DOI: 10.1002/elps.1150190411] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
The present availability of the genomic text of bacteria allows assignment of known biological functions to many genes (typically, half of the genome's gene content). It is now time to try to predict new, unexpected functions, using inductive procedures that allow correlating the content of the genomic text to possible biological functions. We show here that analysis of the genomes of Escherichia coli and Bacillus subtilis for the distribution of AGCT motifs predicts that genes exist for which the mRNA molecule can be translated as several different proteins synthesized after ribosomal frameshifting or hopping. Among these genes we found several that code for the same function in E. coli and B. subtilis. We analyzed in depth the situation of the infB gene (experimentally known to specify synthesis of several proteins differing in their translation starts), the aceF/pdhC gene, the eno gene, and the rplI gene. In addition, genes specific to E. coli were also studied: ompA, ompF, and tolA (predicting epigenetic variation that could help escape infection by phages or colicins).
Affiliation(s)
- A Hénaut
- Université de Versailles Saint Quentin, France
14
Field D, Wills C. Abundant microsatellite polymorphism in Saccharomyces cerevisiae, and the different distributions of microsatellites in eight prokaryotes and S. cerevisiae, result from strong mutation pressures and a variety of selective forces. Proc Natl Acad Sci U S A 1998; 95:1647-52. [PMID: 9465070 PMCID: PMC19132 DOI: 10.1073/pnas.95.4.1647] [Citation(s) in RCA: 142] [Impact Index Per Article: 5.3] [Received: 08/06/1997] [Indexed: 02/06/2023]
Abstract
We examined the distributions of short tandemly repeated DNAs (microsatellites) in nine complete microbial genomes (Saccharomyces cerevisiae, Archaeoglobus fulgidus, Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Methanococcus jannaschii, Mycoplasma pneumoniae, M. genitalium, and Synechocystis PCC6803). These repeats contribute differently to the global features of these genomes, and we explore the evolutionary implications of these differences by empirical examination of length polymorphisms at 20 long triplet repeats in S. cerevisiae, and by comparison of observed and expected repeat distributions. All of a sample of 20 microsatellites found in S. cerevisiae are highly polymorphic in length, suggesting that mutation pressure overcomes overall selection for small genome size that will tend to shorten or eliminate unnecessary DNA. By comparison, prokaryotes have fewer long repeats than expected, except for a few statistically improbable repeats that appear to function in gene regulation. Finally, we find that in all these genomes there is an excess of repeats shorter than those traditionally considered to be microsatellites. This finding suggests that even in prokaryotes these repeats are being generated by mutational pressures. These results have important potential implications for understanding genome stability and evolution in these microbial species.
Affiliation(s)
- D Field
- Department of Biology, University of California at San Diego, La Jolla, CA 92093-0116, USA
15
Rubbi L, Camilloni G, Caserta M, Di Mauro E, Venditti S. Chromatin structure of the Saccharomyces cerevisiae DNA topoisomerase I promoter in different growth phases. Biochem J 1997; 328 (Pt 2):401-7. [PMID: 9371694 PMCID: PMC1218934 DOI: 10.1042/bj3280401] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023]
Abstract
We have determined the chromatin organization of the Saccharomyces cerevisiae DNA topoisomerase I promoter. Three nucleosomal core particles have been mapped at nucleotide level over the promoter region, encompassing the presumptive TATA sequence and the two RNA initiation sites; the most upstream nucleosome particle forms on a 29 bp-long poly(dA-dT) element. This simple organization remains constant throughout both the logarithmic and the linear phase of growth, with the exception of an increased accessibility to micrococcal nuclease of the nucleosome covering the TATA box and the RNA initiation sites during the diauxic shift (the switch from fermentative to respiratory metabolism), in parallel with an increase in DNA topoisomerase I mRNA. In addition, a strong disorganization of the bulk chromatin structure in the late stationary phase is also reported.
Affiliation(s)
- L Rubbi
- Fondazione Istituto Pasteur-Fondazione Cenci-Bolognetti, c/o Dipartimento di Genetica e Biologia Molecolare, Università 'La Sapienza', P.le A. Moro 5, 00185 Rome, Italy
16
Abstract
The yeast genome exhibits a variety of trinucleotide repeat arrays within protein-coding genes and intergenic regions. In the first situation, repeats are often not random relative to the translational frame, resulting preferably in long stretches of the two acidic amino acids or of their corresponding amine forms. Interestingly, the longest trinucleotide repeats are often found in genes encoding nuclearly located proteins. Repeats tend to be more frequent in long genes, but less frequent among members of gene families compared to unique genes. In the latter case, repeat arrays often differ in length or composition between the gene homologs, indicating their instability.
Affiliation(s)
- G F Richard
- Unité de Génétique moléculaire des Levures (UMR1300 CNRS and UFR927 Univ. P. M. Curie, Paris), Institut Pasteur
17
Hill KA, Singh SM. The evolution of species-type specificity in the global DNA sequence organization of mitochondrial genomes. Genome 1997; 40:342-56. [PMID: 9202414 DOI: 10.1139/g97-047] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023]
Abstract
Prokaryote genomes and nuclear genomes of eukaryotes have a global DNA sequence organization that is species-type specific, determined primarily by nearest-neighbor nucleotide associations, and independent of gene function and sequence length. The determinants of such a global structure have remained largely uncharacterized. The monophyletic and endosymbiotic origin of mitochondria permits examination of the influence of evolutionary time and host species type. Different global structures were seen among (i) protozoan and plant, (ii) fungal, (iii) algal, (iv) nematode, (v) echinoderm, (vi) insect, and (vii) vertebrate species following examination of 28 complete mitochondrial genomes using chaos representation and measures of short-sequence representation. The mitochondrial genomes have biases in single-nucleotide and dinucleotide representation, specifically, an overrepresentation of A and T nucleotides and CC/GG and AG/CT dinucleotides and a deficiency of CG dinucleotides, in all but one genome. Dinucleotide representation is similar among (i) mitochondrial genomes of more closely related species; (ii) mitochondrial genomes and the Mycoplasma capricolum genome, a proposed progenitor of mitochondrial genomes; and (iii) mitochondrial genomes of diverse species, more so than between the mitochondrial and the nuclear genome of the same or a closely related species. It is hypothesized that sufficient evolutionary time has permitted host-specific constraints to affect nuclear and mitochondrial genomes and that different species-type-specific constraints influence nuclear and mitochondrial genome global structure.
Affiliation(s)
- K A Hill
- Molecular Genetics Laboratory, Department of Zoology, London, ON, Canada
18
Viswanathan GM, Buldyrev SV, Havlin S, Stanley HE. Quantification of DNA patchiness using long-range correlation measures. Biophys J 1997; 72:866-75. [PMID: 9017212 PMCID: PMC1185610 DOI: 10.1016/s0006-3495(97)78721-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.4] [Indexed: 02/03/2023]
Abstract
We introduce and develop new techniques to quantify DNA patchiness, and to quantify characteristics of its mosaic structure. These techniques, which involve calculating two functions, alpha(l) and beta(l), measure correlations at length scale l and detect distinct characteristic patch sizes embedded in scale-invariant patch size distributions. Using these new methods, we address a number of issues relating to the mosaic structure of genomic DNA. We find several distinct characteristic patch sizes in certain genomic sequences, and compare, contrast, and quantify the correlation properties of different sequences, including a number of yeast, human, and prokaryotic sequences. We exclude the possibility that the correlation properties and the known mosaic structure of DNA can be explained either by simple Markov processes or by tandem repeats of dinucleotides. We find that the distinct patch sizes in all 16 yeast chromosomes are similar. Furthermore, we test the hypothesis that, for yeast, patchiness is caused by the alternation of coding and noncoding regions, and the hypothesis that in human sequences patchiness is related to repetitive sequences. We find that, by themselves, neither the alternation of coding and noncoding regions, nor repetitive sequences, can fully explain the long-range correlation properties of DNA.
Affiliation(s)
- G M Viswanathan
- Center for Polymer Studies, Boston University, Massachusetts 02215, USA.
|
19
|
Richard GF, Dujon B. Distribution and variability of trinucleotide repeats in the genome of the yeast Saccharomyces cerevisiae. Gene 1996; 174:165-74. [PMID: 8863744 DOI: 10.1016/0378-1119(96)00514-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023] Open
Abstract
We have examined the distribution of trinucleotide repeats in the yeast genome. Perfect and imperfect repeats, ranging from four to 130 triplets, were recognized and the repartition of different triplet combinations was found to differ between Open Reading Frames and Intergenic Regions. Examination of different laboratory strains revealed polymorphic size variations for all perfect repeats studied, compared to an absence of variation for the imperfect ones. Size variations were found to be discrete in the range of 6-18 triplets, each strain showing one allelic form for a given repeat array. The distribution and stability of trinucleotide repeats in the yeast genome resemble those of humans and may provide an experimental approach to study the mechanisms of their expansion.
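A minimal sketch of how perfect trinucleotide repeats of at least four triplets could be located in a sequence. This is an illustration using a regular expression with a back-reference, not the authors' procedure; the function name and the choice to skip homopolymer runs are assumptions.

```python
import re


def perfect_triplet_repeats(seq, min_units=4):
    """Return (start, unit, n_units) for each perfect tandem triplet repeat.

    Assumption: a "perfect" repeat is an uninterrupted tandem array of one
    triplet; homopolymer runs such as AAAAAAAAAAAA are skipped.
    """
    # ([ACGT]{3}) captures the unit; \1{n,} requires n further exact copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_units - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # homopolymer run, not a true trinucleotide repeat
        hits.append((m.start(), unit, len(m.group(0)) // 3))
    return hits
```

Imperfect repeats, which the abstract also covers, would need a tolerant matcher and are not handled here.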
Affiliation(s)
- G F Richard
- Unité de Génétique Moléculaire des Levures (URA1149 du CNRS and UFR927, Univ. P. & M. Curie), Institut Pasteur, Paris, France
|
20
|
Abstract
The bakers' yeast, Saccharomyces cerevisiae, a microorganism of major importance for bioindustries, and one of the favored model organisms for basic biological research, is the first eukaryote whose genome is entirely sequenced. Beyond the wealth of novel biological information, it is the extent of what remains to be understood in the genome of a simple unicellular organism that is the most striking result: a significant proportion of yeast genes are orphans of unpredictable function. Offering the possibility of large-scale reverse genetics, yeast will be a powerful model for post-sequencing studies. But geneticists are now faced with the difficulty of asking novel questions.
Affiliation(s)
- B Dujon
- Unité de Génétique Moléculaire des Levures (URA 1149 CNRS), Paris, France.
|
21
|
Reardon BJ, Gordon D, Ballard MJ, Winter E. DNA binding properties of the Saccharomyces cerevisiae DAT1 gene product. Nucleic Acids Res 1995; 23:4900-6. [PMID: 8532535 PMCID: PMC307481 DOI: 10.1093/nar/23.23.4900] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023] Open
Abstract
The DAT1 gene of Saccharomyces cerevisiae encodes a DNA binding protein (Dat1p) that specifically recognizes the minor groove of non-alternating oligo(A).oligo(T) tracts. Sequence-specific recognition requires arginine residues found within three perfectly repeated pentads (G-R-K-P-G) of the Dat1p DNA binding domain [Reardon, B. J., Winters, R. S., Gordon, D., and Winter, E. (1993) Proc. Natl. Acad. Sci. USA 90, 11327-11331]. This report describes a rapid and simple method for purifying the Dat1p DNA binding domain and the biochemical characterization of its interaction with oligo(A).oligo(T) tracts. Oligonucleotide binding experiments and the characterization of yeast genomic Dat1p binding sites show that Dat1p specifically binds to any 11 base sequence in which 10 bases conform to an oligo(A).oligo(T) tract. Binding studies of different sized Dat1p derivatives show that the Dat1p DNA binding domain can function as a monomer. Competition DNA binding assays using poly(I).poly(C) demonstrate that the minor groove oligo(A).oligo(T) constituents are not sufficient for high specificity DNA binding. These data constrain the possible models for Dat1p/oligo(A).oligo(T) complexes, suggest that the DNA binding domain is in an extended structure when complexed to its cognate DNA, and show that Dat1p binding sites are more prevalent than previously thought.
Affiliation(s)
- B J Reardon
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA 19107, USA
|
22
|
Duret L, Mouchiroud D, Gautier C. Statistical analysis of vertebrate sequences reveals that long genes are scarce in GC-rich isochores. J Mol Evol 1995; 40:308-17. [PMID: 7723057 DOI: 10.1007/bf00163235] [Citation(s) in RCA: 186] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
We compared the exon/intron organization of vertebrate genes belonging to different isochore classes, as predicted by their GC content at third codon position. Two main features have emerged from the analysis of sequences published in GenBank: (1) genes coding for long proteins (i.e., ≥ 500 aa) are almost two times more frequent in GC-poor than in GC-rich isochores; (2) intervening sequences (= sum of introns) are on average three times longer in GC-poor than in GC-rich isochores. These patterns are observed among human, mouse, rat, cow, and even chicken genes and are therefore likely to be common to all warm-blooded vertebrates. Analysis of Xenopus sequences suggests that the same patterns exist in cold-blooded vertebrates. It could be argued that such results do not reflect the reality because sequence databases are not representative of entire genomes. However, analysis of biases in GenBank revealed that the observed discrepancies between GC-rich and GC-poor isochores are not artifactual, and are probably largely underestimated. We investigated the distribution of microsatellites and interspersed repeats in introns of human and mouse genes from different isochores. This analysis confirmed previous studies showing that L1 repeats are almost absent from GC-rich isochores. Microsatellites and SINEs (Alu, B1, B2) are found at roughly equal frequencies in introns from all isochore classes. Globally, the presence of repeated sequences does not account for the increased intron length in GC-poor isochores. The relationships between gene structure and global genome organization and evolution are discussed.
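The abstract assigns genes to isochore classes from their GC content at third codon positions (GC3). A minimal sketch of that quantity follows; the function name `gc3` is hypothetical, and the sequence is assumed to be an in-frame coding sequence starting at the first base of a codon.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS.

    Assumption: `cds` starts at codon position 1; a trailing partial
    codon, if any, is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop an incomplete last codon
    thirds = cds[2:usable:3]              # bases at positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)
```

Any GC3 thresholds used to separate GC-poor from GC-rich classes are not given in the abstract and would have to be taken from the paper itself.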
Affiliation(s)
- L Duret
- Laboratoire de Biométrie, Génétique et Biologie des Populations, Université Claude Bernard, Lyon I, URA-CNRS 243, Villeurbanne, France
|
23
|
Musto H, Rodríguez-Maseda H, Alvarez F. Compositional correlations in the nuclear genes of the flatworm Schistosoma mansoni. J Mol Evol 1995; 40:343-6. [PMID: 7723062 DOI: 10.1007/bf00163240] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
We have investigated the genome organization in the flatworm Schistosoma mansoni. First, we analyzed the compositional distributions of the three codon positions. Second, we investigated the correlations that exist between (1) the GC levels of exons against flanking regions, (2) the GC levels of third codon positions against flanking regions, (3) the dinucleotide frequencies of exons against flanking regions, and (4) the GC levels of 5' against 3' regions. The modality of the distribution of third codon positions, together with the significant correlations found, leads us to propose that the nuclear genome of this species is compositionally compartmentalized.
Affiliation(s)
- H Musto
- Sección Bioquímica, Instituto de Biología, Facultad de Ciencias, Montevideo, Uruguay
|
24
|
Abstract
The length of an open reading frame (ORF) is one important piece of evidence often used in locating new genes, particularly in organisms where splicing is rare. However, there have been no systematic studies quantifying the degree of correlation between length of ORF, on the one hand, and likelihood of gene function, on the other. In this paper, techniques are derived to estimate the conditional probability of gene function, given ORF length, based on evidence both from the databases and from simulation. Several complete chromosomes of Saccharomyces cerevisiae have now been sequenced, and considerable effort is being expended on locating and characterizing the genes in these sequences. Thus, we illustrate the techniques for this organism.
Affiliation(s)
- J W Fickett
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, NM 87545, USA
|
25
|
Fairhead C, Dujon B. Transcript map of two regions from chromosome XI of Saccharomyces cerevisiae for interpretation of systematic sequencing results. Yeast 1994; 10:1403-13. [PMID: 7871880 DOI: 10.1002/yea.320101103] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2023] Open
Abstract
A detailed and systematic transcript map is a first and necessary step to characterize new genes revealed by systematic sequencing. Chromosome XI of Saccharomyces cerevisiae contains 331 open reading frames (ORFs) of which 44% are of unknown function (Dujon et al., 1994). As a first study towards complete transcript analysis of chromosome XI, we have extracted RNA from three isogenic strains (a, alpha and 2n) grown in three standard laboratory media, and have analysed them using contiguous probes covering two regions of 17 and 19 kilobases, respectively. All 20 predicted ORFs in the sequences correspond to expressed genes, six of which have no predicted function. Four short ORFs which were suspected as not being real genes on the basis of their sequence are not expressed in our growth conditions. An additional transcript which does not correspond to a large ORF was found. Steady-state RNA level of most ORFs is 10 to 100 times lower than that of the actin gene; only three are transcribed in comparable amounts. Three ORFs show variable levels of transcripts in the different growth conditions, all patterns being different from one another. Extrapolation of these results to systematic transcript analysis of chromosome XI and other yeast chromosomes is presented.
Affiliation(s)
- C Fairhead
- Unité de Génétique Moléculaire des Levures (URA 1149 du CNRS), Institut Pasteur, Paris, France
|
26
|
Abstract
One expects that in DNA without protein coding function, stop codons (which constitute three of the 64 possible codons) should occur frequently in all reading frames, and that a long open reading frame (ORF) can be interpreted as a sign for the existence of a gene. We make a beginning on introducing quantitative measures of confidence into this inference, taking Saccharomyces cerevisiae as a sample case, and show that some common assumptions can reasonably be questioned. In particular we show that statistical support for the biological function of shorter ORFs listed as putative genes in recent papers is in fact very weak. This is an issue of practical as well as theoretical interest, since researching the function of a putative gene is difficult and expensive.
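The intuition in this abstract, that stop codons make up three of 64 codons and so long ORFs are unlikely by chance, can be made concrete with a simple null model. The sketch below assumes independent bases with a given GC fraction and A = T, G = C; this is an illustrative model only, not the statistical method developed in the paper, and both function names are hypothetical.

```python
def stop_codon_probability(gc=0.5):
    """Chance that a random codon is TAA, TAG or TGA.

    Assumption: bases are independent, with f(G) = f(C) = gc / 2 and
    f(A) = f(T) = (1 - gc) / 2.
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must lie between 0 and 1")
    a = t = (1.0 - gc) / 2.0
    g = gc / 2.0
    return t * a * a + t * a * g + t * g * a   # TAA + TAG + TGA


def prob_open_by_chance(n_codons, gc=0.5):
    """Probability that n_codons consecutive random codons contain no stop."""
    return (1.0 - stop_codon_probability(gc)) ** n_codons
```

With equal base frequencies the per-codon stop probability is 3/64, so the chance of an open stretch decays as (61/64) raised to the number of codons. AT-rich sequence raises the stop probability and shortens chance ORFs further.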
Affiliation(s)
- J W Fickett
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, NM 87545
|
27
|
Dujon B, Alexandraki D, André B, Ansorge W, Baladron V, Ballesta JP, Banrevi A, Bolle PA, Bolotin-Fukuhara M, Bossier P, Bou G, Boyer J, Bultrago MJ, Cheret G, Colleaux L, Dalgnan-Fornler B, del Rey F, Dlon C, Domdey H, Düsterhoft A, Düsterhus S, Entlan KD, Erfle H, Esteban PF, Feldmann H, Fernandes L, Robo GM, Fritz C, Fukuhara H, Gabel C, Gaillon L, Carcia-Cantalejo JM, Garcia-Ramirez JJ, Gent NE, Ghazvini M, Goffeau A, Gonzaléz A, Grothues D, Guerreiro P, Hegemann J, Hewitt N, Hilger F, Hollenberg CP, Horaitis O, Indge KJ, Jacquier A, James CM, Jauniaux C, Jimenez A, Keuchel H, Kirchrath L, Kleine K, Kötter P, Legrain P, Liebl S, Louis EJ, Maia e Silva A, Marck C, Monnier AL, Möstl D, Müller S, Obermaier B, Oliver SG, Pallier C, Pascolo S, Pfeiffer F, Philippsen P, Planta RJ, Pohl FM, Pohl TM, Pöhlmann R, Portetelle D, Purnelle B, Puzos V, Ramezani Rad M, Rasmussen SW, Remacha M, Revuelta JL, Richard GF, Rieger M, Rodrigues-Pousada C, Rose M, Rupp T, Santos MA, Schwager C, Sensen C, Skala J, Soares H, Sor F, Stegemann J, Tettelin H, Thierry A, Tzermia M, Urrestarazu LA, van Dyck L, Van Vliet-Reedijk JC, Valens M, Vandenbo M, Vilela C, Vissers S, von Wettstein D, Voss H, Wiemann S, Xu G, Zimmermann J, Haasemann M, Becker I, Mewes HW. Complete DNA sequence of yeast chromosome XI. Nature 1994; 369:371-8. [PMID: 8196765 DOI: 10.1038/369371a0] [Citation(s) in RCA: 308] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023]
Abstract
The complete DNA sequence of the yeast Saccharomyces cerevisiae chromosome XI has been determined. In addition to a compact arrangement of potential protein coding sequences, the 666,448-base-pair sequence has revealed general chromosome patterns; in particular, alternating regional variations in average base composition correlate with variations in local gene density along the chromosome. Significant discrepancies with the previously published genetic map demonstrate the need for using independent physical mapping criteria.
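Regional variation in average base composition along a chromosome, as described here, is usually inspected with a sliding-window GC profile. The sketch below is a generic illustration; the window and step sizes are arbitrary assumptions, not values from the paper, and the function name is hypothetical.

```python
def windowed_gc(seq, window=1000, step=1000):
    """Return [(start, gc_fraction), ...] for successive windows of `seq`.

    Assumption: only full-length windows are reported, so a short tail
    at the end of the sequence is dropped.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(base in "GC" for base in chunk) / window
        profile.append((start, gc))
    return profile
```

Plotting the second element against the first, next to gene density computed over the same windows, would be one way to look for the correlation the abstract reports.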
Affiliation(s)
- B Dujon
- Unité de Génétique Moléculaire des Levures (URA 1149 du CNRS and UFR927 University P.M. Curie), Département de Biologie Moléculaire, Institut Pasteur, Paris, France
|
28
|
Yagil G. The frequency of oligopurine.oligopyrimidine and other two-base tracts in yeast chromosome III. Yeast 1994; 10:603-11. [PMID: 7941745 DOI: 10.1002/yea.320100505] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023] Open
Abstract
The TRACTS program was employed to map the occurrence of base tracts composed of only two bases in Saccharomyces cerevisiae chromosome III. The observed frequencies were compared with those expected in random DNA. A vast excess of long base tracts of the three possible two-base combinations, namely, purine.pyrimidine (R.Y), keto.imino (K.M) and weak.strong (W.S, mainly A,T rich), was documented. The observed excess places yeast in the same category as other eukaryote and organelle genomes analysed. The excess of the two-base tracts was considerably larger in the 1/3 of the chromosome not coding for a protein, in particular proximal to coding initiation and termination sites, but was observed for coding regions as well. A functional role for the excessive tracts, possibly as unwinding centers of particular genes, is proposed. Multiple occurrence of long two-base tracts is offered as another diagnostic to determine whether an open reading frame (ORF), or an ORF subregion, is an actually translated gene region.
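A minimal sketch of what mapping two-base tracts could look like. This is not the TRACTS program; the function name, the dictionary of base classes and the minimum tract length are assumptions made for illustration, using the standard IUPAC groupings R = A/G, Y = C/T, K = G/T, M = A/C, W = A/T, S = C/G.

```python
import re

# Each class pairs two complementary two-base alphabets (IUPAC groupings).
TWO_BASE_CLASSES = {
    "R.Y": ("AG", "CT"),
    "K.M": ("GT", "AC"),
    "W.S": ("AT", "CG"),
}


def two_base_tracts(seq, min_len=10):
    """Return {class: [(start, tract), ...]} for runs built from two bases.

    Assumption: a tract is an uninterrupted run of at least `min_len`
    bases drawn from one of the two alphabets of a class.
    """
    seq = seq.upper()
    found = {}
    for name, (first, second) in TWO_BASE_CLASSES.items():
        pattern = re.compile(
            "[%s]{%d,}|[%s]{%d,}" % (first, min_len, second, min_len)
        )
        found[name] = [(m.start(), m.group(0)) for m in pattern.finditer(seq)]
    return found
```

Comparing the observed tract counts with those from shuffled sequences of the same composition would be one way to approximate the "expected in random DNA" baseline mentioned in the abstract.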
Affiliation(s)
- G Yagil
- Department of Cell Biology, Weizmann Institute of Science, Rehovot, Israel
|
29
|
Lefèvre C, Ikeda JE. A fast word search algorithm for the representation of sequence similarity in genomic DNA. Nucleic Acids Res 1994; 22:404-11. [PMID: 8127677 PMCID: PMC523596 DOI: 10.1093/nar/22.3.404] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023] Open
Abstract
Representation of sequence similarity by dot matrix plots is a method widely used for comparing biological sequences. The user is presented with an overall view of similarity between two sequences. Computation of this plot has been reconsidered here. An improvement is proposed through the preprocessing of the data into an automaton recognizing the word structure of a sequence. The main advantage of this approach is to systematically eliminate the repetitions during word comparison. Simple heuristics are also considered to greatly speed up pattern matching. As a result, large sequences are handled very efficiently. This is illustrated by a comparison of large genomic DNA. The algorithm has been implemented in an interactive application on a microcomputer.
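The idea of preprocessing one sequence so that word comparisons are not repeated can be illustrated with a plain hash index of k-letter words. This swaps in a dictionary for the word-recognizing structure the paper describes and is only a sketch; the function name and the word length `k` are assumptions.

```python
from collections import defaultdict


def word_matches(seq_a, seq_b, k=4):
    """Return dot-plot coordinates (i, j) where seq_a[i:i+k] == seq_b[j:j+k].

    Assumption: seq_b is indexed once by its k-letter words, so each word
    of seq_a is looked up instead of being compared against every position.
    """
    index = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        index[seq_b[j:j + k]].append(j)
    dots = []
    for i in range(len(seq_a) - k + 1):
        # .get avoids inserting empty entries for words absent from seq_b
        for j in index.get(seq_a[i:i + k], ()):
            dots.append((i, j))
    return dots
```

Each returned pair is one dot of the matrix; runs of dots along a diagonal correspond to longer shared segments.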
Affiliation(s)
- C Lefèvre
- Genosphere Project, ERATO, JRDC, Tokai University School of Medicine, Kanagawa, Japan
|
30
|
Abstract
This review will first present some properties (including compositional pattern, correlations between isochores and chromosomal bands, and gene distribution) of the human genome, the most extensively studied among vertebrate genomes. It will then explain how these properties came about during the evolution of the vertebrates.
Affiliation(s)
- G Bernardi
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
|
31
|
Isacchi A, Bernardi G, Bernardi G. Compositional compartmentalization of the nuclear genomes of Trypanosoma brucei and Trypanosoma equiperdum. FEBS Lett 1993; 335:181-3. [PMID: 8253192 DOI: 10.1016/0014-5793(93)80725-a] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023]
Abstract
High molecular weight DNA preparations from Trypanosoma brucei and Trypanosoma equiperdum were fractionated by preparative centrifugation in a Cs2SO4 density gradient in the presence of BAMD, bis(acetatomercurimethyl)dioxane, a sequence-specific DNA ligand. Analytical centrifugation in CsCl of the DNA fractions so obtained showed that both DNAs had a bimodal distribution with two major peaks banding at 1.702-1.703 and 1.708 g/cm³ and representing 1/3 and 2/3 of total DNA, respectively. Several minor components were also detected. These results indicate that a compositional compartmentalization is not only found in the genome of vertebrates and plants, as already described, but also in those of protozoa such as trypanosomes.
Affiliation(s)
- A Isacchi
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
|
32
|
Sapolsky RJ, Brendel V, Karlin S. A comparative analysis of distinctive features of yeast protein sequences. Yeast 1993; 9:1287-98. [PMID: 8154180 DOI: 10.1002/yea.320091202] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open
Abstract
The recently published sequence of yeast chromosome III (YCIII) provides the longest continuous stretch of a eukaryotic DNA molecule sequenced to date (315 kb). The sequence contains 116 distinct AUG-initiated open reading frames of at least 200 codons in length, more than 50 of which had not been described previously nor bear significant similarity to known proteins. We have analysed the YCIII known and putative protein sequences with respect to significant statistical features which might reflect on structural and functional characteristics. The YCIII proteins have striking similarities and differences in their sequence attribute distributions compared to the corresponding distributions for all available yeast sequences and other protein collections. Nine examples of YCIII proteins with distinctive sequence features are discussed in detail.
Affiliation(s)
- R J Sapolsky
- Department of Mathematics, Stanford University, CA 94305-2125
|
33
|
Reardon BJ, Winters RS, Gordon D, Winter E. A peptide motif that recognizes A.T tracts in DNA. Proc Natl Acad Sci U S A 1993; 90:11327-31. [PMID: 8248247 PMCID: PMC47975 DOI: 10.1073/pnas.90.23.11327] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open
Abstract
The DAT1 gene of Saccharomyces cerevisiae encodes a DNA binding protein that specifically interacts with nonalternating oligo(A).oligo(T) tracts (A.T tracts). Deletion analysis of DAT1 coding information showed that the amino-terminal 36 residues are sufficient for specific DNA binding activity. Furthermore, a 35-residue synthetic peptide corresponding to amino acids 2-36 bound to A.T tracts with an equilibrium dissociation constant of 4 × 10⁻¹⁰ M. Within this region the pentad Gly-Arg-Lys-Pro-Gly is repeated three times. Mutational analysis revealed that the Arg side chains are required for high-affinity binding, whereas the other pentad side chains are dispensable. Chemical interference experiments showed that the DAT1 protein interacts with the minor groove of the double helix. The data suggest that the pentad arginines interact in a cooperative manner with a repeated minor groove feature of A.T tract DNA to achieve high-affinity recognition. Amino acid similarities with other DNA binding proteins suggest that the DAT1 protein pentad represents a specialized example of a widespread motif used by proteins to recognize A.T base pairs.
Affiliation(s)
- B J Reardon
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA 19107
|
34
|
Abstract
The complete sequence of yeast chromosome III provides a model for studies relating DNA sequence and structure at different levels of organisation in eukaryotic chromosomes. DNA helical stability, intrinsic curvature and sequence complexity have been calculated for the complete chromosome. These features are compartmentalised at different levels of organisation. Compartmentalisation of thermal stability is observed from the level delineating coding/non-coding sequences, to higher levels of organisation which correspond to regions varying in G + C content. The three-dimensional path reveals a symmetrical structure for the chromosome, with a densely packed central region and more diffuse and linear subtelomeric regions. This interspersion of regions of high and low curvature is reflected at lower levels of organisation. Complexity of n-tuplets (n = 1 to 6) also reveals compartmentalisation of the chromosome at different levels of organisation, in many cases corresponding to the structural features. DNA stability, conformation and complexity delineate telomeres, centromere, autonomous replication sequences (ARS), transposition hotspots, recombination hotspots and the mating-type loci.
Affiliation(s)
- G J King
- Breeding and Genetics Department, Horticulture Research International, Wellesbourne, Warwick, UK
|
35
|
Cardon LR, Burge C, Schachtel GA, Blaisdell BE, Karlin S. Comparative DNA sequence features in two long Escherichia coli contigs. Nucleic Acids Res 1993; 21:3875-84. [PMID: 8367304 PMCID: PMC309913 DOI: 10.1093/nar/21.16.3875] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023] Open
Abstract
The recent sequencing of two relatively long (approximately 100 kb) contigs of E. coli presents unique opportunities for investigating heterogeneity and genomic organization of the E. coli chromosome. We have evaluated a number of common and contrasting sequence features in the two new contigs with comparisons to all available E. coli sequences (> 1.6 Mb). Our analyses include assessments of: (i) counts and distributions of restriction sites, special oligonucleotides (e.g., Chi sites, Dam and Dcm methylase targets), and other marker arrays; (ii) significant distant and close direct and inverted repeat sequences; (iii) sequence similarities between the long contigs and other E. coli sequences; (iv) characterization and identification of rare and frequent oligonucleotides; (v) compositional biases in short oligonucleotides; and (vi) position-dependent fluctuations in sequence composition. The two contigs reveal a number of distinctive features, including: a cluster of five repeat/dyad elements with very regular spacings resembling a transcription attenuator in one of the contigs; REP elements, ERICs, and other long repeats; distinction of the Chi sequence as the most frequent oligonucleotide; regions of clustering, overdispersion, and regularity of certain restriction sites and short palindromes; and comparative domains of inhomogeneities in the two long contigs. These and other features are discussed in relation to the organization of the E. coli chromosome.
Affiliation(s)
- L R Cardon
- Department of Mathematics, Stanford University, CA 94305
|