201
Impaired uptake and/or utilization of leucine by Saccharomyces cerevisiae is suppressed by the SPT15-300 allele of the TATA-binding protein gene. Appl Environ Microbiol 2009; 75:6055-61. [PMID: 19666729 DOI: 10.1128/aem.00989-09] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 02/03/2023] Open
Abstract
Successful fermentations to produce ethanol require microbial strains that have a high tolerance to glucose and ethanol. Enhanced glucose/ethanol tolerance of the laboratory yeast Saccharomyces cerevisiae strain BY4741 under certain growth conditions as a consequence of the expression of a dominant mutant allele of the SPT15 gene (SPT15-300) corresponding to the three amino acid changes F177S, Y195H, and K218R has been reported (H. Alper, J. Moxley, E. Nevoigt, G. R. Fink, and G. Stephanopoulos, Science 314:1565-1568, 2006). The SPT15 gene codes for the TATA-binding protein. This finding prompted us to examine the effect of expression of the SPT15-300 allele in various yeast species of industrial importance. Expression of SPT15-300 in leucine-prototrophic strains of S. cerevisiae, Saccharomyces bayanus, or Saccharomyces pastorianus (lager brewing yeast), however, did not improve tolerance to ethanol on complex rich medium (yeast extract-peptone-dextrose). The enhanced growth of the laboratory yeast strain BY4741 expressing the SPT15-300 mutant allele was seen only on defined media with low concentrations of leucine, indicating that the apparent improved growth in the presence of ethanol was indeed associated with enhanced uptake and/or utilization of leucine. Reexamination of the microarray data published by Alper and coworkers likewise suggested that expression of genes coding for the leucine permeases, Tat1p and Bap3p, were upregulated in the SPT15-300 mutant, as was expression of the genes ARO10, ADH3, ADH5, and SFA1, involved in leucine degradation.

202
Evolutionary capture of viral and plasmid DNA by yeast nuclear chromosomes. Eukaryot Cell 2009; 8:1521-31. [PMID: 19666779 DOI: 10.1128/ec.00110-09] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Indexed: 10/20/2022]
Abstract
A 10-kb region of the nuclear genome of the yeast Vanderwaltozyma polyspora contains an unusual cluster of five pseudogenes homologous to five different genes from yeast killer viruses, killer plasmids, the 2-µm plasmid, and a Penicillium virus. By further database searches, we show that this phenomenon is not unique to V. polyspora but that about 40% of the sequenced genomes of Saccharomycotina species contain integrated copies of genes from DNA plasmids or RNA viruses. We propose the name NUPAVs (nuclear sequences of plasmid and viral origin) for these objects, by analogy to NUMTs (nuclear copies of mitochondrial DNA) and NUPTs (nuclear copies of plastid DNA, in plants) of organellar origin. Although most of the NUPAVs are pseudogenes, one intact and active gene that was formed in this way is the KHS1 chromosomal killer locus of Saccharomyces cerevisiae. We show that KHS1 is a NUPAV related to M2 killer virus double-stranded RNA. Many NUPAVs are located beside tRNA genes, and some contain sequences from a mixture of different extrachromosomal sources. We propose that NUPAVs are sequences that were captured by the nuclear genome during the repair of double-strand breaks that occurred during evolution and that some of their properties may be explained by repeated breakage at fragile chromosomal sites.

203
Hou L, Qian MP, Zhu YP, Deng MH. Advances on bioinformatic research in transcription factor binding sites. Yi Chuan (Hereditas) 2009; 31:365-73. [DOI: 10.3724/sp.j.1005.2009.00365] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]

204
Payen C, Fischer G, Marck C, Proux C, Sherman DJ, Coppée JY, Johnston M, Dujon B, Neuvéglise C. Unusual composition of a yeast chromosome arm is associated with its delayed replication. Genome Res 2009; 19:1710-21. [PMID: 19592681 DOI: 10.1101/gr.090605.108] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 01/17/2023]
Abstract
The 11.3-Mb genome of the yeast Lachancea (Saccharomyces) kluyveri displays an intriguing compositional heterogeneity: a region of approximately 1 Mb, covering almost the whole left arm of chromosome C (C-left), has an average GC content of 52.9%, which is significantly higher than the 40.4% global GC content of the rest of the genome. This region contains the MAT locus, which remains normal in composition. The excess of GC base pairs affects both coding and noncoding sequences, and thus is not due to selective pressure acting on protein sequences. It leads to a strong codon usage bias and alters the amino acid composition of the 457 proteins encoded on C-left that do not show obvious bias for functional categories, or the presence of paralogs or orthologs of essential genes of Saccharomyces cerevisiae. They share significant synteny conservation with other species of the Saccharomycetaceae, and phylogenetic analysis indicates that C-left originates from a Lachancea species. In contrast, there is a complete absence of transposable elements in C-left, whereas 18 elements per megabase are distributed across the rest of the genome. Comparative hybridization of synchronized cells using high-density genome arrays reveals that C-left is replicated later during S phase than the rest of the genome. Two possible primary causes of this major compositional heterogeneity are discussed: an ancient hybridization of two related species with very distinct GC composition, or an intrinsic mechanism, possibly associated with the loss of the silent cassettes from C-left that progressively increased the GC content and generated the delayed replication of this chromosomal arm.
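The GC-content figures quoted in this abstract (52.9% for C-left versus 40.4% for the rest of the genome) are simple base-composition ratios. The following is a minimal sketch of how such a value, and a sliding-window profile along a chromosome, could be computed. The function names `gc_content` and `windowed_gc` are hypothetical, and this is not the authors' pipeline.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence.

    Symbols other than A, C, G and T (for example N) are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


def windowed_gc(seq: str, window: int, step: int):
    """Yield (start, GC%) for every full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_content(seq[start:start + window])
```

Applied to a chromosome sequence with a window of a few kilobases, a profile like this would make a compositionally distinct arm stand out as a sustained shift in GC percentage.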
Affiliation(s)
- Célia Payen
  - Institut Pasteur, CNRS, URA, Université Pierre et Marie Curie, Paris, France

205
Polymorphisms in multiple genes contribute to the spontaneous mitochondrial genome instability of Saccharomyces cerevisiae S288C strains. Genetics 2009; 183:365-83. [PMID: 19581448 DOI: 10.1534/genetics.109.104497] [Citation(s) in RCA: 139] [Impact Index Per Article: 8.7] [Indexed: 12/16/2022] Open
Abstract
The mitochondrial genome (mtDNA) is required for normal cellular function; inherited and somatic mutations in mtDNA lead to a variety of diseases. Saccharomyces cerevisiae has served as a model to study mtDNA integrity, in part because it can survive without mtDNA. A measure of defective mtDNA in S. cerevisiae is the formation of petite colonies. The frequency at which spontaneous petite colonies arise varies by approximately 100-fold between laboratory and natural isolate strains. To determine the genetic basis of this difference, we applied quantitative trait locus (QTL) mapping to two strains at the opposite extremes of the phenotypic spectrum: the widely studied laboratory strain S288C and the vineyard isolate RM11-1a. Four main genetic determinants explained the phenotypic difference. Alleles of SAL1, CAT5, and MIP1 contributed to the high petite frequency of S288C and its derivatives by increasing the formation of petite colonies. By contrast, the S288C allele of MKT1 reduced the formation of petite colonies and compromised the growth of petite cells. The former three alleles were found in the EM93 strain, the founder that contributed approximately 88% of the S288C genome. Nearly all of the phenotypic difference between S288C and RM11-1a was reconstituted by introducing the common alleles of these four genes into the S288C background. In addition to the nuclear gene contribution, the source of the mtDNA influenced its stability. These results demonstrate that a few rare genetic variants with individually small effects can have a profound phenotypic effect in combination. Moreover, the polymorphisms identified in this study open new lines of investigation into mtDNA maintenance.

206
Computational analysis of the interaction between transcription factors and the predicted secreted proteome of the yeast Kluyveromyces lactis. BMC Bioinformatics 2009; 10:194. [PMID: 19555482 PMCID: PMC2711083 DOI: 10.1186/1471-2105-10-194] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 11/18/2008] [Accepted: 06/25/2009] [Indexed: 11/13/2022] Open
Abstract
Background: Protein secretion is a cell translocation process of major biological and technological significance. The secretion and downstream processing of proteins by recombinant cells is of great commercial interest. The yeast Kluyveromyces lactis is considered a promising host for heterologous protein production. Because yeasts naturally do not secrete as many proteins as filamentous fungi, they can produce secreted recombinant proteins with few contaminants in the medium. An ideal system to address the secretion of a desired protein could be exploited among the native proteins in certain physiological conditions. By applying algorithms to the completed K. lactis genome sequence, such a system could be selected. To this end, we predicted protein subcellular locations and correlated the resulting extracellular secretome with the transcription factors that modulate the cellular response to a particular environmental stimulus.
Results: To explore the potential Kluyveromyces lactis extracellular secretome, four computational prediction algorithms were applied to 5076 predicted K. lactis proteins from the genome database. SignalP v3 identified 418 proteins with N-terminal signal peptides. From these 418 proteins, the Phobius algorithm predicted that 176 proteins have no transmembrane domains, and the big-PI Predictor identified 150 proteins as having no glycosylphosphatidylinositol (GPI) modification sites. WoLF PSORT predicted that the K. lactis secretome consists of 109 putative proteins, excluding subcellular targeting. The transcription regulators of the putative extracellular proteins were investigated by searching for DNA binding sites in their putative promoters. The conditions to favor expression were obtained by searching Gene Ontology terms and using graph theory.
Conclusion: A public database of K. lactis secreted proteins and their transcription factors is presented. It consists of 109 ORFs and 23 transcription factors. A graph created from this database shows 134 nodes and 884 edges, suggesting a vast number of relationships to be validated experimentally. Most of the transcription factors are related to responses to stress such as drug, acid and heat resistance, as well as nitrogen limitation, and may be useful for inducing maximal expression of potential extracellular proteins.
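The Results describe a stepwise narrowing of the proteome (5076, then 418, 176, 150 and finally 109 proteins). The sketch below shows one way such a filter cascade could be expressed, assuming the four predictions are applied in sequence, which is one reading of the abstract. The `Prediction` record, its field names, and `filter_secretome` are hypothetical. The code does not call SignalP, Phobius, big-PI or WoLF PSORT; it only consumes boolean results that those tools might have produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-protein summary of the four tool outputs."""
    protein_id: str
    has_signal_peptide: bool       # e.g. from a signal-peptide predictor
    has_tm_domain: bool            # e.g. from a transmembrane predictor
    has_gpi_site: bool             # e.g. from a GPI-anchor predictor
    predicted_extracellular: bool  # e.g. from a localization predictor


# Each step keeps only the proteins that satisfy its predicate.
FILTERS = [
    ("signal peptide present", lambda p: p.has_signal_peptide),
    ("no transmembrane domain", lambda p: not p.has_tm_domain),
    ("no GPI modification site", lambda p: not p.has_gpi_site),
    ("extracellular localization", lambda p: p.predicted_extracellular),
]


def filter_secretome(predictions):
    """Apply the filters in order; return survivors and per-step counts."""
    remaining = list(predictions)
    counts = [("input", len(remaining))]
    for label, keep in FILTERS:
        remaining = [p for p in remaining if keep(p)]
        counts.append((label, len(remaining)))
    return remaining, counts
```

Recording the count after each step makes it easy to compare a rerun against the numbers reported in the abstract.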

207
Souciet JL, Dujon B, Gaillardin C, Johnston M, Baret PV, Cliften P, Sherman DJ, Weissenbach J, Westhof E, Wincker P, Jubin C, Poulain J, Barbe V, Ségurens B, Artiguenave F, Anthouard V, Vacherie B, Val ME, Fulton RS, Minx P, Wilson R, Durrens P, Jean G, Marck C, Martin T, Nikolski M, Rolland T, Seret ML, Casarégola S, Despons L, Fairhead C, Fischer G, Lafontaine I, Leh V, Lemaire M, de Montigny J, Neuvéglise C, Thierry A, Blanc-Lenfle I, Bleykasten C, Diffels J, Fritsch E, Frangeul L, Goëffon A, Jauniaux N, Kachouri-Lafond R, Payen C, Potier S, Pribylova L, Ozanne C, Richard GF, Sacerdot C, Straub ML, Talla E. Comparative genomics of protoploid Saccharomycetaceae. Genome Res 2009; 19:1696-709. [PMID: 19525356 DOI: 10.1101/gr.091546.109] [Citation(s) in RCA: 171] [Impact Index Per Article: 10.7] [Indexed: 11/24/2022]
Abstract
Our knowledge of yeast genomes remains largely dominated by the extensive studies on Saccharomyces cerevisiae and the consequences of its ancestral duplication, leaving the evolution of the entire class of hemiascomycetes only partly explored. We concentrate here on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that we call "protoploid" because they diverged from the S. cerevisiae lineage prior to its genome duplication. We determined the complete genome sequences of three of these species: Kluyveromyces (Lachancea) thermotolerans and Saccharomyces (Lachancea) kluyveri (two members of the newly described Lachancea clade), and Zygosaccharomyces rouxii. We included in our comparisons the previously available sequences of Kluyveromyces lactis and Ashbya (Eremothecium) gossypii. Despite their broad evolutionary range and significant individual variations in each lineage, the five protoploid Saccharomycetaceae share a core repertoire of approximately 3300 protein families and a high degree of conserved synteny. Synteny blocks were used to define gene orthology and to infer ancestors. Far from representing minimal genomes without redundancy, the five protoploid yeasts contain numerous copies of paralogous genes, either dispersed or in tandem arrays, that, altogether, constitute a third of each genome. Ancient, conserved paralogs as well as novel, lineage-specific paralogs were identified.
Affiliation(s)
- Université de Strasbourg, CNRS UMR, France.

208
Shen S, Tobery CE, Rose MD. Prm3p is a pheromone-induced peripheral nuclear envelope protein required for yeast nuclear fusion. Mol Biol Cell 2009; 20:2438-50. [PMID: 19297527 PMCID: PMC2675623 DOI: 10.1091/mbc.e08-10-0987] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 10/02/2008] [Revised: 02/05/2009] [Accepted: 03/09/2009] [Indexed: 11/11/2022] Open
Abstract
Nuclear membrane fusion is the last step in the mating pathway of the yeast Saccharomyces cerevisiae. We adapted a bioinformatics approach to identify putative pheromone-induced membrane proteins potentially required for nuclear membrane fusion. One protein, Prm3p, was found to be required for nuclear membrane fusion; disruption of PRM3 caused a strong bilateral defect, in which nuclear congression was completed but fusion did not occur. Prm3p was localized to the nuclear envelope in pheromone-responding cells, with significant colocalization with the spindle pole body in zygotes. A previous report, using a truncated protein, claimed that Prm3p is localized to the inner nuclear envelope. Based on biochemistry, immunoelectron microscopy and live cell microscopy, we find that functional Prm3p is a peripheral membrane protein exposed on the cytoplasmic face of the outer nuclear envelope. In support of this, mutations in a putative nuclear localization sequence had no effect on full-length protein function or localization. In contrast, point mutations and deletions in the highly conserved hydrophobic carboxy-terminal domain disrupted both protein function and localization. Genetic analysis, colocalization, and biochemical experiments indicate that Prm3p interacts directly with Kar5p, suggesting that nuclear membrane fusion is mediated by a protein complex.
Affiliation(s)
- Shu Shen
  - Department of Molecular Biology, Princeton University, Princeton, NJ 08544-1014
- Cynthia E. Tobery
  - Department of Molecular Biology, Princeton University, Princeton, NJ 08544-1014
- Mark D. Rose
  - Department of Molecular Biology, Princeton University, Princeton, NJ 08544-1014

209
Ogata T, Izumikawa M, Tadami H. Chimeric types of chromosome X in bottom-fermenting yeasts. J Appl Microbiol 2009; 107:1098-107. [DOI: 10.1111/j.1365-2672.2009.04289.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]

210
Zhu C, Byers KJ, McCord RP, Shi Z, Berger MF, Newburger DE, Saulrieta K, Smith Z, Shah MV, Radhakrishnan M, Philippakis AA, Hu Y, De Masi F, Pacek M, Rolfs A, Murthy T, LaBaer J, Bulyk ML. High-resolution DNA-binding specificity analysis of yeast transcription factors. Genome Res 2009; 19:556-66. [PMID: 19158363 PMCID: PMC2665775 DOI: 10.1101/gr.090233.108] [Citation(s) in RCA: 316] [Impact Index Per Article: 19.8] [Received: 12/11/2008] [Accepted: 01/14/2009] [Indexed: 12/22/2022]
Abstract
Transcription factors (TFs) regulate the expression of genes through sequence-specific interactions with DNA-binding sites. However, despite recent progress in identifying in vivo TF binding sites by microarray readout of chromatin immunoprecipitation (ChIP-chip), nearly half of all known yeast TFs are of unknown DNA-binding specificities, and many additional predicted TFs remain uncharacterized. To address these gaps in our knowledge of yeast TFs and their cis regulatory sequences, we have determined high-resolution binding profiles for 89 known and predicted yeast TFs, over more than 2.3 million gapped and ungapped 8-bp sequences ("k-mers"). We report 50 new or significantly different direct DNA-binding site motifs for yeast DNA-binding proteins and motifs for eight proteins for which only a consensus sequence was previously known; in total, this corresponds to over a 50% increase in the number of yeast DNA-binding proteins with experimentally determined DNA-binding specificities. Among other novel regulators, we discovered proteins that bind the PAC (Polymerase A and C) motif (GATGAG) and regulate ribosomal RNA (rRNA) transcription and processing, core cellular processes that are constituent to ribosome biogenesis. In contrast to earlier data types, these comprehensive k-mer binding data permit us to consider the regulatory potential of genomic sequence at the individual word level. These k-mer data allowed us to reannotate in vivo TF binding targets as direct or indirect and to examine TFs' potential effects on gene expression in approximately 1,700 environmental and cellular conditions. These approaches could be adapted to identify TFs and cis regulatory elements in higher eukaryotes.
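The abstract reports binding data at the level of individual 8-bp "k-mers". As a rough illustration of what a word-level sequence space looks like, the sketch below enumerates all ungapped k-mers and collapses each word with its reverse complement, on the assumption that a site on double-stranded DNA can be read from either strand. The helper names are hypothetical, gapped patterns are not covered, and this is not the authors' analysis code.

```python
from itertools import product

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of an uppercase DNA word."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical_kmers(k: int) -> set:
    """Return all ungapped k-mers with strand duplicates collapsed.

    Each word is represented by the lexicographically smaller of
    itself and its reverse complement.
    """
    words = ("".join(bases) for bases in product("ACGT", repeat=k))
    return {min(word, reverse_complement(word)) for word in words}
```

For k = 8 there are 4^8 = 65,536 words, 256 of which are their own reverse complement, so the collapsed set contains (65,536 + 256) / 2 = 32,896 entries.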
Affiliation(s)
- Cong Zhu
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Kelsey J.R.P. Byers
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Rachel Patton McCord
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, Massachusetts 02138, USA
- Zhenwei Shi
  - Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Michael F. Berger
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, Massachusetts 02138, USA
- Daniel E. Newburger
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Katrina Saulrieta
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Zachary Smith
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Mita V. Shah
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Biology, Wellesley College, Wellesley, Massachusetts 02481, USA
- Mathangi Radhakrishnan
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Anthony A. Philippakis
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, Massachusetts 02138, USA
  - Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, Massachusetts 02115, USA
- Yanhui Hu
  - Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Federico De Masi
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Marcin Pacek
  - Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Andreas Rolfs
  - Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Tal Murthy
  - Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Joshua LaBaer
  - Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Martha L. Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
  - Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, Massachusetts 02138, USA
  - Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA

211
Hesselberth JR, Chen X, Zhang Z, Sabo PJ, Sandstrom R, Reynolds AP, Thurman RE, Neph S, Kuehn MS, Noble WS, Fields S, Stamatoyannopoulos JA. Global mapping of protein-DNA interactions in vivo by digital genomic footprinting. Nat Methods 2009; 6:283-9. [PMID: 19305407 PMCID: PMC2668528 DOI: 10.1038/nmeth.1313] [Citation(s) in RCA: 457] [Impact Index Per Article: 28.6] [Received: 11/12/2008] [Accepted: 02/19/2009] [Indexed: 11/26/2022]
Abstract
The orchestrated binding of transcriptional activators and repressors to specific DNA sequences in the context of chromatin defines the regulatory program of eukaryotic genomes. We developed a digital approach to assay regulatory protein occupancy on genomic DNA in vivo by dense mapping of individual DNase I cleavages from intact nuclei using massively parallel DNA sequencing. Analysis of >23 million cleavages across the Saccharomyces cerevisiae genome revealed thousands of protected regulatory protein footprints, enabling de novo derivation of factor binding motifs and the identification of hundreds of new binding sites for major regulators. We observed striking correspondence between single-nucleotide resolution DNase I cleavage patterns and protein-DNA interactions determined by crystallography. The data also yielded a detailed view of larger chromatin features including positioned nucleosomes flanking factor binding regions. Digital genomic footprinting should be a powerful approach to delineate the cis-regulatory framework of any organism with an available genome sequence.
Affiliation(s)
- Xiaoyu Chen
  - Dept. of Computer Science, University of Washington, Seattle, WA 98195
- Zhihong Zhang
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195
- Peter J. Sabo
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
- Richard Sandstrom
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
- Alex P. Reynolds
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
- Robert E. Thurman
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
- Shane Neph
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
- Michael S. Kuehn
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
- William S. Noble
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
  - Dept. of Computer Science, University of Washington, Seattle, WA 98195
- Stanley Fields
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195
- John A. Stamatoyannopoulos
  - Dept. of Genome Sciences, University of Washington, Seattle, WA 98195
  - Dept. of Medicine, University of Washington, Seattle, WA 98195

212
Chen CL, Zhou H, Liao JY, Qu LH, Amar L. Genome-wide evolutionary analysis of the noncoding RNA genes and noncoding DNA of Paramecium tetraurelia. RNA 2009; 15:503-14. [PMID: 19218550 PMCID: PMC2661823 DOI: 10.1261/rna.1306009] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 05/09/2023]
Abstract
The compact genome of the unicellular eukaryote Paramecium tetraurelia contains noncoding DNA (ncDNA) distributed into >39,000 intergenic sequences and >90,000 introns of 390 base pairs (bp) and 25 bp on average, respectively. Here we analyzed the molecular features of the ncRNA genes, introns, and intergenic sequences of this genome. We mainly used computational programs and comparative genomics possible because the P. tetraurelia genome had formed throughout whole-genome duplications (WGDs). We characterized 417 5S rRNA, snRNA, snoRNA, SRP RNA, and tRNA putative genes, 415 of which map within intergenic sequences, and two, within introns. The evolution of these ncRNA genes appears to have mainly involved purifying selection and gene deletion. We then compared the introns that interrupt the protein-coding gene duplicates arisen from the recent WGD and identified a population of a few thousands of introns having evolved under most stringent constraints (>95% of identity). We also showed that low nucleotide substitution levels characterize the 50 and 80-115 base pairs flanking, respectively, the stop and start codons of the protein-coding genes. Lower substitution levels mark the base pairs flanking the highly transcribed genes, or the start codons of the genes of the sets with a high number of WGD-related sequences. Finally, adjacent to protein-coding genes, we characterized 32 DNA motifs able to encode stable and evolutionary conserved RNA secondary structures and defining putative expression controlling elements. Fourteen DNA motifs with similar properties map distant from protein-coding genes and may encode regulatory ncRNAs.
Affiliation(s)
- Chun-Long Chen
  - Institut de Biologie Animale Intégrative et Cellulaire, Université Paris Sud, Orsay, France

213
Candida glabrata PHO4 is necessary and sufficient for Pho2-independent transcription of phosphate starvation genes. Genetics 2009; 182:471-9. [PMID: 19332882 DOI: 10.1534/genetics.109.101063] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 12/23/2022] Open
Abstract
Comparative genomic analyses of Candida glabrata and Saccharomyces cerevisiae suggest many signal transduction pathways are highly conserved. Focusing on the phosphate signal transduction (PHO) pathway of C. glabrata, we demonstrate that components of the pathway are conserved and confirm the role of CgPHO81, CgPHO80, CgPHO4, and CgMSN5 in the PHO pathway through deletion analysis. Unlike S. cerevisiae, C. glabrata shows little dependence on the transcription factor, Pho2, for induction of phosphate-regulated genes during phosphate limitation. We show that the CgPho4 protein is necessary and sufficient for Pho2-independent gene expression; CgPho4 is capable of driving expression of PHO promoters in S. cerevisiae in the absence of ScPHO2. On the basis of the sequences of PHO4 in the hemiascomycetes and complementation analysis, we suggest that Pho2 dependence is a trait only observed in species closely related to S. cerevisiae. Our data are consistent with trans-regulatory changes in the PHO pathway via the transcription factor Pho4 as opposed to cis-regulatory changes (the promoter).

214
Schacherer J, Shapiro JA, Ruderfer DM, Kruglyak L. Comprehensive polymorphism survey elucidates population structure of Saccharomyces cerevisiae. Nature 2009; 458:342-5. [PMID: 19212320 PMCID: PMC2782482 DOI: 10.1038/nature07670] [Citation(s) in RCA: 349] [Impact Index Per Article: 21.8] [Received: 06/18/2008] [Accepted: 11/25/2008] [Indexed: 01/07/2023]
Abstract
Comprehensive identification of polymorphisms among individuals within a species is essential both for studying the genetic basis of phenotypic differences and for elucidating the evolutionary history of the species. Large-scale polymorphism surveys have recently been reported for human, mouse and Arabidopsis thaliana. Here we report a nucleotide-level survey of genomic variation in a diverse collection of 63 Saccharomyces cerevisiae strains sampled from different ecological niches (beer, bread, vineyards, immunocompromised individuals, various fermentations and nature) and from locations on different continents. We hybridized genomic DNA from each strain to whole-genome tiling microarrays and detected 1.89 million single nucleotide polymorphisms, which were grouped into 101,343 distinct segregating sites. We also identified 3,985 deletion events of length >200 base pairs among the surveyed strains. We analysed the genome-wide patterns of nucleotide polymorphism and deletion variants, and measured the extent of linkage disequilibrium in S. cerevisiae. These results and the polymorphism resource we have generated lay the foundation for genome-wide association studies in yeast. We also examined the population structure of S. cerevisiae, providing support for multiple domestication events as well as insight into the origins of pathogenic strains.
Affiliation(s)
- Joseph Schacherer
  - Lewis-Sigler Institute for Integrative Genomics, Department of Ecology and Evolutionary Biology and Howard Hughes Medical Institute, Princeton University, Princeton, New Jersey 08544, USA

215
White GE, Erickson HP. The coiled coils of cohesin are conserved in animals, but not in yeast. PLoS One 2009; 4:e4674. [PMID: 19262687 PMCID: PMC2650401 DOI: 10.1371/journal.pone.0004674] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/12/2008] [Accepted: 01/27/2009] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND The SMC proteins are involved in DNA repair, chromosome condensation, and sister chromatid cohesion throughout Eukaryota. Long, anti-parallel coiled coils are a prominent feature of SMC proteins, and are thought to serve as spacer rods to provide an elongated structure and to separate domains. We reported recently that the coiled coils of mammalian condensin (SMC2/4) showed moderate sequence divergence (approximately 10-15%) consistent with their functioning as spacer rods. The coiled coils of mammalian cohesins (SMC1/3), however, were very highly constrained, with amino acid sequence divergence typically <0.5%. These coiled coils are among the most highly conserved mammalian proteins, suggesting that they make extensive contacts over their entire surface. METHODOLOGY/PRINCIPAL FINDINGS Here, we broaden our initial analysis of condensin and cohesin to include additional vertebrate and invertebrate organisms and multiple species of yeast. We found that the coiled coils of SMC1/3 are highly constrained in Drosophila and other insects, and more generally across all animal species. However, in yeast they are no more constrained than the coils of SMC2/4 and Ndc80/Nuf2p, suggesting that they are serving primarily as spacer rods. CONCLUSIONS/SIGNIFICANCE SMC1/3 functions for sister chromatid cohesion in all species. Since its coiled coils apparently serve only as spacer rods in yeast, it is likely that this is sufficient for sister chromatid cohesion in all species. This suggests an additional function in animals that constrains the sequence of the coiled coils. Several recent studies have demonstrated that cohesin has a role in gene expression in post-mitotic neurons of Drosophila, and other animal cells. Some variants of human Cornelia de Lange Syndrome involve mutations in human SMC1/3. We suggest that the role of cohesin in gene expression may involve intimate contact of the coiled coils of SMC1/3, and impose the constraint on sequence divergence.
Affiliation(s)
- Glenn E. White
- Department of Biological and Environmental Sciences, Longwood University, Farmville, Virginia, United States of America
- Harold P. Erickson
- Department of Cell Biology, Duke University Medical Center, Durham, North Carolina, United States of America
|
216
|
Kristiansson E, Thorsen M, Tamás MJ, Nerman O. Evolutionary forces act on promoter length: identification of enriched cis-regulatory elements. Mol Biol Evol 2009; 26:1299-307. [PMID: 19258451 DOI: 10.1093/molbev/msp040] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022] Open
Abstract
Transcription factors govern gene expression by binding to short DNA sequences called cis-regulatory elements. These sequences are typically located in promoters, which are regions of variable length upstream of the open reading frames of genes. Here, we report that promoter length and gene function are related in yeast, fungi, and plants. In particular, the promoters for stress-responsive genes are in general longer than those of other genes. Essential genes have, on the other hand, relatively short promoters. We utilize these findings in a novel method for identifying relevant cis-regulatory elements in a set of coexpressed genes. The method is shown to generate more accurate results and fewer false positives compared with other common procedures. Our results suggest that genes with complex transcriptional regulation tend to have longer promoters than genes responding to few signals. This phenomenon is present in all investigated species, indicating that evolution adjusts promoter length according to gene function. Identification of cis-regulatory elements in Saccharomyces cerevisiae can be done with the web service located at http://enricher.zool.gu.se.
|
217
|
Chimeric genomes of natural hybrids of Saccharomyces cerevisiae and Saccharomyces kudriavzevii. Appl Environ Microbiol 2009; 75:2534-44. [PMID: 19251887 DOI: 10.1128/aem.02282-08] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.6] [Indexed: 11/20/2022] Open
Abstract
Recently, a new type of hybrid resulting from the hybridization between Saccharomyces cerevisiae and Saccharomyces kudriavzevii was described. These strains exhibit physiological properties of potential biotechnological interest. A preliminary characterization of these hybrids showed a trend to reduce the S. kudriavzevii fraction of the hybrid genome. We characterized the genomic constitution of several wine S. cerevisiae x S. kudriavzevii strains by using a combined approach based on the restriction fragment length polymorphism analysis of gene regions, comparative genome hybridizations with S. cerevisiae DNA arrays, ploidy analysis, and gene dose determination by quantitative real-time PCR. The high similarity in the genome structures of the S. cerevisiae x S. kudriavzevii hybrids under study indicates that they originated from a single hybridization event. After hybridization, the hybrid genome underwent extensive chromosomal rearrangements, including chromosome losses and the generation of chimeric chromosomes by the nonreciprocal recombination between homeologous chromosomes. These nonreciprocal recombinations between homeologous chromosomes occurred in highly conserved regions, such as Ty long terminal repeats (LTRs), rRNA regions, and conserved protein-coding genes. This study supports the hypothesis that chimeric chromosomes may have been generated by a mechanism similar to the recombination-mediated chromosome loss acting during meiosis in Saccharomyces hybrids. As a result of the selective processes acting during fermentation, hybrid genomes maintained the S. cerevisiae genome but reduced the S. kudriavzevii fraction.
|
218
|
Miklós I, Novák Á, Satija R, Lyngsø R, Hein J. Stochastic models of sequence evolution including insertion-deletion events. Stat Methods Med Res 2009; 18:453-85. [DOI: 10.1177/0962280208099500] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
Comparison of sequences that have descended from a common ancestor based on an explicit stochastic model of substitutions, insertions and deletions has risen to prominence in the last decade. Making statements about the positions of insertions-deletions (abbr. indels) is central in sequence and genome analysis and is called alignment. This statistical approach is harder, conceptually and computationally, than competing approaches based on choosing an alignment according to some optimality criteria. But it has major practical advantages in terms of testing evolutionary hypotheses and parameter estimation. Basic dynamic approaches can allow the analysis of up to 4-5 sequences. MCMC techniques can bring this to about 10-15 sequences. Beyond this, different or heuristic approaches must be used. Besides the computational challenges, increasing realism in the underlying models is presently being addressed. A recent development that has been especially fruitful is combining statistical alignment with the problem of sequence annotation, making statements about the function of each nucleotide/amino acid. So far gene finding, protein secondary structure prediction and regulatory signal detection have been tackled within this framework. Much progress can be reported, but clearly major challenges remain if this approach is to be central in the analyses of large incoming sequence data sets.
Affiliation(s)
- István Miklós
- Bioinformatics Group, Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences, 1053 Budapest, Reáltanoda u. 13-15, Hungary; Bioinformatics Group, Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG Oxford, UK; Data Mining and Search Research Group, Computer and Automation Institute, Hungarian Academy of Sciences, 1111 Budapest, Lágymányosi u. 11., Hungary
- Ádám Novák
- Bioinformatics Group, Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG Oxford, UK
- Rahul Satija
- Bioinformatics Group, Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG Oxford, UK
- Rune Lyngsø
- Bioinformatics Group, Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG Oxford, UK
- Jotun Hein
- Bioinformatics Group, Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG Oxford, UK
|
219
|
Rispail N, Soanes DM, Ant C, Czajkowski R, Grünler A, Huguet R, Perez-Nadales E, Poli A, Sartorel E, Valiante V, Yang M, Beffa R, Brakhage AA, Gow NAR, Kahmann R, Lebrun MH, Lenasi H, Perez-Martin J, Talbot NJ, Wendland J, Di Pietro A. Comparative genomics of MAP kinase and calcium-calcineurin signalling components in plant and human pathogenic fungi. Fungal Genet Biol 2009; 46:287-98. [PMID: 19570501 DOI: 10.1016/j.fgb.2009.01.002] [Citation(s) in RCA: 242] [Impact Index Per Article: 15.1] [Received: 11/25/2008] [Revised: 01/16/2009] [Accepted: 01/17/2009] [Indexed: 01/22/2023]
Abstract
Mitogen-activated protein kinase (MAPK) cascades and the calcium-calcineurin pathway control fundamental aspects of fungal growth, development and reproduction. Core elements of these signalling pathways are required for virulence in a wide array of fungal pathogens of plants and mammals. In this review, we have used the available genome databases to explore the structural conservation of three MAPK cascades and the calcium-calcineurin pathway in ten different fungal species, including model organisms, plant pathogens and human pathogens. While most known pathway components from the model yeast Saccharomyces cerevisiae appear to be widely conserved among taxonomically and biologically diverse fungi, some of them were found to be restricted to the Saccharomycotina. The presence of multiple paralogues in certain species such as the zygomycete Rhizopus oryzae and the incorporation of new functional domains that are lacking in S. cerevisiae signalling proteins, most likely reflect functional diversification or adaptation as filamentous fungi have evolved to occupy distinct ecological niches.
Affiliation(s)
- Nicolas Rispail
- Departamento de Genética, Universidad de Córdoba, Córdoba, Spain
|
220
|
Knijnenburg TA, Daran JMG, van den Broek MA, Daran-Lapujade PA, de Winde JH, Pronk JT, Reinders MJT, Wessels LFA. Combinatorial effects of environmental parameters on transcriptional regulation in Saccharomyces cerevisiae: a quantitative analysis of a compendium of chemostat-based transcriptome data. BMC Genomics 2009; 10:53. [PMID: 19173729 PMCID: PMC2640415 DOI: 10.1186/1471-2164-10-53] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Received: 06/20/2008] [Accepted: 01/27/2009] [Indexed: 11/18/2022] Open
Abstract
Background Microorganisms adapt their transcriptome by integrating multiple chemical and physical signals from their environment. Shake-flask cultivation does not allow precise manipulation of individual culture parameters and therefore precludes a quantitative analysis of the (combinatorial) influence of these parameters on transcriptional regulation. Steady-state chemostat cultures, which do enable accurate control, measurement and manipulation of individual cultivation parameters (e.g. specific growth rate, temperature, identity of the growth-limiting nutrient) appear to provide a promising experimental platform for such a combinatorial analysis. Results A microarray compendium of 170 steady-state chemostat cultures of the yeast Saccharomyces cerevisiae is presented and analyzed. The 170 microarrays encompass 55 unique conditions, which can be characterized by the combined settings of 10 different cultivation parameters. By applying a regression model to assess the impact of (combinations of) cultivation parameters on the transcriptome, most S. cerevisiae genes were shown to be influenced by multiple cultivation parameters, and in many cases by combinatorial effects of cultivation parameters. The inclusion of these combinatorial effects in the regression model led to higher explained variance of the gene expression patterns and resulted in higher function enrichment in subsequent analysis. We further demonstrate the usefulness of the compendium and regression analysis for interpretation of shake-flask-based transcriptome studies and for guiding functional analysis of (uncharacterized) genes and pathways. Conclusion Modeling the combinatorial effects of environmental parameters on the transcriptome is crucial for understanding transcriptional regulation. Chemostat cultivation offers a powerful tool for such an approach.
Affiliation(s)
- Theo A Knijnenburg
- Information and Communication Theory Group, Department of Mediamatics, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, the Netherlands.
|
221
|
Elaboration, diversification and regulation of the Sir1 family of silencing proteins in Saccharomyces. Genetics 2009; 181:1477-91. [PMID: 19171939 DOI: 10.1534/genetics.108.099663] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022] Open
Abstract
Heterochromatin renders domains of chromosomes transcriptionally silent and, due to clonal variation in its formation, can generate heritably distinct populations of genetically identical cells. Saccharomyces cerevisiae's Sir1 functions primarily in the establishment, but not the maintenance, of heterochromatic silencing at the HMR and HML loci. In several Saccharomyces species, we discovered multiple paralogs of Sir1, called Kos1-Kos4 (Kin of Sir1). The Kos and Sir1 proteins contributed partially overlapping functions to silencing of both cryptic mating loci in S. bayanus. Mutants of these paralogs reduced silencing at HML more than at HMR. Most genes of the SIR1 family were located near telomeres, and at least one paralog was regulated by telomere position effect. In S. cerevisiae, Sir1 is recruited to the silencers at HML and HMR via its ORC interacting region (OIR), which binds the bromo adjacent homology (BAH) domain of Orc1. Zygosaccharomyces rouxii, which diverged from Saccharomyces after the appearance of the silent mating cassettes, but before the whole-genome duplication, contained an ortholog of Kos3 that was apparently the archetypal member of the family, with only one OIR. In contrast, a duplication of this domain was present in all orthologs of Sir1, Kos1, Kos2, and Kos4. We propose that the functional specialization of Sir3, itself a paralog of Orc1, as a silencing protein was facilitated by the tandem duplication of the OIR domain in the Sir1 family, allowing distinct Sir1-Sir3 and Sir1-Orc1 interactions through OIR-BAH domain interactions.
|
222
|
Nakajima Y, Tyers RG, Wong CCL, Yates JR, Drubin DG, Barnes G. Nbl1p: a Borealin/Dasra/CSC-1-like protein essential for Aurora/Ipl1 complex function and integrity in Saccharomyces cerevisiae. Mol Biol Cell 2009; 20:1772-84. [PMID: 19158380 DOI: 10.1091/mbc.e08-10-1011] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Indexed: 11/11/2022] Open
Abstract
The Aurora kinase complex, also called the chromosomal passenger complex (CPC), is essential for faithful chromosome segregation and completion of cell division. In Fungi and Animalia, this complex consists of the kinase Aurora B/AIR-2/Ipl1p, INCENP/ICP-1/Sli15p, and Survivin/BIR-1/Bir1p. A fourth subunit, Borealin/Dasra/CSC-1, is required for CPC targeting to centromeres and central spindles and has only been found in Animalia. Here we identified a new core component of the CPC in budding yeast, Nbl1p. NBL1 is essential for viability and nbl1 mutations cause chromosome missegregation and lagging chromosomes. Nbl1p colocalizes and copurifies with the CPC, and it is essential for CPC localization, stability, integrity, and function. Nbl1p is related to the N-terminus of Borealin/Dasra/CSC-1 and is similarly involved in connecting the other CPC subunits. Distant homology searching identified nearly 200, mostly unannotated, Borealin/Dasra/CSC-1-related proteins from nearly 150 species within Fungi and Animalia. Analysis of the sequence of these proteins, combined with comparative protein structure modeling of Bir1p-Nbl1p-Sli15p using the crystal structure of the human Survivin-Borealin-INCENP complex, revealed a striking structural conservation across a broad range of species. Our biological and computational analyses therefore establish that the fundamental design of the CPC is conserved from Fungi to Animalia.
Affiliation(s)
- Yuko Nakajima
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
|
223
|
Gertz J, Siggia ED, Cohen BA. Analysis of combinatorial cis-regulation in synthetic and genomic promoters. Nature 2009; 457:215-8. [PMID: 19029883 PMCID: PMC2677908 DOI: 10.1038/nature07521] [Citation(s) in RCA: 239] [Impact Index Per Article: 14.9] [Received: 07/16/2008] [Accepted: 10/01/2008] [Indexed: 11/09/2022]
Abstract
Transcription factor binding sites are being discovered at a rapid pace. It is now necessary to turn attention towards understanding how these sites work in combination to influence gene expression. Quantitative models that accurately predict gene expression from promoter sequence will be a crucial part of solving this problem. Here we present such a model, based on the analysis of synthetic promoter libraries in yeast (Saccharomyces cerevisiae). Thermodynamic models based only on the equilibrium binding of transcription factors to DNA and to each other captured a large fraction of the variation in expression in every library. Thermodynamic analysis of these libraries uncovered several phenomena in our system, including cooperativity and the effects of weak binding sites. When applied to the S. cerevisiae genome, a model of repression by Mig1 (which was trained on synthetic promoters) predicts a number of Mig1-regulated genes that lack significant Mig1-binding sites in their promoters. The success of the thermodynamic approach suggests that the information encoded by combinations of cis-regulatory sites is interpreted primarily through simple protein-DNA and protein-protein interactions, with complicated biochemical reactions-such as nucleosome modifications-being downstream events. Quantitative analyses of synthetic promoter libraries will be an important tool in unravelling the rules underlying combinatorial cis-regulation.
Affiliation(s)
- Jason Gertz
- Center for Genome Sciences, Department of Genetics, Washington University in St. Louis School of Medicine, 4444 Forest Park Ave., St. Louis, MO 63108
- Eric D. Siggia
- Center for Studies in Physics and Biology, The Rockefeller University, New York, NY 10021
- Barak A. Cohen
- Center for Genome Sciences, Department of Genetics, Washington University in St. Louis School of Medicine, 4444 Forest Park Ave., St. Louis, MO 63108
|
224
|
Kavanaugh LA, Dietrich FS. Non-coding RNA prediction and verification in Saccharomyces cerevisiae. PLoS Genet 2009; 5:e1000321. [PMID: 19119416 PMCID: PMC2603021 DOI: 10.1371/journal.pgen.1000321] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 06/25/2008] [Accepted: 12/01/2008] [Indexed: 11/18/2022] Open
Abstract
Non-coding RNA (ncRNA) play an important and varied role in cellular function. A significant amount of research has been devoted to computational prediction of these genes from genomic sequence, but the ability to do so has remained elusive due to a lack of apparent genomic features. In this work, thermodynamic stability of ncRNA structural elements, as summarized in a Z-score, is used to predict ncRNA in the yeast Saccharomyces cerevisiae. This analysis was coupled with comparative genomics to search for ncRNA genes on chromosome six of S. cerevisiae and S. bayanus. Sets of positive and negative control genes were evaluated to determine the efficacy of thermodynamic stability for discriminating ncRNA from background sequence. The effect of window sizes and step sizes on the sensitivity of ncRNA identification was also explored. Non-coding RNA gene candidates, common to both S. cerevisiae and S. bayanus, were verified using northern blot analysis, rapid amplification of cDNA ends (RACE), and publicly available cDNA library data. Four ncRNA transcripts are well supported by experimental data (RUF10, RUF11, RUF12, RUF13), while one additional putative ncRNA transcript is well supported but the data are not entirely conclusive. Six candidates appear to be structural elements in 5′ or 3′ untranslated regions of annotated protein-coding genes. This work shows that thermodynamic stability, coupled with comparative genomics, can be used to predict ncRNA with significant structural elements. Recent advances in DNA sequence technology have made it possible to sequence entire genomes. Once a genome is sequenced, it becomes necessary to identify the set of genes and other functional elements within the genome. This is particularly challenging as much of the genomic sequence does not appear to perform any function and is loosely referred to as “junk.” Identifying functional elements among the “junk” is difficult. 
Experimental methods have been developed for this purpose but they are time-consuming, expensive, and often provide an incomplete picture. Thus, it is important to develop the ability to identify these functional elements using computational methods. Protein-coding genes are relatively easy to identify computationally, but other categories of functional elements present a significantly greater challenge. In this work, we used a computational approach to identify genes that do not encode for a protein but rather function as an RNA molecule. We then used experimental methods to verify our predictions and thereby validate the computational method.
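The abstract above summarizes thermodynamic stability as a Z-score and mentions scanning with different window and step sizes. A minimal sketch of that general idea follows, under several assumptions: the folding energy comes from a caller-supplied function (`fold_energy` is a hypothetical hook, and `toy_energy` is a crude stand-in, not a thermodynamic model), and the background is built by simple mononucleotide shuffling, which may differ from the authors' procedure.

```python
import random
import statistics
from typing import Callable, Iterator, List, Tuple


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for every full-length window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


def stability_z_score(seq: str,
                      fold_energy: Callable[[str], float],
                      n_shuffles: int = 100,
                      seed: int = 0) -> float:
    """Z-score of the native energy against energies of shuffled copies.

    Negative values mean the native sequence scores as more stable
    (lower energy) than shuffled sequences of the same composition.
    """
    rng = random.Random(seed)
    native = fold_energy(seq)
    background: List[float] = []
    for _ in range(n_shuffles):
        chars = list(seq)
        rng.shuffle(chars)              # mononucleotide shuffle (an assumption)
        background.append(fold_energy("".join(chars)))
    mean = statistics.fmean(background)
    sd = statistics.stdev(background)
    if sd == 0:
        return 0.0                      # no spread in the background
    return (native - mean) / sd


def toy_energy(seq: str) -> float:
    """Hypothetical stand-in for a folding-energy routine.

    Gives -1 for each position whose base complements its mirror position.
    """
    pairs = {("G", "C"), ("C", "G"), ("A", "T"), ("T", "A")}
    half = len(seq) // 2
    return -float(sum((seq[i], seq[-1 - i]) in pairs for i in range(half)))


if __name__ == "__main__":
    genome = "ATATATAT" + "GGGGGGGGCCCCCCCC" + "ATTTAAAT"
    for start, sub in sliding_windows(genome, window=16, step=8):
        print(start, round(stability_z_score(sub, toy_energy), 2))
```

In practice `toy_energy` would be replaced by a real minimum-free-energy calculation, and windows with strongly negative Z-scores would be treated only as candidates for experimental follow-up.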
Affiliation(s)
- Laura A. Kavanaugh
- Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America
- Fred S. Dietrich
- Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America
|
225
|
Nieduszynski CA, Donaldson AD. Detection of replication origins using comparative genomics and recombinational ARS assay. Methods Mol Biol 2009; 521:295-313. [PMID: 19563113 DOI: 10.1007/978-1-60327-815-7_16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/10/2023]
Abstract
Effective experimental techniques are available to identify replication origin regions in eukaryotic cells. Genome-wide identification of the precise sequence elements that direct origin activity is however still not straightforward, even in the yeast Saccharomyces cerevisiae which has the best characterised eukaryotic replication origins. The availability of genome sequences for a series of closely related (sensu stricto) budding yeasts has allowed us to take a 'comparative genomics' approach to this problem. Since they represent functional protein-binding sites, origin sequences are conserved better than the surrounding intergenic sequence within the genomes of closely related yeasts. We describe here how phylogenetic comparison data can be used to identify candidate replication origin sequences in the S. cerevisiae genome, and how large numbers of such candidate sites can simultaneously be assayed for ability to initiate replication. Similar approaches could potentially be used to identify protein-binding sequence elements having other functions, as well as replication origin sites in other species.
|
226
|
Sherman DJ, Martin T, Nikolski M, Cayla C, Souciet JL, Durrens P. Génolevures: protein families and synteny among complete hemiascomycetous yeast proteomes and genomes. Nucleic Acids Res 2009; 37:D550-4. [PMID: 19015150 PMCID: PMC2686504 DOI: 10.1093/nar/gkn859] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.9] [Received: 09/15/2008] [Revised: 10/15/2008] [Accepted: 10/16/2008] [Indexed: 01/21/2023] Open
Abstract
The Génolevures online database (http://cbi.labri.fr/Genolevures/ and http://genolevures.org/) provides exploratory tools and curated data sets relative to nine complete and seven partial genome sequences determined and manually annotated by the Génolevures Consortium, to facilitate comparative genomic studies of Hemiascomycete yeasts. The 2008 update to the Génolevures database provides four new genomes in complete (subtelomere to subtelomere) chromosome sequences, 50,000 protein-coding and tRNA genes, and in silico analyses for each gene element. A key element is a novel classification of conserved multi-species protein families and their use in detecting synteny, gene fusions and other aspects of genome remodeling in evolution. Our purpose is to release high-quality curated data from complete genomes, with a focus on the relations between genes, genomes and proteins.
Affiliation(s)
- David J Sherman
- LaBRI, Laboratoire Bordelais de Recherche en Informatique, UMR CNRS 5800, 33405 Talence cedex, France.
|
227
|
Abstract
Chemostat cultivation of micro-organisms offers unique opportunities for experimental manipulation of individual environmental parameters at a fixed, controllable specific growth rate. Chemostat cultivation was originally developed as a tool to study quantitative aspects of microbial growth and metabolism. Renewed interest in this cultivation method is stimulated by the availability of high-information-density techniques for systemic analysis of microbial cultures, which require high reproducibility and careful experimental design. Genome-wide analysis of transcript levels with DNA micro-arrays is currently the most commonly applied of these high-information-density analysis tools for microbial gene expression. Based on published studies on the yeast Saccharomyces cerevisiae, a critical overview is presented of the possibilities and pitfalls associated with the combination of chemostat cultivation and transcriptome analysis with DNA micro-arrays. After a brief introduction to chemostat cultivation and micro-array analysis, key aspects of experimental design of chemostat-based micro-array experiments are discussed. The main focus of this review is on key biological concepts that can be accessed by chemostat-based micro-array analysis. These include effects of specific growth rate on transcriptional regulation, context-dependency of transcriptional responses, correlations between transcript profiles and contribution of the corresponding proteins to cellular function and fitness, and the analysis and application of evolutionary adaptation during prolonged chemostat cultivation. It is concluded that, notwithstanding the incompatibility of chemostat cultivation with high-throughput analysis, integration of chemostat cultivation with micro-array analysis and other high-information-density analytical approaches (e.g. proteomics and metabolomics techniques) offers unique advantages in terms of reproducibility and experimental design in comparison with standard batch cultivation systems. Therefore, chemostat cultivation and derived methods for controlled cultivation of micro-organisms are anticipated to become increasingly important in microbial physiology and systems biology.
|
228
|
Lee A, Hansen KD, Bullard J, Dudoit S, Sherlock G. Novel low abundance and transient RNAs in yeast revealed by tiling microarrays and ultra high-throughput sequencing are not conserved across closely related yeast species. PLoS Genet 2008; 4:e1000299. [PMID: 19096707 PMCID: PMC2601015 DOI: 10.1371/journal.pgen.1000299] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Received: 06/17/2008] [Accepted: 11/06/2008] [Indexed: 11/18/2022] Open
Abstract
A complete description of the transcriptome of an organism is crucial for a comprehensive understanding of how it functions and how its transcriptional networks are controlled, and may provide insights into the organism's evolution. Despite the status of Saccharomyces cerevisiae as arguably the most well-studied model eukaryote, we still do not have a full catalog or understanding of all its genes. In order to interrogate the transcriptome of S. cerevisiae for low abundance or rapidly turned over transcripts, we deleted elements of the RNA degradation machinery with the goal of preferentially increasing the relative abundance of such transcripts. We then used high-resolution tiling microarrays and ultra high–throughput sequencing (UHTS) to identify, map, and validate unannotated transcripts that are more abundant in the RNA degradation mutants relative to wild-type cells. We identified 365 currently unannotated transcripts, the majority presumably representing low abundance or short-lived RNAs, of which 185 are previously unknown and unique to this study. It is likely that many of these are cryptic unstable transcripts (CUTs), which are rapidly degraded and whose function(s) within the cell are still unclear, while others may be novel functional transcripts. Of the 185 transcripts we identified as novel to our study, greater than 80 percent come from regions of the genome that have lower conservation scores amongst closely related yeast species than 85 percent of the verified ORFs in S. cerevisiae. Such regions of the genome have typically been less well-studied, and by definition transcripts from these regions will distinguish S. cerevisiae from these closely related species. The budding yeast Saccharomyces cerevisiae, because of the relative ease of its genetic manipulation and its ease of handling in the laboratory, has long served as a model on which studies in higher organisms have been based. 
To more fully understand how eukaryotic cells express their genomes, we sought to identify RNA species that are transcribed at very low levels or that are rapidly degraded. We created mutants deficient in the ability to degrade RNA, with the expectation that this would increase the relative abundance of such RNAs, and then used high-resolution microarrays and sequencing technologies to locate and identify from where these RNAs are transcribed. Using this approach, we have identified 365 transcripts that do not appear in the most current list of annotated S. cerevisiae RNA transcripts; of these, 185 are unique to our study. Many of these novel transcripts derive from regions of the genome that are poorly conserved between S. cerevisiae and other closely related yeast species, suggesting that these RNAs may play an important role in the divergent microevolution of S. cerevisiae.
Affiliation(s)
- Albert Lee
- Department of Genetics, Stanford University, Stanford, California, United States of America
- Kasper Daniel Hansen
- Division of Biostatistics, School of Public Health, University of California Berkeley, Berkeley, California, United States of America
- James Bullard
- Division of Biostatistics, School of Public Health, University of California Berkeley, Berkeley, California, United States of America
- Sandrine Dudoit
- Division of Biostatistics, School of Public Health, University of California Berkeley, Berkeley, California, United States of America
- Gavin Sherlock
- Department of Genetics, Stanford University, Stanford, California, United States of America
|
229
|
Jung K, Park J, Choi J, Park B, Kim S, Ahn K, Choi J, Choi D, Kang S, Lee YH. SNUGB: a versatile genome browser supporting comparative and functional fungal genomics. BMC Genomics 2008; 9:586. [PMID: 19055845 PMCID: PMC2649115 DOI: 10.1186/1471-2164-9-586] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 05/15/2008] [Accepted: 12/04/2008] [Indexed: 12/24/2022] Open
Abstract
Background Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP) was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results The Seoul National University Genome Browser (SNUGB) integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets) and 34 plant and animal (38 datasets) species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site .
Affiliation(s)
- Kyongyong Jung
- Fungal Bioinformatics Laboratory, Seoul National University, Seoul, Korea.
|
230
|
Yaragatti M, Sandler T, Ungar L. A predictive model for identifying mini-regulatory modules in the mouse genome. Bioinformatics 2008; 25:353-7. [DOI: 10.1093/bioinformatics/btn622] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 02/06/2023] Open
|
231
|
Beck H, Dobritzsch D, Piškur J. Saccharomyces kluyveri as a model organism to study pyrimidine degradation. FEMS Yeast Res 2008; 8:1209-13. [DOI: 10.1111/j.1567-1364.2008.00442.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022] Open
|
232
|
Osada N, Innan H. Duplication and gene conversion in the Drosophila melanogaster genome. PLoS Genet 2008; 4:e1000305. [PMID: 19079581 PMCID: PMC2588116 DOI: 10.1371/journal.pgen.1000305] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Received: 08/12/2008] [Accepted: 11/12/2008] [Indexed: 11/18/2022] Open
Abstract
Using the genomic sequences of the Drosophila melanogaster subgroup, the pattern of gene duplications was investigated with special attention to interlocus gene conversion. Our fine-scale analysis with careful visual inspections enabled accurate identification of a number of duplicated blocks (genomic regions). The orthologous parts of those duplicated blocks were also identified in the D. simulans and D. sechellia genomes, by which we were able to clearly classify the duplicated blocks into post- and pre-speciation blocks. We found 31 post-speciation duplicated genes, from which the rate of gene duplication (from one copy to two copies) is estimated to be 1.0 × 10⁻⁹ per single-copy gene per year. The role of interlocus gene conversion was observed in several respects in our data: (1) synonymous divergence between a duplicated pair is overall very low. Consequently, the gene duplication rate would be seriously overestimated by counting duplicated genes with low divergence; (2) the sizes of young duplicated blocks are generally large. We postulate that the degeneration of gene conversion around the edges could explain the shrinkage of "identifiable" duplicated regions; and (3) elevated paralogous divergence is observed around the edges in many duplicated blocks, supporting our gene conversion-degeneration model. Our analysis demonstrated that gene conversion between duplicated regions is a common and genome-wide phenomenon in the Drosophila genomes, and that its role should be especially significant in the early stages of duplicated genes. Based on a population genetic prediction, we applied a new genome-scan method to test for signatures of selection for neofunctionalization and found a strong signature in a pair of transporter genes.
Affiliation(s)
- Naoki Osada
- National Institute of Biomedical Innovation, Osaka, Japan
- Graduate University for Advanced Studies, Hayama, Japan
- Hideki Innan
- National Institute of Biomedical Innovation, Osaka, Japan
- Graduate University for Advanced Studies, Hayama, Japan
|
233
|
Kechris K, Li H. c-REDUCE: incorporating sequence conservation to detect motifs that correlate with expression. BMC Bioinformatics 2008; 9:506. [PMID: 19040743 PMCID: PMC2626603 DOI: 10.1186/1471-2105-9-506] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/21/2008] [Accepted: 11/28/2008] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Computational methods for characterizing novel transcription factor binding sites search for sequence patterns or "motifs" that appear repeatedly in genomic regions of interest. Correlation-based motif finding strategies are used to identify motifs that correlate with expression data and do not rely on promoter sequences from a pre-determined set of genes. RESULTS In this work, we describe a method for predicting motifs that combines the correlation-based strategy with phylogenetic footprinting, where motifs are identified by evaluating orthologous sequence regions from multiple species. Our method, c-REDUCE, can account for variability at a motif position inferred from evolutionary information. c-REDUCE has been tested on ChIP-chip data for yeast transcription factors and on gene expression data in Drosophila. CONCLUSION Our results indicate that utilizing sequence conservation information in addition to correlation-based methods improves the identification of known motifs.
Affiliation(s)
- Katerina Kechris
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver, 4200 East Ninth Avenue, B-119, Denver, CO 80262, USA
- Hao Li
- Department of Biochemistry and Biophysics, UCSF, 1700 4th Street, San Francisco, CA 94143, USA
- Center for Theoretical Biology, Peking University, Beijing 100871, PR China
|
234
|
Abstract
Transcription factors play a key role in the regulation of cell cycle progression, yet many of the specific regulatory interactions that control cell cycle transcription are still unknown. To systematically identify new yeast cell cycle transcription factors, we used a quantitative flow cytometry assay to screen 268 transcription factor deletion strains for defects in cell cycle progression. Our results reveal that 20% of nonessential transcription factors have an impact on cell cycle progression, including several recently identified cyclin-dependent kinase (Cdk) targets, which have not previously been linked to cell cycle transcription. This expanded catalog of cell-cycle-associated transcription factors will be a valuable resource for decoding the transcriptional regulatory interactions that govern progression through the cell cycle. We conducted follow-up studies on Sfg1, a transcription factor with no previously known role in cell cycle progression. Deletion of Sfg1 retards cells in G1, and overexpression of Sfg1 delays cells in the G2/M phase. We find that Sfg1 represses early G1, Swi5/Ace2-regulated genes involved in mother-daughter cell separation. We also show that Sfg1, a known in vitro cyclin-dependent kinase target, is phosphorylated in vivo on conserved Cdk phosphorylation sites and that phosphorylation of Sfg1 is necessary for its role in promoting cell cycle progression. Overall, our work increases the number of transcription factors associated with cell cycle progression, strongly indicates that there are many more unexplored connections between the Cdk-cyclin oscillator and cell cycle transcription, and suggests a new mechanism for the regulation of cell separation during the M/G1 phase transition.
|
235
|
Vigentini I, Fracassetti D, Picozzi C, Foschino R. Polymorphisms of Saccharomyces cerevisiae genes involved in wine production. Curr Microbiol 2008; 58:211-8. [PMID: 19005725 DOI: 10.1007/s00284-008-9310-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 07/25/2008] [Revised: 10/02/2008] [Accepted: 10/09/2008] [Indexed: 11/30/2022]
Abstract
Establishing new molecular methods for typing Saccharomyces cerevisiae is valuable in enology. The ability to discriminate different strains in wine making can benefit both the control of the fermentation process and the preservation of wine typicity. This study focused on the screening of single-nucleotide polymorphisms in genes involved in wine production that could evolve rapidly considering the selective pressure of the isolation environment. Preliminary screening of 30 genes in silico was performed, followed by the selection of 10 loci belonging to 8 genes. The sequence analysis showed a low polymorphism and a degree of heterozygosity. However, a new potential molecular target was recognized in the TPS1 gene coding for the trehalose-6-phosphate synthase enzyme involved in the ethanol resistance mechanism. This gene showed a 1.42% sequence diversity with seven different nucleotide substitutions. Moreover, classic techniques were applied to a collection of 50 S. cerevisiae isolates, mostly of enological origin. Our results confirmed that the wine making was not carried out only by the inoculated commercial starter because indigenous strains of S. cerevisiae present during fermentation were detected. In addition, a high genetic relationship among some commercial cultures was found, highlighting imprecision or fraudulent practices by starter manufacturers.
Affiliation(s)
- Ileana Vigentini
- Dipartimento di Scienze e Tecnologie Alimentari e Microbiologiche, Università degli Studi di Milano, via Celoria 2, 20133 Milan, Italy
|
236
|
Teytelman L, Eisen MB, Rine J. Silent but not static: accelerated base-pair substitution in silenced chromatin of budding yeasts. PLoS Genet 2008; 4:e1000247. [PMID: 18989454 PMCID: PMC2570616 DOI: 10.1371/journal.pgen.1000247] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Received: 03/31/2008] [Accepted: 10/01/2008] [Indexed: 01/06/2023] Open
Abstract
Subtelomeric DNA in budding yeasts, like metazoan heterochromatin, is gene poor, repetitive, transiently silenced, and highly dynamic. The rapid evolution of subtelomeric regions is commonly thought to arise from transposon activity and increased recombination between repetitive elements. However, we found evidence of an additional factor in this diversification. We observed a surprising level of nucleotide divergence in transcriptionally silenced regions in inter-species comparisons of Saccharomyces yeasts. Likewise, intra-species analysis of polymorphisms also revealed increased SNP frequencies in both intergenic and synonymous coding positions of silenced DNA. This analysis suggested that silenced DNA in Saccharomyces cerevisiae and closely related species had increased single base-pair substitution that was likely due to the effects of the silencing machinery on DNA replication or repair. Many plants, fungi, pathogens, and animals have chromosome regions that are silenced. Special proteins change the chromosome structure in these domains, turning genes off or lowering their expression levels. We found an increased frequency of DNA mutations in these silenced regions of closely related yeasts. This increase is likely due to silencing proteins interfering with DNA repair or replication. Accurate replication of genetic information with minimal mutations is usually critical for the survival and fitness of an organism; however, there are examples where a high mutation rate is beneficial. The silenced regions of chromosomes are often associated with virus-like transposable elements, and with genes that are important in responding to environmental changes. Hence, it is possible that elevated DNA mutations in silenced regions contribute to genome defense against transposable elements or increased genetic diversity to cope with variation in surrounding conditions.
Affiliation(s)
- Leonid Teytelman
- Department of Molecular & Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- Michael B. Eisen
- Department of Molecular & Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- California Institute for Quantitative Biosciences, Berkeley, California, United States of America
- Center for Integrative Genomics, University of California Berkeley, Berkeley, California, United States of America
- Jasper Rine
- Department of Molecular & Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- California Institute for Quantitative Biosciences, Berkeley, California, United States of America
|
237
|
Li E, Reich CI, Olsen GJ. A whole-genome approach to identifying protein binding sites: promoters in Methanocaldococcus (Methanococcus) jannaschii. Nucleic Acids Res 2008; 36:6948-58. [PMID: 18981048 PMCID: PMC2602779 DOI: 10.1093/nar/gkm499] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022] Open
Abstract
We have adapted an electrophoretic mobility shift assay (EMSA) to isolate genomic DNA fragments that bind the archaeal transcription initiation factors TATA-binding protein (TBP) and transcription factor B (TFB) to perform a genome-wide search for promoters. Mobility-shifted fragments were cloned, tested for their ability to compete with known promoter-containing fragments for a limited concentration of transcription factors, and sequenced. We applied the method to search for promoters in the genome of Methanocaldococcus jannaschii. Selection was most efficient for promoters of tRNA genes and genes for several presumed small non-coding RNAs (ncRNA). Protein-coding gene promoters were dramatically underrepresented relative to their frequency in the genome. The repeated isolation of these genomic regions was partially rectified by including a hybridization-based screening. Sequence alignment of the affinity-selected promoters revealed previously identified TATA box, BRE, and the putative initiator element. In addition, the conserved bases immediately upstream and downstream of the BRE and TATA box suggest that the composition and structure of archaeal natural promoters are more complicated.
Affiliation(s)
- Enhu Li
- Division of Biology, California Institute of Technology, Pasadena, CA 91125, USA
|
238
|
Kuntz SG, Schwarz EM, DeModena JA, De Buysscher T, Trout D, Shizuya H, Sternberg PW, Wold BJ. Multigenome DNA sequence conservation identifies Hox cis-regulatory elements. Genome Res 2008; 18:1955-68. [PMID: 18981268 DOI: 10.1101/gr.085472.108] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 01/31/2023]
Abstract
To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced approximately 0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide.
Affiliation(s)
- Steven G Kuntz
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
|
239
|
Durrens P, Nikolski M, Sherman D. Fusion and fission of genes define a metric between fungal genomes. PLoS Comput Biol 2008; 4:e1000200. [PMID: 18949021 PMCID: PMC2557144 DOI: 10.1371/journal.pcbi.1000200] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 02/28/2008] [Accepted: 09/05/2008] [Indexed: 12/19/2022] Open
Abstract
Gene fusion and fission events are key mechanisms in the evolution of gene architecture, whose effects are visible in protein architecture when they occur in coding sequences. Until now, the detection of fusion and fission events has been performed at the level of protein sequences with a post facto removal of supernumerary links due to paralogy, and often did not include looking for events defined only in single genomes. We propose a method for the detection of these events, defined on groups of paralogs to compensate for the gene redundancy of eukaryotic genomes, and apply it to the proteomes of 12 fungal species. We collected an inventory of 1,680 elementary fusion and fission events. In half the cases, both composite and element genes are found in the same species. Per-species counts of events correlate with the species genome size, suggesting a random mechanism of occurrence. Some biological functions of the genes involved in fusion and fission events are slightly over- or under-represented. As already noted in previous studies, the genes involved in an event tend to belong to the same functional category. We inferred the position of each event in the evolution tree of the 12 fungal species. The event localization counts for all the segments of the tree provide a metric that depicts the “recombinational” phylogeny among fungi. A possible interpretation of this metric as distance in adaptation space is proposed.
One consequence of genome remodelling in evolution is the modification of genes, either by fusion with other genes, or by fission into several parts. By tracking the mathematical relations between groups of similar genes, rather than between individual genes, we can paint a global picture of remodelling across many species simultaneously. The strengths of our method are that it allows us to include highly redundant eukaryote genomes, and that it avoids alignment artifacts by representing each group of similar genes by a mathematical model. Applying our method to a set of fungal genomes, we confirmed first that the number of fusion/fission events is correlated with genome size, second that the fusion to fission ratio favors fusions, third that the set of events is not saturated, and fourth that while genes assembled in a fusion tend to have the same biochemical function, there appears to be little bias for the functions that are involved. Indeed, fusion and fission events are landmarks of random remodelling, independent of mutation rate: they define a metric of “recombination distance.” This distance lets us build a genome evolution history of species and may well be a better measure than mutation distance of the process of adaptation.
Affiliation(s)
- Pascal Durrens
- MAGNOME Team, INRIA Centre de Recherche Bordeaux- Sud-Ouest, Laboratoire Bordelais de Recherche en Informatique, UMR 5800 CNRS, Domaine Universitaire, Talence Cedex, France.
|
240
|
Washietl S, Machné R, Goldman N. Evolutionary footprints of nucleosome positions in yeast. Trends Genet 2008; 24:583-7. [PMID: 18951646 DOI: 10.1016/j.tig.2008.09.003] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Received: 07/30/2008] [Revised: 09/09/2008] [Accepted: 09/15/2008] [Indexed: 11/29/2022]
Abstract
Using genome-wide maps of nucleosome positions in yeast, we have analyzed the influence of chromatin structure on the molecular evolution of genomic DNA. We have observed, on average, 10-15% lower substitution rates in linker regions than in nucleosomal DNA. This widespread local rate heterogeneity represents an evolutionary footprint of nucleosome positions and reveals that nucleosome organization is a genomic feature conserved over evolutionary timescales.
Affiliation(s)
- Stefan Washietl
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.
|
241
|
Solieri L, Antúnez O, Pérez-Ortín JE, Barrio E, Giudici P. Mitochondrial inheritance and fermentative : oxidative balance in hybrids between Saccharomyces cerevisiae and Saccharomyces uvarum. Yeast 2008; 25:485-500. [PMID: 18615860 DOI: 10.1002/yea.1600] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Indexed: 11/11/2022] Open
Abstract
Breeding between Saccharomyces species is a useful tool for obtaining improved wine yeast strains, combining fermentative features of parental species. In this work, 25 artificial Saccharomyces cerevisiae x Saccharomyces uvarum hybrids were constructed by spore conjugation. A multi-locus PCR-restriction fragment length polymorphism (PCR-RFLP) analysis, targeting six nuclear gene markers and the ribosomal region including the 5.8S rRNA gene and the two internal transcribed spacers, showed that the hybrid genome is the result of two chromosome sets, one coming from S. cerevisiae and the other from S. uvarum. Mitochondrial DNA (mtDNA) typing showed uniparental inheritance in all hybrids. Furthermore, sibling hybrids, obtained by repeated crosses between the same parental strains, showed the same mtDNA, suggesting that the mitochondrial transmission is not stochastic or species-specific, but dependent on the parental strains. Finally four hybrids, two of which with S. cerevisiae mtDNA and two with S. uvarum mtDNA, were subjected to transcriptome analysis. Our results showed that the hybrids bearing S. cerevisiae mtDNA exhibited less expression of genes involved in glycolysis/fermentation pathways and in hexose transport compared to hybrids with S. uvarum mtDNA. Respiration assay confirmed the increased respiratory activity of hybrids with the S. cerevisiae mtDNA genome. These findings suggest that mtDNA type and fermentative : respiratory performances are correlated in S. cerevisiae x S. uvarum hybrids and the mtDNA type is an important trait for constructing new improved hybrids for winemaking.
Affiliation(s)
- Lisa Solieri
- Department of Agricultural and Food Sciences, University of Modena and Reggio Emilia, via Amendola 2, Padiglione Besta, Reggio Emilia, Italy.
|
242
|
Dunn B, Sherlock G. Reconstruction of the genome origins and evolution of the hybrid lager yeast Saccharomyces pastorianus. Genome Res 2008; 18:1610-23. [PMID: 18787083 DOI: 10.1101/gr.076075.108] [Citation(s) in RCA: 213] [Impact Index Per Article: 12.5] [Indexed: 11/24/2022]
Abstract
Inter-specific hybridization leading to abrupt speciation is a well-known, common mechanism in angiosperm evolution; only recently, however, have similar hybridization and speciation mechanisms been documented to occur frequently among the closely related group of sensu stricto Saccharomyces yeasts. The economically important lager beer yeast Saccharomyces pastorianus is such a hybrid, formed by the union of Saccharomyces cerevisiae and Saccharomyces bayanus-related yeasts; efforts to understand its complex genome, searching for both biological and brewing-related insights, have been underway since its hybrid nature was first discovered. It had been generally thought that a single hybridization event resulted in a unique S. pastorianus species, but it has been recently postulated that there have been two or more hybridization events. Here, we show that there may have been two independent origins of S. pastorianus strains, and that each independent group (defined by characteristic genome rearrangements, copy number variations, ploidy differences, and DNA sequence polymorphisms) is correlated with specific breweries and/or geographic locations. Finally, by reconstructing common ancestral genomes via array-CGH data analysis and by comparing representative DNA sequences of the S. pastorianus strains with those of many different S. cerevisiae isolates, we have determined that the most likely S. cerevisiae ancestral parent for each of the independent S. pastorianus groups was an ale yeast, with different, but closely related ale strains contributing to each group's parentage.
Affiliation(s)
- Barbara Dunn
- Department of Genetics, Stanford University, Stanford, California 94305-5120, USA
|
243
|
PhyloGibbs-MP: module prediction and discriminative motif-finding by Gibbs sampling. PLoS Comput Biol 2008; 4:e1000156. [PMID: 18769735 PMCID: PMC2518514 DOI: 10.1371/journal.pcbi.1000156] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 03/14/2008] [Accepted: 07/11/2008] [Indexed: 11/23/2022] Open
Abstract
PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules—tasks that are usually done by two separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to dedicated module-prediction software that use prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other “discriminative motif-finders” have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites and compares with or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use “informative priors” on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data. 
Proteins in a living cell are not expressed all the time: instead, genes are turned on or off on demand. Indeed, though nearly every cell in a multicellular organism has a complete copy of the genome, each cell expresses only a fraction of the encoded proteins. Regulation of gene expression occurs in various ways. One of the most important (especially in simpler organisms) is “transcriptional regulation,” where specialised DNA-binding proteins, “transcription factors,” attach to the DNA to recruit the gene-transcriptional machinery. Detecting binding sites in DNA for these factors has long been a problem of interest in computational biology. Here, a program, PhyloGibbs-MP, is presented that extends our previously published motif-finder PhyloGibbs to handle some important related problems, in particular, detecting “discriminative” sites that distinguish differently regulated groups of genes and finding “cis-regulatory modules,” regions of DNA that contain large clusters of regulatory-protein-binding sites. PhyloGibbs-MP compares well on benchmarks with the best specialised programs for all these tasks, while being the first to integrate them in one consistent formalism. Regulatory regions in higher eukaryotes can be highly complex, and PhyloGibbs-MP is expected to be a very useful tool in identifying and analysing regulatory DNA.
|
244
|
A catalog of neutral and deleterious polymorphism in yeast. PLoS Genet 2008; 4:e1000183. [PMID: 18769710 PMCID: PMC2515631 DOI: 10.1371/journal.pgen.1000183] [Citation(s) in RCA: 177] [Impact Index Per Article: 10.4] [Received: 03/06/2008] [Accepted: 07/30/2008] [Indexed: 11/30/2022] Open
Abstract
The abundance and identity of functional variation segregating in natural populations is paramount to dissecting the molecular basis of quantitative traits as well as human genetic diseases. Genome sequencing of multiple organisms of the same species provides an efficient means of cataloging rearrangements, insertion, or deletion polymorphisms (InDels) and single-nucleotide polymorphisms (SNPs). While inbreeding depression and heterosis imply that a substantial amount of polymorphism is deleterious, distinguishing deleterious from neutral polymorphism remains a significant challenge. To identify deleterious and neutral DNA sequence variation within Saccharomyces cerevisiae, we sequenced the genome of a vineyard and oak tree strain and compared them to a reference genome. Among these three strains, 6% of the genome is variable, mostly attributable to variation in genome content that results from large InDels. Out of the 88,000 polymorphisms identified, 93% are SNPs and a small but significant fraction can be attributed to recent interspecific introgression and ectopic gene conversion. In comparison to the reference genome, there is substantial evidence for functional variation in gene content and structure that results from large InDels, frame-shifts, and polymorphic start and stop codons. Comparison of polymorphism to divergence reveals scant evidence for positive selection but an abundance of evidence for deleterious SNPs. We estimate that 12% of coding and 7% of noncoding SNPs are deleterious. Based on divergence among 11 yeast species, we identified 1,666 nonsynonymous SNPs that disrupt conserved amino acids and 1,863 noncoding SNPs that disrupt conserved noncoding motifs. The deleterious coding SNPs include those known to affect quantitative traits, and a subset of the deleterious noncoding SNPs occurs in the promoters of genes that show allele-specific expression, implying that some cis-regulatory SNPs are deleterious. Our results show that the genome sequences of both closely and distantly related species provide a means of identifying deleterious polymorphisms that disrupt functionally conserved coding and noncoding sequences.
DNA sequence variation makes an important contribution to most traits that vary in natural populations. However, mapping mutations that underlie a trait of interest is a significant challenge. Genome sequencing of multiple organisms provides a complete list of DNA sequence differences responsible for any trait that differs among the organisms. Yet, distinguishing those DNA sequence variants that contribute to a trait from all other variants is not easy. Here, we sequence the genomes of two strains of yeast and, through comparisons with a reference genome, we catalog multiple types of DNA sequence variation among the three strains. Using a variety of comparative genomics methods, we show that a substantial fraction of DNA sequence variations has deleterious effects on fitness. Finally, we show that a subset of deleterious mutations is associated with changes in gene expression levels. Our results imply that comparative genomics methods will be a valuable approach to identifying DNA sequence changes underlying numerous traits of interest.
|
245
|
Park J, Lee S, Choi J, Ahn K, Park B, Park J, Kang S, Lee YH. Fungal cytochrome P450 database. BMC Genomics 2008; 9:402. [PMID: 18755027 PMCID: PMC2542383 DOI: 10.1186/1471-2164-9-402] [Citation(s) in RCA: 125] [Impact Index Per Article: 7.4] [Received: 04/03/2008] [Accepted: 08/28/2008] [Indexed: 12/15/2022] Open
Abstract
Background: Cytochrome P450 enzymes play critical roles in fungal biology and ecology. To support studies on the roles and evolution of cytochrome P450 enzymes in fungi based on rapidly accumulating genome sequences from diverse fungal species, an efficient bioinformatics platform specialized for this super family of proteins is highly desirable.
Results: The Fungal Cytochrome P450 Database (FCPD) archives genes encoding P450s in the genomes of 66 fungal and 4 oomycete species (4,538 in total) and supports analyses of their sequences, chromosomal distribution pattern, and evolutionary histories and relationships. The archived P450s were classified into 16 classes based on InterPro terms and clustered into 141 groups using tribe-MCL. The proportion of P450s in the total proteome and class distribution in individual species exhibited certain taxon-specific characteristics.
Conclusion: The FCPD will facilitate systematic identification and multifaceted analyses of P450s at multiple taxon levels via the web. All data and functions are available at the web site .
Affiliation(s)
- Jongsun Park
- Fungal Bioinformatics Laboratory, Seoul National University, Seoul 151-921, Korea.
|
246
|
Abstract
Comparative genomics is a powerful tool for gaining insight into genomic function and evolution. However, in plants, sequence data that would enable detailed comparisons of both coding and noncoding regions have been limited in availability. Here we report the generation and analysis of sequences for an unduplicated conserved syntenic segment (CSS) in the genomes of five members of the agriculturally important plant family Solanaceae. This CSS includes a 105-kb region of tomato chromosome 2 and orthologous regions of the potato, eggplant, pepper, and petunia genomes. With a total neutral divergence of 0.73-0.78 substitutions/site, these sequences are similar enough that most noncoding regions can be aligned, yet divergent enough to be informative about evolutionary dynamics and selective pressures. The CSS contains 17 distinct genes with generally conserved order and orientation, but with numerous small-scale differences between species. Our analysis indicates that the last common ancestor of these species lived approximately 27-36 million years ago, that more than one-third of short genomic segments (5-15 bp) are under selection, and that more than two-thirds of selected bases fall in noncoding regions. In addition, we identify genes under positive selection and analyze hundreds of conserved noncoding elements. This analysis provides a window into 30 million years of plant evolution in the absence of polyploidization.
|
247
|
Sharon E, Lubliner S, Segal E. A feature-based approach to modeling protein-DNA interactions. PLoS Comput Biol 2008; 4:e1000154. [PMID: 18725950 PMCID: PMC2516605 DOI: 10.1371/journal.pcbi.1000154] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.1] [Received: 03/10/2008] [Accepted: 07/10/2008] [Indexed: 11/18/2022] Open
Abstract
Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position specific scoring matrix (PSSM), which assumes independence between binding positions. However, in many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF–DNA interactions, based on log-linear models. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our model and devise an algorithm for learning its structural features from binding site data. We also developed a discriminative motif finder, which discovers de novo FMMs that are enriched in target sets of sequences compared to background sets. We evaluate our approach on synthetic data and on the widely used TF chromatin immunoprecipitation (ChIP) dataset of Harbison et al. We then apply our algorithm to high-throughput TF ChIP data from mouse and human, reveal sequence features that are present in the binding specificities of mouse and human TFs, and show that FMMs explain TF binding significantly better than PSSMs. Our FMM learning and motif finder software are available at http://genie.weizmann.ac.il/.
Transcription factor (TF) protein binding to its DNA target sequences is a fundamental physical interaction underlying gene regulation. Characterizing the binding specificities of TFs is essential for deducing which genes are regulated by which TFs. Recently, several high-throughput methods that measure sequences enriched for TF targets genomewide were developed. Since TFs recognize relatively short sequences, much effort has been directed at developing computational methods that identify enriched subsequences (motifs) from these sequences. However, little effort has been directed towards improving the representation of motifs. In practice, available motif-finding software uses the position specific scoring matrix (PSSM) model, which assumes independence between different motif positions. We present an alternative, richer model, called the feature motif model (FMM), that enables the representation of a variety of sequence features and captures dependencies that exist between binding site positions. We show how FMMs explain TF binding data better than PSSMs on both synthetic and real data. We also present a motif finder algorithm that learns FMM motifs from unaligned promoter sequences and show how de novo FMMs, learned from binding data of the human TFs c-Myc and CTCF, reveal intriguing insights about their binding specificities.
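The contrast between the two representations can be illustrated with a minimal sketch. It is not the authors' FMM learning algorithm or software: the motif probabilities, the uniform background, the function names `pssm_log_odds` and `feature_score`, and the pair-feature weight are all hypothetical values chosen for illustration. The first function scores a site as a sum of independent per-position terms, as a PSSM does; the second adds a weight whenever a feature spanning two positions is present, in the spirit of a log-linear model.

```python
import math

# Hypothetical 3-position motif: per-position base probabilities (illustrative only).
PSSM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def pssm_log_odds(site):
    """Sum per-position log-odds; each position contributes independently."""
    if len(site) != len(PSSM):
        raise ValueError("site length must match motif length")
    return sum(math.log2(PSSM[i][base] / BACKGROUND[base])
               for i, base in enumerate(site))


def feature_score(site, pair_features):
    """PSSM score plus a weight for every two-position feature found in the site.

    pair_features maps (pos_i, base_i, pos_j, base_j) -> weight; the weights
    are hypothetical and would have to be learned from binding data.
    """
    score = pssm_log_odds(site)
    for (i, base_i, j, base_j), weight in pair_features.items():
        if site[i] == base_i and site[j] == base_j:
            score += weight
    return score


if __name__ == "__main__":
    pairs = {(0, "A", 1, "G"): 0.5}  # assumed dependency between positions 0 and 1
    print(pssm_log_odds("AGT"), feature_score("AGT", pairs))
```

With an empty feature dictionary the second function reduces to the plain PSSM score, which shows that the independent-position model is the special case with no multi-position features.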
Affiliation(s)
- Eilon Sharon
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
- Shai Lubliner
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
- Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
|
248
|
Energy-dependent fitness: a quantitative model for the evolution of yeast transcription factor binding sites. Proc Natl Acad Sci U S A 2008; 105:12376-81. [PMID: 18723669 DOI: 10.1073/pnas.0805909105] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022] Open
Abstract
We present a genomewide cross-species analysis of regulation for broad-acting transcription factors in yeast. Our model for binding site evolution is founded on biophysics: the binding energy between transcription factor and site is a quantitative phenotype of regulatory function, and selection is given by a fitness landscape that depends on this phenotype. The model quantifies conservation, as well as loss and gain, of functional binding sites in a coherent way. Its predictions are supported by direct cross-species comparison between four yeast species. We find ubiquitous compensatory mutations within functional sites, such that the energy phenotype and the function of a site evolve in a significantly more constrained way than does its sequence. We also find evidence for substantial evolution of regulatory function involving point mutations as well as sequence insertions and deletions within binding sites. Genes lose their regulatory link to a given transcription factor at a rate similar to the neutral point mutation rate, from which we infer a moderate average fitness advantage of functional over nonfunctional sites. In a wider context, this study provides an example of inference of selection acting on a quantitative molecular trait.
|
249
|
Zill OA, Rine J. Interspecies variation reveals a conserved repressor of alpha-specific genes in Saccharomyces yeasts. Genes Dev 2008; 22:1704-16. [PMID: 18559484 DOI: 10.1101/gad.1640008] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 01/12/2023]
Abstract
The mating-type determination circuit in Saccharomyces yeast serves as a classic paradigm for the genetic control of cell type in all eukaryotes. Using comparative genetics, we discovered a central and conserved, yet previously undetected, component of this genetic circuit: active repression of alpha-specific genes in a cells. Upon inactivation of the SUM1 gene in Saccharomyces bayanus, a close relative of Saccharomyces cerevisiae, a cells acquired mating characteristics of alpha cells and displayed autocrine activation of their mating response pathway. Sum1 protein bound to the promoters of alpha-specific genes, repressing their transcription. In contrast to the standard model, alpha1 was important but not required for alpha-specific gene activation and mating of alpha cells in the absence of Sum1. Neither Sum1 protein expression, nor its association with target promoters was mating-type-regulated. Thus, the alpha1/Mcm1 coactivators did not overcome repression by occluding Sum1 binding to DNA. Surprisingly, the mating-type regulatory function of Sum1 was conserved in S. cerevisiae. We suggest that a comprehensive understanding of some genetic pathways may be best attained through the expanded phenotypic space provided by study of those pathways in multiple related organisms.
Affiliation(s)
- Oliver A Zill
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
|
250
|
Kuehner JN, Brow DA. Regulation of a eukaryotic gene by GTP-dependent start site selection and transcription attenuation. Mol Cell 2008; 31:201-11. [PMID: 18657503 DOI: 10.1016/j.molcel.2008.05.018] [Citation(s) in RCA: 114] [Impact Index Per Article: 6.7] [Received: 02/04/2008] [Revised: 04/24/2008] [Accepted: 05/29/2008] [Indexed: 10/21/2022]
Abstract
Guanine nucleotide negatively regulates yeast inosine monophosphate dehydrogenase (IMPDH) mRNA synthesis by an unknown mechanism. IMPDH catalyzes the first dedicated step of GTP biosynthesis, and feedback control of its expression maintains the proper balance of purine nucleotides. Here we show that RNA polymerase II (Pol II) responds to GTP concentration. When GTP is sufficient, Pol II initiates transcription of the IMPDH gene (IMD2) at TATA box-proximal "G" sites, producing attenuated transcripts. When GTP is deficient, Pol II initiates at an "A" further downstream, circumventing the regulatory terminator to produce IMPDH mRNA. A major determinant for GTP concentration-dependent initiation at the upstream sites is the presence of guanine at the first and second positions of the transcript. Mutations in the Rpb1 subunit of Pol II and in TFIIB disrupt IMD2 regulation by altering start site selection. Thus, Pol II initiation can be regulated by the concentration of initiating nucleotide.
Affiliation(s)
- Jason N Kuehner
- Cellular and Molecular Biology Graduate Program, University of Wisconsin School of Medicine and Public Health, Madison, WI 53706, USA
|