1
Engel SR, Wong ED, Nash RS, Aleksander S, Alexander M, Douglass E, Karra K, Miyasato SR, Simison M, Skrzypek MS, Weng S, Cherry JM. New data and collaborations at the Saccharomyces Genome Database: updated reference genome, alleles, and the Alliance of Genome Resources. Genetics 2022; 220:iyab224. [PMID: 34897464 PMCID: PMC9209811 DOI: 10.1093/genetics/iyab224] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/15/2021] [Accepted: 11/11/2021] [Indexed: 02/03/2023]
Abstract
Saccharomyces cerevisiae is used to provide fundamental understanding of eukaryotic genetics, gene product function, and cellular biological processes. The Saccharomyces Genome Database (SGD) has been supporting the yeast research community since 1993, serving as its de facto hub. Over the years, SGD has maintained the genetic nomenclature, chromosome maps, and functional annotation, and developed various tools and methods for analysis and curation of a variety of emerging data types. More recently, SGD and six other model organism-focused knowledgebases have come together to create the Alliance of Genome Resources to develop sustainable genome information resources that promote and support the use of various model organisms to understand the genetic and genomic bases of human biology and disease. Here we describe recent activities at SGD, including the latest reference genome annotation update, the development of a curation system for mutant alleles, and new pages addressing homology across model organisms as well as the use of yeast to study human disease.
Affiliation(s)
- Stacia R Engel
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Edith D Wong
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Robert S Nash
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Suzi Aleksander
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Micheal Alexander
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Eric Douglass
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Kalpana Karra
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Stuart R Miyasato
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Matt Simison
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Marek S Skrzypek
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Shuai Weng
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- J Michael Cherry
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
2
Proteomic Analysis Identifies Markers of Exposure to Cadmium Sulphide Quantum Dots (CdS QDs). Nanomaterials 2020; 10:nano10061214. [PMID: 32580447 PMCID: PMC7353101 DOI: 10.3390/nano10061214] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/06/2020] [Revised: 06/10/2020] [Accepted: 06/17/2020] [Indexed: 12/11/2022]
Abstract
The use of cadmium sulphide quantum dot (CdS QD)-enabled products has become increasingly widespread. The prospect of their release in the environment is raising concerns. Here we have used the yeast model Saccharomyces cerevisiae to determine the potential impact of CdS QD nanoparticles on living organisms. Proteomic analyses and cell viability assays performed after 9 h exposure revealed expression of proteins involved in oxidative stress and reduced lethality, respectively, whereas oxidative stress declined and lethality increased after 24 h incubation in the presence of CdS QDs. Quantitative proteomics using the iTRAQ approach (isobaric tags for relative and absolute quantitation) revealed that key proteins involved in essential biological pathways were differentially regulated over the time course of the experiment. At 9 h, most of the glycolytic functions increased, and the abundance of a number of heat shock proteins increased. This contrasts with the situation at 24 h, where glycolytic functions, some heat shock proteins, as well as oxidative phosphorylation and ATP synthesis were down-regulated. It can be concluded from our data that cell exposure to CdS QDs provokes a metabolic shift from respiration to fermentation, comparable to the situation reported in some cancer cell lines.
3
A highly sensitive “turn-on” fluorescent probe with an aggregation-induced emission characteristic for quantitative detection of γ-globulin. Biosens Bioelectron 2017; 92:536-541. [DOI: 10.1016/j.bios.2016.10.064] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 09/23/2016] [Accepted: 10/22/2016] [Indexed: 12/22/2022]
4
Pakula TM, Nygren H, Barth D, Heinonen M, Castillo S, Penttilä M, Arvas M. Genome wide analysis of protein production load in Trichoderma reesei. Biotechnology for Biofuels 2016; 9:132. [PMID: 27354857 PMCID: PMC4924338 DOI: 10.1186/s13068-016-0547-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 04/18/2016] [Accepted: 06/07/2016] [Indexed: 05/05/2023]
Abstract
BACKGROUND The filamentous fungus Trichoderma reesei (teleomorph Hypocrea jecorina) is a widely used industrial host organism for protein production. In industrial cultivations, it can produce over 100 g/l of extracellular protein, consisting mostly of cellulases and hemicellulases. To improve protein production in T. reesei, the transcriptional regulation of cellulases and secretory pathway factors has been extensively studied. However, the metabolism of T. reesei under protein production conditions has not received much attention. RESULTS To understand the physiology and metabolism of T. reesei under protein production conditions we carried out a well-controlled bioreactor experiment with extensive analysis. We used minimal media to make the data amenable to modelling and three strain pairs to cover different protein production levels. With RNA-sequencing transcriptomics we identified the concentration of the carbon source as the most important determinant of the transcriptome. As the major transcriptional response concomitant with protein production we detected the induction of selected genes that were putatively regulated by xyr1 and were related to protein transport, amino acid metabolism and transcriptional regulation. We found novel metabolic responses such as production of glycerol and a cellotriose-like compound. We then used these cultivation data for flux balance analysis of T. reesei metabolism and demonstrate for the first time the use of genome wide stoichiometric metabolic modelling for T. reesei. We show that our model can predict protein production rate and provides novel insight into the metabolism of protein production. We also provide this unprecedented cultivation and transcriptomics data set for future modelling efforts. CONCLUSIONS The use of stoichiometric modelling can open a novel path for the improvement of protein production in T. reesei. Based on this we propose sulphur assimilation as a major limiting factor of protein production. As T. reesei is an organism with exceptional protein production capabilities, modelling it can also provide novel insight into other, less productive organisms.
Affiliation(s)
- Tiina M. Pakula
- VTT Technical Research Centre of Finland, Tietotie 2, P.O. Box FI-1000, 02044 Espoo, Finland
- Heli Nygren
- VTT Technical Research Centre of Finland, Tietotie 2, P.O. Box FI-1000, 02044 Espoo, Finland
- Dorothee Barth
- VTT Technical Research Centre of Finland, Tietotie 2, P.O. Box FI-1000, 02044 Espoo, Finland
- Markus Heinonen
- Department of Information and Computer Science, Aalto University, PO Box 15400, 00076 Espoo, Finland
- Helsinki Institute for Information Technology HIIT, Espoo, Finland
- Sandra Castillo
- VTT Technical Research Centre of Finland, Tietotie 2, P.O. Box FI-1000, 02044 Espoo, Finland
- Merja Penttilä
- VTT Technical Research Centre of Finland, Tietotie 2, P.O. Box FI-1000, 02044 Espoo, Finland
- Mikko Arvas
- VTT Technical Research Centre of Finland, Tietotie 2, P.O. Box FI-1000, 02044 Espoo, Finland
5
Cdc42p-interacting protein Bem4p regulates the filamentous-growth mitogen-activated protein kinase pathway. Mol Cell Biol 2014; 35:417-36. [PMID: 25384973 DOI: 10.1128/mcb.00850-14] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 12/16/2022]
Abstract
The ubiquitous Rho (Ras homology) GTPase Cdc42p can function in different settings to regulate cell polarity and cellular signaling. How Cdc42p and other proteins are directed to function in a particular context remains unclear. We show that the Cdc42p-interacting protein Bem4p regulates the mitogen-activated protein kinase (MAPK) pathway that controls filamentous growth in Saccharomyces cerevisiae. Bem4p controlled the filamentous-growth pathway but not other MAPK pathways (mating or high-osmolarity glycerol response [HOG]) that also require Cdc42p and other shared components. Bem4p associated with the plasma membrane (PM) protein, Sho1p, to regulate MAPK activity and cell polarization under nutrient-limiting conditions that favor filamentous growth. Bem4p also interacted with the major activator of Cdc42p, the guanine nucleotide exchange factor (GEF) Cdc24p, which we show also regulates the filamentous-growth pathway. Bem4p interacted with the pleckstrin homology (PH) domain of Cdc24p, which functions in an autoinhibitory capacity, and was required, along with other pathway regulators, to maintain Cdc24p at polarized sites during filamentous growth. Bem4p also interacted with the MAPK kinase kinase (MAPKKK) Ste11p. Thus, Bem4p is a new regulator of the filamentous-growth MAPK pathway and binds to general proteins, like Cdc42p and Ste11p, to promote a pathway-specific response.
6
Malloy LE, Wen KK, Pierick AR, Wedemeyer EW, Bergeron SE, Vanderpool ND, McKane M, Rubenstein PA, Bartlett HL. Thoracic aortic aneurysm (TAAD)-causing mutation in actin affects formin regulation of polymerization. J Biol Chem 2012; 287:28398-408. [PMID: 22753406 PMCID: PMC3436569 DOI: 10.1074/jbc.m112.371914] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 04/17/2012] [Revised: 06/07/2012] [Indexed: 01/01/2023]
Abstract
More than 30 mutations in ACTA2, which encodes α-smooth muscle actin, have been identified to cause autosomal dominant thoracic aortic aneurysm and dissection. The mutation R256H is of particular interest because it also causes patent ductus arteriosus and moyamoya disease. R256H is one of the more prevalent mutations and, based on its molecular location near the strand-strand interface in the actin filament, may affect F-actin stability. To understand the molecular ramifications of the R256H mutation, we generated Saccharomyces cerevisiae yeast cells expressing only R256H yeast actin as a model system. These cells displayed abnormal cytoskeletal morphology and increased sensitivity to latrunculin A. After cable disassembly induced by transient exposure to latrunculin A, mutant cells were delayed in reestablishing the actin cytoskeleton. In vitro, mutant actin exhibited a higher than normal critical concentration and a delayed nucleation. Consequently, we investigated regulation of mutant actin by formin, a potent facilitator of nucleation and a protein needed for normal vascular smooth muscle cell development. Mutant actin polymerization was inhibited by the FH1-FH2 fragment of the yeast formin, Bni1. This fragment strongly capped the filament rather than facilitating polymerization. Interestingly, phalloidin or the presence of wild type actin reversed the strong capping behavior of Bni1. Together, the data suggest that the R256H actin mutation alters filament conformation resulting in filament instability and misregulation by formin. These biochemical effects may contribute to abnormal histology identified in diseased arterial samples from affected patients.
Affiliation(s)
- Kuo-Kuang Wen
- Biochemistry, Roy A. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242
- Sarah E. Bergeron
- From the Departments of Pediatrics and
- Biochemistry, Roy A. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242
- Nicole D. Vanderpool
- Biochemistry, Roy A. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242
- Melissa McKane
- Biochemistry, Roy A. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242
- Peter A. Rubenstein
- Biochemistry, Roy A. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242
- Heather L. Bartlett
- From the Departments of Pediatrics and
- Biochemistry, Roy A. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242
7
Gan Y, Guan J, Zhou S, Zhang W. Structural features based genome-wide characterization and prediction of nucleosome organization. BMC Bioinformatics 2012; 13:49. [PMID: 22449207 PMCID: PMC3378464 DOI: 10.1186/1471-2105-13-49] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 06/22/2011] [Accepted: 03/26/2012] [Indexed: 11/24/2022]
Abstract
Background Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored the nucleosome formation potential of genomic sequences and its effect on chromatin organization and gene expression in S. cerevisiae. Results We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features, such as DNA denaturation, DNA-bending stiffness, stacking energy, Z-DNA, propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model that predicts nucleosome occupancy in vivo more accurately than existing methods, which mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The results showed that the meta DLaNe and HMM-based methods performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions.
Conclusions Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene expression regulation. The results indicated that our proposed methods are effective in predicting nucleosome occupancy and positions and that these structural features are highly predictive of nucleosome organization. The implementation of our DLaNe method based on structural features is available online.
Affiliation(s)
- Yanglan Gan
- Department of Computer Science and Technology, Tongji University, Shanghai, China
8
Parenteau J, Durand M, Morin G, Gagnon J, Lucier JF, Wellinger RJ, Chabot B, Elela SA. Introns within ribosomal protein genes regulate the production and function of yeast ribosomes. Cell 2011; 147:320-31. [PMID: 22000012 DOI: 10.1016/j.cell.2011.08.044] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.2] [Received: 05/09/2011] [Revised: 07/05/2011] [Accepted: 08/22/2011] [Indexed: 12/13/2022]
Abstract
In budding yeast, the most abundantly spliced pre-mRNAs encode ribosomal proteins (RPs). To investigate the contribution of splicing to ribosome production and function, we systematically eliminated introns from all RP genes to evaluate their impact on RNA expression, pre-rRNA processing, cell growth, and response to stress. The majority of introns were required for optimal cell fitness or growth under stress. Most introns are found in duplicated RP genes, and surprisingly, in the majority of cases, deleting the intron from one gene copy affected the expression of the other in a nonreciprocal manner. Consistently, 70% of all duplicated genes were asymmetrically expressed, and both introns and gene deletions displayed copy-specific phenotypic effects. Together, our results indicate that splicing in yeast RP genes mediates intergene regulation and implicate the expression ratio of duplicated RP genes in modulating ribosome function.
Affiliation(s)
- Julie Parenteau
- Laboratoire de génomique fonctionnelle de l'Université de Sherbrooke, Département de microbiologie et d'infectiologie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Québec, Canada
9
Osterlund T, Nookaew I, Nielsen J. Fifteen years of large scale metabolic modeling of yeast: developments and impacts. Biotechnol Adv 2011; 30:979-88. [PMID: 21846501 DOI: 10.1016/j.biotechadv.2011.07.021] [Citation(s) in RCA: 88] [Impact Index Per Article: 6.8] [Received: 03/09/2011] [Accepted: 07/26/2011] [Indexed: 10/17/2022]
Abstract
Since the first large-scale reconstruction of the Saccharomyces cerevisiae metabolic network 15 years ago, the development of yeast metabolic models has progressed rapidly, resulting in no fewer than nine different yeast genome-scale metabolic models. Here we review the historical development of large-scale mathematical modeling of yeast metabolism and the growing scope and impact of applications of these models in four different areas: as a guide for metabolic engineering and strain improvement, as a tool for biological interpretation and discovery, in applications of novel computational frameworks, and in evolutionary studies.
Affiliation(s)
- Tobias Osterlund
- Department of Chemical and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
10
Nookaew I, Olivares-Hernández R, Bhumiratana S, Nielsen J. Genome-scale metabolic models of Saccharomyces cerevisiae. Methods Mol Biol 2011; 759:445-63. [PMID: 21863502 DOI: 10.1007/978-1-61779-173-4_25] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 12/24/2022]
Abstract
Systematic analysis of Saccharomyces cerevisiae metabolic functions and pathways has been the subject of extensive studies and is well established in many respects. With the reconstruction of the yeast genome-scale metabolic (GSM) network and in silico simulation of the GSM model, the nature of the underlying cellular processes can be tested and validated against increasing metabolic knowledge. GSM models are also being exploited in fundamental research studies and industrial applications. In this chapter, the principal concepts for construction, simulation and validation of GSM models, progressive applications of the yeast GSM models, and future perspectives are described. This will support and encourage researchers who are interested in systemic analysis of yeast metabolism and systems biology.
Affiliation(s)
- Intawat Nookaew
- Department of Chemical and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden.
11
Dobson PD, Smallbone K, Jameson D, Simeonidis E, Lanthaler K, Pir P, Lu C, Swainston N, Dunn WB, Fisher P, Hull D, Brown M, Oshota O, Stanford NJ, Kell DB, King RD, Oliver SG, Stevens RD, Mendes P. Further developments towards a genome-scale metabolic model of yeast. BMC Systems Biology 2010; 4:145. [PMID: 21029416 PMCID: PMC2988745 DOI: 10.1186/1752-0509-4-145] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.5] [Received: 06/02/2010] [Accepted: 10/28/2010] [Indexed: 12/15/2022]
Abstract
BACKGROUND To date, several genome-scale network reconstructions have been used to describe the metabolism of the yeast Saccharomyces cerevisiae, each differing in scope and content. The recent community-driven reconstruction, while rigorously evidenced and well annotated, under-represented metabolite transport, lipid metabolism and other pathways, and was not amenable to constraint-based analyses because of lack of pathway connectivity. RESULTS We have expanded the yeast network reconstruction to incorporate many new reactions from the literature and represented these in a well-annotated and standards-compliant manner. The new reconstruction comprises 1102 unique metabolic reactions involving 924 unique metabolites--significantly larger in scope than any previous reconstruction. The representation of lipid metabolism in particular has improved, with 234 out of 268 enzymes linked to lipid metabolism now present in at least one reaction. Connectivity is emphatically improved, with more than 90% of metabolites now reachable from the growth medium constituents. The present updates allow constraint-based analyses to be performed; viability predictions of single knockouts are comparable to results from in vivo experiments and to those of previous reconstructions. CONCLUSIONS We report the development of the most complete reconstruction of yeast metabolism to date that is based upon reliable literature evidence and richly annotated according to MIRIAM standards. The reconstruction is available in the Systems Biology Markup Language (SBML) and via a publicly accessible database http://www.comp-sys-bio.org/yeastnet/.
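The connectivity figure quoted in this abstract (the share of metabolites reachable from the growth medium) can be illustrated with a small network-expansion sketch. The abstract does not say how reachability was computed, so the rule used here (a reaction can fire once all of its substrates are available) is an assumption, and the reactions and metabolite names are toy values chosen for illustration, not data from the reconstruction.

```python
def reachable_metabolites(reactions, seed):
    """Expand from the seed set: a reaction fires once all substrates are reached."""
    reached = set(seed)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            # Fire the reaction only if it would add something new.
            if set(substrates) <= reached and not set(products) <= reached:
                reached |= set(products)
                changed = True
    return reached


# Hypothetical toy network: (substrates, products) pairs.
reactions = [
    (["glucose", "atp"], ["g6p", "adp"]),
    (["g6p"], ["f6p"]),
    (["x"], ["y"]),  # disconnected from the medium, so never fires
]
medium = {"glucose", "atp"}  # hypothetical growth medium constituents

reached = reachable_metabolites(reactions, medium)
all_metabolites = set(medium)
for substrates, products in reactions:
    all_metabolites.update(substrates, products)

fraction = len(reached) / len(all_metabolites)
print(f"{len(reached)} of {len(all_metabolites)} metabolites reachable ({fraction:.0%})")
```

In this toy case the disconnected pair `x` and `y` lowers the reachable fraction, which is the kind of gap the abstract describes as limiting constraint-based analysis in earlier reconstructions.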
Affiliation(s)
- Paul D Dobson
- School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
12
Davis MJ, Sehgal MSB, Ragan MA. Automatic, context-specific generation of Gene Ontology slims. BMC Bioinformatics 2010; 11:498. [PMID: 20929524 PMCID: PMC3098080 DOI: 10.1186/1471-2105-11-498] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 05/18/2010] [Accepted: 10/07/2010] [Indexed: 11/10/2022]
Abstract
Background The use of ontologies to control vocabulary and structure annotation has added value to genome-scale data, and contributed to the capture and re-use of knowledge across research domains. Gene Ontology (GO) is widely used to capture detailed expert knowledge in genomic-scale datasets and as a consequence has grown to contain many terms, making it unwieldy for many applications. To increase its ease of manipulation and efficiency of use, subsets called GO slims are often created by collapsing terms upward into more general, high-level terms relevant to a particular context. Creation of a GO slim currently requires manipulation and editing of GO by an expert (or community) familiar with both the ontology and the biological context. Decisions about which terms to include are necessarily subjective, and the creation process itself and subsequent curation are time-consuming and largely manual. Results Here we present an objective framework for generating customised ontology slims for specific annotated datasets, exploiting information latent in the structure of the ontology graph and in the annotation data. This framework combines ontology engineering approaches, and a data-driven algorithm that draws on graph and information theory. We illustrate this method by application to GO, generating GO slims at different information thresholds, characterising their depth of semantics and demonstrating the resulting gains in statistical power. Conclusions Our GO slim creation pipeline is available for use in conjunction with any GO-annotated dataset, and creates dataset-specific, objectively defined slims. This method is fast and scalable for application to other biomedical ontologies.
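The phrase "collapsing terms upward into more general, high-level terms" can be made concrete with a minimal sketch. It shows only the conventional mapping step, in which each annotation is propagated to any slim term among its ancestors; it is not the data-driven slim selection algorithm the abstract describes. The ontology, term names, and gene names below are hypothetical.

```python
from collections import defaultdict


def ancestors(term, parents):
    """Return the term itself plus every ancestor reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def map_to_slim(annotations, parents, slim_terms):
    """Collapse each gene's annotations upward onto the chosen slim terms."""
    slim = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            for slim_term in ancestors(term, parents) & slim_terms:
                slim[slim_term].add(gene)
    return dict(slim)


# Hypothetical toy ontology: child term -> list of parent terms.
parents = {
    "lipid_biosynthesis": ["lipid_metabolism"],
    "lipid_metabolism": ["metabolism"],
    "glycolysis": ["metabolism"],
    "metabolism": ["process"],
    "dna_repair": ["process"],
}
annotations = {
    "geneA": {"lipid_biosynthesis"},
    "geneB": {"glycolysis"},
    "geneC": {"dna_repair"},
}
slim_terms = {"metabolism", "dna_repair"}

slim = map_to_slim(annotations, parents, slim_terms)
for term in sorted(slim):
    print(term, sorted(slim[term]))
```

Here the two specific metabolic annotations both collapse onto the single slim term `metabolism`, which is why the choice of slim terms determines how much detail survives.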
Affiliation(s)
- Melissa J Davis
- The University of Queensland, Brisbane, QLD 4072, Australia.
13
Chu Y, Yuan X, Guo Y, Zhang Y, Wu Y, Liu H, Wu D, Bao H, Guan L, Jin X. YeastWeb: a workset-centric web resource for gene family analysis in yeast. BMC Genomics 2010; 11:429. [PMID: 20624324 PMCID: PMC2996957 DOI: 10.1186/1471-2164-11-429] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/10/2010] [Accepted: 07/13/2010] [Indexed: 11/10/2022]
Abstract
Background Currently, a number of yeast genomes with different physiological features have been sequenced and annotated, which provides invaluable information for investigating yeast genetics, evolutionary mechanisms, and the structure and function of gene families. Description YeastWeb is a novel database created to provide access to gene families derived from the available yeast genomes by assigning the genes into putative families. It has many useful features that complement existing databases, such as SGD, CYGD and Génolevures: 1) Detailed computational annotation was conducted for each entry with InterProScan, EMBOSS and functional/pathway databases, such as GO, COG and KEGG; 2) A well-established user-friendly environment was created to allow users to retrieve the annotated genes and gene families using a functional classification browser, keyword search or similarity-based search; 3) Workset offers users many powerful functions to manage the retrieved data efficiently, associate the individual items easily and save the intermediate results conveniently; 4) A series of comparative genomics and molecular evolution analysis tools are implemented to allow users to view multiple sequence alignments and phylogenetic trees of gene families. At present, YeastWeb holds the gene families clustered using various MCL inflation values from a total of 13 available yeast genomes. Conclusions Given the great interest in yeast research, YeastWeb has the potential to become a useful resource for the scientific community of yeast biologists and related researchers investigating the evolutionary relationship of yeast gene families. YeastWeb is available at http://centre.bioinformatics.zj.cn/Yeast/.
Affiliation(s)
- Yanhui Chu
- Heilongjiang Key Laboratory of Anti-fibrosis Biotherapy, Mudanjiang Medical University, Heilongjiang 157011, China.
14
Yang S, Pelletier DA, Lu TYS, Brown SD. The Zymomonas mobilis regulator hfq contributes to tolerance against multiple lignocellulosic pretreatment inhibitors. BMC Microbiol 2010; 10:135. [PMID: 20459639 PMCID: PMC2877685 DOI: 10.1186/1471-2180-10-135] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 10/22/2009] [Accepted: 05/07/2010] [Indexed: 11/29/2022]
Abstract
Background Zymomonas mobilis produces near theoretical yields of ethanol and recombinant strains are candidate industrial microorganisms. To date, few studies have examined its responses to various stresses at the gene level. Hfq is a conserved bacterial member of the Sm-like family of RNA-binding proteins, coordinating a broad array of responses including multiple stress responses. In a previous study, we observed that Z. mobilis ZM4 gene ZMO0347 showed higher expression under anaerobic, stationary-phase conditions than under aerobic, stationary-phase conditions. Results We generated a Z. mobilis hfq insertion mutant AcRIM0347 in an acetate tolerant strain (AcR) background and investigated its role in tolerance to model lignocellulosic pretreatment inhibitors, including acetate, vanillin, furfural and hydroxymethylfurfural (HMF). Saccharomyces cerevisiae Lsm protein (Hfq homologue) mutants and Lsm protein overexpression strains were also assayed for their inhibitor phenotypes. Our results indicated that all the pretreatment inhibitors tested in this study had a detrimental effect on both Z. mobilis and S. cerevisiae, and vanillin had the most inhibitory effect, followed by furfural and then HMF, for both Z. mobilis and S. cerevisiae. AcRIM0347 was more sensitive than the parental strain to the inhibitors and had an increased lag phase duration and/or slower growth depending upon the conditions. The hfq mutation in AcRIM0347 was complemented partially by trans-acting hfq gene expression. We also assayed growth phenotypes of S. cerevisiae Lsm protein mutant and overexpression strains. Lsm1, 6, and 7 mutants showed reduced tolerance to acetate and other pretreatment inhibitors. S. cerevisiae Lsm protein overexpression strains showed increased acetate and HMF resistance as compared to the wild-type, while the overexpression strains showed greater inhibition under vanillin stress conditions. Conclusions We have shown the utility of the pKNOCK suicide plasmid for mutant construction in Z. mobilis, and constructed a Gateway compatible expression plasmid for use in Z. mobilis for the first time. We have also used genetics to show Z. mobilis Hfq and S. cerevisiae Lsm proteins play important roles in resisting multiple important, industrially relevant inhibitors. The conserved nature of this global regulator offers the potential to apply insights from these fundamental studies for further industrial strain development.
Affiliation(s)
- Shihui Yang
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.
15
Cai Y, Lux MW, Adam L, Peccoud J. Modeling structure-function relationships in synthetic DNA sequences using attribute grammars. PLoS Comput Biol 2009; 5:e1000529. [PMID: 19816554 PMCID: PMC2748682 DOI: 10.1371/journal.pcbi.1000529] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 04/02/2009] [Accepted: 09/03/2009] [Indexed: 11/18/2022]
Abstract
Recognizing that certain biological functions can be associated with specific DNA sequences has led various fields of biology to adopt the notion of the genetic part. This concept provides a finer level of granularity than the traditional notion of the gene. However, a method of formally relating how a set of parts relates to a function has not yet emerged. Synthetic biology both demands such a formalism and provides an ideal setting for testing hypotheses about relationships between DNA sequences and phenotypes beyond the gene-centric methods used in genetics. Attribute grammars are used in computer science to translate the text of a program source code into the computational operations it represents. By associating attributes with parts, modifying the value of these attributes using rules that describe the structure of DNA sequences, and using a multi-pass compilation process, it is possible to translate DNA sequences into molecular interaction network models. These capabilities are illustrated by simple example grammars expressing how gene expression rates are dependent upon single or multiple parts. The translation process is validated by systematically generating, translating, and simulating the phenotype of all the sequences in the design space generated by a small library of genetic parts. Attribute grammars represent a flexible framework connecting parts with models of biological function. They will be instrumental for building mathematical models of libraries of genetic constructs synthesized to characterize the function of genetic parts. This formalism is also expected to provide a solid foundation for the development of computer assisted design applications for synthetic biology.
Affiliation(s)
- Yizhi Cai
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States of America
- Matthew W. Lux
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States of America
- Laura Adam
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States of America
- Jean Peccoud
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States of America
|
16
|
Berriz GF, Beaver JE, Cenik C, Tasan M, Roth FP. Next generation software for functional trend analysis. Bioinformatics 2009; 25:3043-4. [PMID: 19717575 DOI: 10.1093/bioinformatics/btp498] [Citation(s) in RCA: 198] [Impact Index Per Article: 13.2] [Indexed: 11/14/2022] Open
Abstract
FuncAssociate is a web application that discovers properties enriched in lists of genes or proteins that emerge from large-scale experimentation. Here we describe an updated application with a new interface and several new features. For example, enrichment analysis can now be performed within multiple gene- and protein-naming systems. This feature avoids potentially serious translation artifacts to which other enrichment analysis strategies are subject. AVAILABILITY The FuncAssociate web application is freely available to all users at http://llama.med.harvard.edu/funcassociate.
Affiliation(s)
- Gabriel F Berriz
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 250 Longwood Avenue and Center for Cancer Systems Biology, Dana Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA
|
17
|
Achcar F, Camadro JM, Mestivier D. AutoClass@IJM: a powerful tool for Bayesian classification of heterogeneous data in biology. Nucleic Acids Res 2009; 37:W63-7. [PMID: 19474346 PMCID: PMC2703914 DOI: 10.1093/nar/gkp430] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 11/21/2022] Open
Abstract
Recently, several theoretical and applied studies have shown that unsupervised Bayesian classification systems are of particular relevance for biological studies. However, these systems have not yet fully reached the biological community mainly because there are few freely available dedicated computer programs, and Bayesian clustering algorithms are known to be time consuming, which limits their usefulness when using personal computers. To overcome these limitations, we developed AutoClass@IJM, a computational resource with a web interface to AutoClass, a powerful unsupervised Bayesian classification system developed by the Ames Research Center at N.A.S.A. AutoClass has many powerful features with broad applications in biological sciences: (i) it determines the number of classes automatically, (ii) it allows the user to mix discrete and real valued data, (iii) it handles missing values. End users upload their data sets through our web interface; computations are then queued in our cluster server. When the clustering is completed, a URL to the results is sent back to the user by e-mail. AutoClass@IJM is freely available at: http://ytat2.ijm.univ-paris-diderot.fr/AutoclassAtIJM.html.
Affiliation(s)
- Fiona Achcar
- Modeling in Integrative Biology Group, Jacques Monod Institute, UMR7592 CNRS and Univ Paris-Diderot, Bâtiment Buffon, 15 rue Hélène Brion, 75205 Paris Cedex 13, France
|
18
|
Freyhult E, Edvardsson S, Tamas I, Moulton V, Poole AM. Fisher: a program for the detection of H/ACA snoRNAs using MFE secondary structure prediction and comparative genomics - assessment and update. BMC Res Notes 2008; 1:49. [PMID: 18710502 PMCID: PMC2551606 DOI: 10.1186/1756-0500-1-49] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 04/03/2008] [Accepted: 07/21/2008] [Indexed: 11/25/2022] Open
Abstract
Background The H/ACA family of small nucleolar RNAs (snoRNAs) plays a central role in guiding the pseudouridylation of ribosomal RNA (rRNA). In an effort to systematically identify the complete set of rRNA-modifying H/ACA snoRNAs from the genome sequence of the budding yeast, Saccharomyces cerevisiae, we developed a program – Fisher – and previously presented several candidate snoRNAs based on our analysis [1]. Findings In this report, we provide a brief update of this work, which was aborted after the publication of experimentally-identified snoRNAs [2] identical to candidates we had identified bioinformatically using Fisher. Our motivation for revisiting this work is to report on the status of the candidate snoRNAs described in [1], and secondly, to report that a modified version of Fisher together with the available multiple yeast genome sequences was able to correctly identify several H/ACA snoRNAs for modification sites not identified by the snoGPS program [3]. While we are no longer developing Fisher, we briefly consider the merits of the Fisher algorithm relative to snoGPS, which may be of use for workers considering pursuing a similar search strategy for the identification of small RNAs. The modified source code for Fisher is made available as supplementary material. Conclusion Our results confirm the validity of using minimum free energy (MFE) secondary structure prediction to guide comparative genomic screening for RNA families with few sequence constraints.
Affiliation(s)
- Eva Freyhult
- Linnaeus Centre for Bioinformatics, Uppsala University, Box 598, S-751 24 Uppsala, Sweden; Department of Clinical Microbiology, Clinical Bacteriology, Umeå University, 901 85 Umeå, Sweden.
|
19
|
Caspi R, Foerster H, Fulcher CA, Kaipa P, Krummenacker M, Latendresse M, Paley S, Rhee SY, Shearer AG, Tissier C, Walk TC, Zhang P, Karp PD. The MetaCyc Database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res 2007; 36:D623-31. [PMID: 17965431 PMCID: PMC2238876 DOI: 10.1093/nar/gkm900] [Citation(s) in RCA: 469] [Impact Index Per Article: 27.6] [Indexed: 11/25/2022] Open
Abstract
MetaCyc (MetaCyc.org) is a universal database of metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are curated from the primary scientific literature, and are experimentally determined small-molecule metabolic pathways. Each reaction in a MetaCyc pathway is annotated with one or more well-characterized enzymes. Because MetaCyc contains only experimentally elucidated knowledge, it provides a uniquely high-quality resource for metabolic pathways and enzymes. BioCyc (BioCyc.org) is a collection of more than 350 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the predicted metabolic network of one organism, including metabolic pathways, enzymes, metabolites and reactions predicted by the Pathway Tools software using MetaCyc as a reference database. BioCyc PGDBs also contain predicted operons and predicted pathway hole fillers—predictions of which enzymes may catalyze pathway reactions that have not been assigned to an enzyme. The BioCyc website offers many tools for computational analysis of PGDBs, including comparative analysis and analysis of omics data in a pathway context. The BioCyc PGDBs generated by SRI are offered for adoption by any interested party for the ongoing integration of metabolic and genome-related information about an organism.
Affiliation(s)
- Ron Caspi
- SRI International, 333 Ravenswood, Menlo Park, CA 94025, USA
|
20
|
Nordle AKL, Rios P, Gaulton A, Pulido R, Attwood TK, Tabernero L. Functional assignment of MAPK phosphatase domains. Proteins 2007; 69:19-31. [PMID: 17596826 DOI: 10.1002/prot.21477] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
Mitogen-activated protein kinase (MAPK) pathways are well conserved in most organisms, from yeast to humans. The principal components of these pathways are MAP kinases whose activity is regulated by phosphorylation, implicating various MAPK protein effectors, in particular protein phosphatases that inactivate MAPKs by dephosphorylation. The molecular basis of binding specificity of such regulatory phosphatases to MAPKs is poorly understood. To try to pinpoint potential functional regions within the sequences and to help identify new family members, we have applied a multimotif pattern-recognition approach to characterize two MAPK phosphatase subfamilies (tyrosine-specific and dual specificity) that are crucial in the regulation of MAPKs. We built "fingerprints" for these two subfamilies that are unique to, and highly discriminatory for, each group of proteins. The fingerprints were used in a genome-wide screen, identifying more than 80 MAPK phosphatase domains, several of which were in partial sequences or unclassified proteins. We confirmed experimentally that one predicted MAPK phosphatase orthologue in Xenopus binds to ERK1/2, suggesting a role in MAPK signaling and thus supporting our functional predictions. Further analysis, mapping the fingerprints on the three-dimensional structure of MAPK phosphatases, revealed that some of the fingerprint motifs reside in the N-terminal noncatalytic regions coinciding with reported MAPK binding sites, while others lie within the catalytic phosphatase domain. These results also suggest the presence of putative allosteric sites in the catalytic region for modulation of protein-protein interactions, and provide a framework for future experimental validation.
Affiliation(s)
- Anna K L Nordle
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom
|
21
|
Miranda-Saavedra D, Stark MJR, Packer JC, Vivares CP, Doerig C, Barton GJ. The complement of protein kinases of the microsporidium Encephalitozoon cuniculi in relation to those of Saccharomyces cerevisiae and Schizosaccharomyces pombe. BMC Genomics 2007; 8:309. [PMID: 17784954 PMCID: PMC2078597 DOI: 10.1186/1471-2164-8-309] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.5] [Received: 05/11/2007] [Accepted: 09/04/2007] [Indexed: 12/02/2022] Open
Abstract
Background Microsporidia, parasitic fungi-related eukaryotes infecting many cell types in a wide range of animals (including humans), represent a serious health threat in immunocompromised patients. The 2.9 Mb genome of the microsporidium Encephalitozoon cuniculi is the smallest known of any eukaryote. Eukaryotic protein kinases are a large superfamily of enzymes with crucial roles in most cellular processes, and therefore represent potential drug targets. We report here an exhaustive analysis of the E. cuniculi genomic database aimed at identifying and classifying all protein kinases of this organism with reference to the kinomes of two highly-divergent yeast species, Saccharomyces cerevisiae and Schizosaccharomyces pombe. Results A database search with a multi-level protein kinase family hidden Markov model library led to the identification of 29 conventional protein kinase sequences in the E. cuniculi genome, as well as 3 genes encoding atypical protein kinases. The microsporidian kinome presents striking differences from those of other eukaryotes, and this minimal kinome underscores the importance of conserved protein kinases involved in essential cellular processes. ~30% of its kinases are predicted to regulate cell cycle progression while another ~28% have no identifiable homologues in model eukaryotes and are likely to reflect parasitic adaptations. E. cuniculi lacks MAP kinase cascades and almost all protein kinases that are involved in stress responses, ion homeostasis and nutrient signalling in the model fungi S. cerevisiae and S. pombe, including AMP-activated protein kinase (Snf1), previously thought to be ubiquitous in eukaryotes. A detailed database search and phylogenetic analysis of the kinomes of the two model fungi showed that the degree of homology between their kinomes of ~85% is much higher than that previously reported. Conclusion The E. cuniculi kinome is by far the smallest eukaryotic kinome characterised to date. The difficulty in assigning clear homology relationships for nine out of the twenty-nine microsporidian conventional protein kinases despite its compact genome reflects the phylogenetic distance between microsporidia and other eukaryotes. Indeed, the E. cuniculi genome presents a high proportion of genes in which evolution has been accelerated by up to four-fold. There are no orthologues of the protein kinases that constitute MAP kinase pathways and many other protein kinases with roles in nutrient signalling are absent from the E. cuniculi kinome. However, orthologous kinases can nonetheless be identified that correspond to members of the yeast kinomes with roles in some of the most fundamental cellular processes. For example, E. cuniculi has clear orthologues of virtually all the major conserved protein kinases that regulate the core cell cycle machinery (Aurora, Polo, DDK, CDK and Chk1). A comprehensive comparison of the homology relationships between the budding and fission yeast kinomes indicates that, despite an estimated 800 million years of independent evolution, the two model fungi share ~85% of their protein kinases. This will facilitate the annotation of many of the as yet uncharacterised fission yeast kinases, and also those of novel fungal genomes.
Affiliation(s)
- Diego Miranda-Saavedra
- College of Life Sciences, University of Dundee, Dow St, Dundee DD1 5EH, Scotland, UK
- Cambridge Institute for Medical Research, Wellcome Trust/MRC Building, Addenbrooke's Hospital, Hills Road, Cambridge CB2 0XY, UK
- Michael JR Stark
- College of Life Sciences, University of Dundee, Dow St, Dundee DD1 5EH, Scotland, UK
- Jeremy C Packer
- Division of Advanced Technologies, Abbott Laboratories, 100 Abbott Park Road, Abbott Park, IL 60064, USA
- Christian P Vivares
- Laboratoire de Parasitologie Moléculaire et Cellulaire. UMR CNRS 6023, Université Blaise Pascal, Aubière, France
- Christian Doerig
- INSERM U609, Wellcome Centre for Molecular Parasitology, Glasgow Biomedical Research Centre, 120 University Place, Glasgow G12 8TA, Scotland, UK
- Geoffrey J Barton
- College of Life Sciences, University of Dundee, Dow St, Dundee DD1 5EH, Scotland, UK
|
22
|
Nash R, Weng S, Hitz B, Balakrishnan R, Christie KR, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hong EL, Livstone MS, Oughtred R, Park J, Skrzypek M, Theesfeld CL, Binkley G, Dong Q, Lane C, Miyasato S, Sethuraman A, Schroeder M, Dolinski K, Botstein D, Cherry JM. Expanded protein information at SGD: new pages and proteome browser. Nucleic Acids Res 2006; 35:D468-71. [PMID: 17142221 PMCID: PMC1669759 DOI: 10.1093/nar/gkl931] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Indexed: 11/25/2022] Open
Abstract
The recent explosion in protein data generated from both directed small-scale studies and large-scale proteomics efforts has greatly expanded the quantity of available protein information and has prompted the Saccharomyces Genome Database (SGD) to enhance the depth and accessibility of protein annotations. In particular, we have expanded ongoing efforts to improve the integration of experimental information and sequence-based predictions and have redesigned the protein information web pages. A key feature of this redesign is the development of a GBrowse-derived interactive Proteome Browser customized to improve the visualization of sequence-based protein information. This Proteome Browser has enabled SGD to unify the display of hidden Markov model (HMM) domains, protein family HMMs, motifs, transmembrane regions, signal peptides, hydropathy plots and profile hits using several popular prediction algorithms. In addition, a physico-chemical properties page has been introduced to provide easy access to basic protein information. Improvements to the layout of the Protein Information page and integration of the Proteome Browser will facilitate the ongoing expansion of sequence-specific experimental information captured in SGD, including post-translational modifications and other user-defined annotations. Finally, SGD continues to improve upon the availability of genetic and physical interaction data in an ongoing collaboration with BioGRID by providing direct access to more than 82 000 manually-curated interactions.
Affiliation(s)
- Michael S. Livstone
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Mark Schroeder
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- David Botstein
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- J. Michael Cherry
- To whom correspondence should be addressed. Tel: +1 650 723 7541; Fax: +1 650 725 1534;
|
23
|
Reinders J, Zahedi RP, Pfanner N, Meisinger C, Sickmann A. Toward the Complete Yeast Mitochondrial Proteome: Multidimensional Separation Techniques for Mitochondrial Proteomics. J Proteome Res 2006; 5:1543-54. [PMID: 16823961 DOI: 10.1021/pr050477f] [Citation(s) in RCA: 305] [Impact Index Per Article: 16.9] [Indexed: 11/30/2022]
Abstract
Proteomic analyses of different subcellular compartments, so-called organellar proteomics, facilitate the understanding of cellular functions on a molecular level. In this work, various orthogonal multidimensional separation techniques both on the protein and on the peptide level are compared with regard to the number of identified proteins as well as the classes of proteins accessible by the respective methodology. The most complete overview was achieved by a combination of such orthogonal techniques as shown by the analysis of the yeast mitochondrial proteome. A total of 851 different proteins (PROMITO dataset) were identified by use of multidimensional LC-MS/MS, 1D-SDS-PAGE combined with nano-LC-MS/MS and 2D-PAGE with subsequent MALDI-mass fingerprinting. Our PROMITO approach identified the 749 proteins, which were found in the largest previous study on the yeast mitochondrial proteome, and additionally 102 proteins including 42 open reading frames with unknown function, providing the basis for a more detailed elucidation of mitochondrial processes. Comparison of the different approaches emphasizes a bias of 2D-PAGE against proteins with very high isoelectric points as well as large and hydrophobic proteins, which can be accessed more appropriately by the other methods. While 2D-PAGE has advantages in the possible separation of protein isoforms and quantitative differential profiling, 1D-SDS-PAGE with nano-LC-MS/MS and multidimensional LC-MS/MS are better suited for efficient protein identification as they are less biased against distinct classes of proteins. Thus, comprehensive proteome analyses can only be realized by a combination of such orthogonal approaches, leading to the largest dataset available for the mitochondrial proteome of yeast.
Affiliation(s)
- Joerg Reinders
- Protein Mass Spectrometry and Functional Proteomics Group, Rudolf-Virchow-Center for Experimental Biomedicine, Julius-Maximilians-Universität Würzburg, 97078 Würzburg, Germany
|
24
|
Landry CR, Townsend JP, Hartl DL, Cavalieri D. Ecological and evolutionary genomics of Saccharomyces cerevisiae. Mol Ecol 2006; 15:575-91. [PMID: 16499686 DOI: 10.1111/j.1365-294x.2006.02778.x] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.8] [Indexed: 11/26/2022]
Abstract
Saccharomyces cerevisiae, the budding yeast, is the most thoroughly studied eukaryote at the cellular, molecular, and genetic levels. Yet, until recently, we knew very little about its ecology or population and evolutionary genetics. In recent years, it has been recognized that S. cerevisiae occupies numerous habitats and that populations harbour important genetic variation. There is therefore an increasing interest in understanding the evolutionary forces acting on the yeast genome. Several researchers have used the tools of functional genomics to study natural isolates of this unicellular fungus. Here, we review some of these studies, and show not only that budding yeast is a prime model system to address fundamental molecular and cellular biology questions, but also that it is becoming a powerful model species for ecological and evolutionary genomics studies as well.
Affiliation(s)
- Christian R Landry
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA.
|
25
|
Caspi R, Foerster H, Fulcher CA, Hopkinson R, Ingraham J, Kaipa P, Krummenacker M, Paley S, Pick J, Rhee SY, Tissier C, Zhang P, Karp PD. MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res 2006; 34:D511-6. [PMID: 16381923 PMCID: PMC1347490 DOI: 10.1093/nar/gkj128] [Citation(s) in RCA: 226] [Impact Index Per Article: 12.6] [Indexed: 11/13/2022] Open
Abstract
MetaCyc is a database of metabolic pathways and enzymes located at http://MetaCyc.org/. Its goal is to serve as a metabolic encyclopedia, containing a collection of non-redundant pathways central to small molecule metabolism, which have been reported in the experimental literature. Most of the pathways in MetaCyc occur in microorganisms and plants, although animal pathways are also represented. MetaCyc contains metabolic pathways, enzymatic reactions, enzymes, chemical compounds, genes and review-level comments. Enzyme information includes substrate specificity, kinetic properties, activators, inhibitors, cofactor requirements and links to sequence and structure databases. Data are curated from the primary literature by curators with expertise in biochemistry and molecular biology. MetaCyc serves as a readily accessible comprehensive resource on microbial and plant pathways for genome analysis, basic research, education, metabolic engineering and systems biology. Querying, visualization and curation of the database is supported by SRI's Pathway Tools software. The PathoLogic component of Pathway Tools is used in conjunction with MetaCyc to predict the metabolic network of an organism from its annotated genome. SRI and the European Bioinformatics Institute employed this tool to create pathway/genome databases (PGDBs) for 165 organisms, available at the BioCyc.org website. These PGDBs also include predicted operons and pathway hole fillers.
Affiliation(s)
- Hartmut Foerster
- Department of Plant biology, Carnegie Institution, 260 Panama Street, Stanford, CA 94305, USA
- John Ingraham
- Section of Microbiology, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
- Seung Y. Rhee
- Department of Plant biology, Carnegie Institution, 260 Panama Street, Stanford, CA 94305, USA
- Christophe Tissier
- Department of Plant biology, Carnegie Institution, 260 Panama Street, Stanford, CA 94305, USA
- Peifen Zhang
- Department of Plant biology, Carnegie Institution, 260 Panama Street, Stanford, CA 94305, USA
- Peter D. Karp
- To whom correspondence should be addressed. Tel: +1 650 859 4358; Fax: +1 650 859 3735;
|
26
|
Pehkonen P, Wong G, Törönen P. Theme discovery from gene lists for identification and viewing of multiple functional groups. BMC Bioinformatics 2005; 6:162. [PMID: 15987504 PMCID: PMC1190153 DOI: 10.1186/1471-2105-6-162] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 03/18/2005] [Accepted: 06/29/2005] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND High throughput methods of the genome era produce vast amounts of data in the form of gene lists. These lists are large and difficult to interpret without advanced computational or bioinformatic tools. Most existing methods analyse a gene list as a single entity although it is comprised of multiple gene groups associated with separate biological functions. Therefore it is imperative to define and visualize gene groups with unique functionality within gene lists. RESULTS In order to analyse the functional heterogeneity within a gene list, we have developed a method that clusters genes to groups with homogenous functionalities. The method uses Non-negative Matrix Factorization (NMF) to create several clustering results with varying numbers of clusters. The obtained clustering results are combined into a simple graphical presentation showing the functional groups over-represented in the analyzed gene list. We demonstrate its performance on two data sets and show results that improve upon existing methods. The comparison also shows that our method creates a more simplified view that aids in discovery of biological themes within the list and discards less informative classes from the results. CONCLUSION The presented method and associated software are useful for the identification and interpretation of biological functions associated with gene lists and are especially useful for the analysis of large lists.
Affiliation(s)
- Petri Pehkonen
- Department of Neurobiology, A.I. Virtanen-Institute, University of Kuopio P.O. Box 1627, FIN-70211 Kuopio, Finland
- Department of Computer Science, University of Kuopio P.O. Box 1627, FIN-70211 Kuopio, Finland
- Garry Wong
- Department of Neurobiology, A.I. Virtanen-Institute, University of Kuopio P.O. Box 1627, FIN-70211 Kuopio, Finland
- Petri Törönen
- Department of Neurobiology, A.I. Virtanen-Institute, University of Kuopio P.O. Box 1627, FIN-70211 Kuopio, Finland
- Bioinformatics Group, Institute of Biotechnology, P.O. Box 56, 00014 University of Helsinki, Finland
|
27
|
McDermott J, Bumgarner R, Samudrala R. Functional annotation from predicted protein interaction networks. Bioinformatics 2005; 21:3217-26. [PMID: 15919725 DOI: 10.1093/bioinformatics/bti514] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Progress in large-scale experimental determination of protein-protein interaction networks for several organisms has resulted in innovative methods of functional inference based on network connectivity. However, the amount of effort and resources required for the elucidation of experimental protein interaction networks is prohibitive. Previously we, and others, have developed techniques to predict protein interactions for novel genomes using computational methods and data generated from other genomes. RESULTS We evaluated the performance of a network-based functional annotation method that makes use of our predicted protein interaction networks. We show that this approach performs equally well on experimentally derived and predicted interaction networks, for both manually and computationally assigned annotations. We applied the method to predicted protein interaction networks for over 50 organisms from all domains of life, providing annotations for many previously unannotated proteins and verifying existing low-confidence annotations. AVAILABILITY Functional predictions for over 50 organisms are available at http://bioverse.compbio.washington.edu and datasets used for analysis at http://data.compbio.washington.edu/misc/downloads/nannotation_data/. SUPPLEMENTARY INFORMATION A supplemental appendix gives additional details not in the main text. (http://data.compbio.washington.edu/misc/downloads/nannotation_data/supplement.pdf).
Affiliation(s)
- Jason McDermott
- Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
|
28
|
Conservation and evolution of cis-regulatory systems in ascomycete fungi. PLoS Biol 2004; 2:e398. [PMID: 15534694 PMCID: PMC526180 DOI: 10.1371/journal.pbio.0020398] [Citation(s) in RCA: 179] [Impact Index Per Article: 9.0] [Received: 03/31/2004] [Accepted: 09/09/2004] [Indexed: 12/18/2022] Open
Abstract
Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa. A systematic examination of the gene regulatory elements in ascomycete fungi reveals striking conservation along with some examples of the ways in which regulatory systems can evolve.
|
29
|
Maxwell PH, Coombes C, Kenny AE, Lawler JF, Boeke JD, Curcio MJ. Ty1 mobilizes subtelomeric Y' elements in telomerase-negative Saccharomyces cerevisiae survivors. Mol Cell Biol 2004; 24:9887-98. [PMID: 15509791 PMCID: PMC525482 DOI: 10.1128/mcb.24.22.9887-9898.2004] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.0] [Received: 06/07/2004] [Revised: 07/01/2004] [Accepted: 08/10/2004] [Indexed: 11/20/2022] Open
Abstract
When telomerase is inactivated in Saccharomyces cerevisiae, telomeric DNA shortens with every cell division, and cells stop dividing after approximately 100 generations. Survivors that form in these senescent populations and resume growing have variably amplified arrays of subtelomeric Y' elements. We marked a chromosomal Y' element with the his3AI retrotransposition indicator gene and found that Y'HIS3 cDNA was incorporated into the genome at approximately 10- to 1,000-fold-higher frequencies in survivors compared to telomerase-positive strains. Y'HIS3 cDNA mobility was significantly reduced if assayed at 30°C, a nonpermissive temperature for Ty1 retrotransposition, or in the absence of Tec1p, a transcription factor for Ty1. Microarray analysis revealed that Y' RNA is preferentially associated with Ty1 virus-like particles (VLPs). Genomic copies of Y'HIS3 cDNA typically have downstream oligo(A) tracts, followed by a complete Ty1 long terminal repeat and TYA1 or TYB1 sequences. These data are consistent with the use of Ty1 cDNA to prime reverse transcription of polyadenylated Y' RNA within Ty1 VLPs. Unmarked Y'-oligo(A)-Ty1 cDNA was also detected in survivors, reaching copy numbers of approximately 10^-2 per genome. We propose that Y'-oligo(A)-Ty1 cDNA recombines with Y' elements at eroding telomeres in survivors and may play a role in telomere maintenance in the absence of telomerase.
Affiliation(s)
- Patrick H Maxwell
- Laboratory of Developmental Genetics, Wadsworth Center, and Department of Biomedical Sciences, University at Albany School of Public Health, Albany, New York 12201-2002, USA
|
30
|
Marín A, Wang M, Gutiérrez G. Short-range compositional correlation in the yeast genome depends on transcriptional orientation. Gene 2004; 333:151-5. [PMID: 15177690 DOI: 10.1016/j.gene.2004.02.016] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 11/20/2003] [Revised: 01/21/2004] [Accepted: 02/10/2004] [Indexed: 11/29/2022]
Abstract
This article reports an analysis of the composition of about 5000 intergenic regions and neighboring ORFs in the nuclear genome of Saccharomyces cerevisiae, and of their correlation. Intergenic regions flanked by divergently transcribed ORFs are richer in GC (36%) than those separating convergent ORFs (29%). This difference in GC content cannot be fully attributed to their location upstream or downstream of the ORFs, since no such strong compositional bias is found within 3' and 5' segments of intergenic regions between ORFs transcribed in the same direction. We have also found that the GC content of intergenic regions is positively correlated with that of their flanking ORFs in tandem and divergent orientations, but not in convergent orientations, and that the correlation coefficient between the GC content of nearby ORFs is higher for divergent pairs. Our observations are discussed in the light of recent work stressing the relationships between base composition, chromatin structure and meiotic recombination.
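The two quantities this abstract relies on, GC content and a correlation coefficient between paired GC values, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient or software the authors used, so Pearson's r is an assumption here, and the sequences and the names `gc_content`, `pearson` and `records` are hypothetical toy values rather than data from the paper.

```python
from math import sqrt


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of a DNA sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]  # skip N and other codes
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy records: (orientation of flanking ORFs, intergenic sequence, ORF sequence)
records = [
    ("divergent", "ATGCGC", "GGCCAT"),
    ("divergent", "ATATGC", "ATGCAT"),
    ("divergent", "ATATAT", "ATATAC"),
    ("convergent", "ATATAA", "GGCCGG"),
]

# Pair intergenic GC with flanking-ORF GC for one orientation class, then correlate.
divergent = [(gc_content(i), gc_content(o)) for kind, i, o in records if kind == "divergent"]
inter_gc, orf_gc = zip(*divergent)
r_divergent = pearson(inter_gc, orf_gc)
print(f"divergent pairs: r = {r_divergent:.3f}")
```

Grouping by orientation before correlating mirrors the comparison in the abstract, where tandem, divergent and convergent arrangements are treated as separate classes.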
Affiliation(s)
- Antonio Marín
- Departamento de Genética, Universidad de Sevilla, Apartado 1095, E-41080 Sevilla, Spain.
|
31
|
Stabenau A, McVicker G, Melsopp C, Proctor G, Clamp M, Birney E. The Ensembl core software libraries. Genome Res 2004; 14:929-33. [PMID: 15123588 PMCID: PMC479122 DOI: 10.1101/gr.1857204] [Citation(s) in RCA: 102] [Impact Index Per Article: 5.1] [Indexed: 11/24/2022]
Abstract
Systems for managing genomic data must store a vast quantity of information. Ensembl stores these data in several MySQL databases. The core software libraries provide a practical and effective means for programmers to access these data. By encapsulating the underlying database structure, the libraries present end users with a simple, abstract interface to a complex data model. Programs that use the libraries rather than SQL to access the data are unaffected by most schema changes. The architecture of the core software libraries, the schema, and the factors influencing their design are described. All code and data are freely available.
Affiliation(s)
- Arne Stabenau
- EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
|
32
|
Duarte NC, Herrgård MJ, Palsson BØ. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res 2004; 14:1298-309. [PMID: 15197165 PMCID: PMC442145 DOI: 10.1101/gr.2250904] [Citation(s) in RCA: 435] [Impact Index Per Article: 21.8] [Indexed: 11/25/2022]
Abstract
A fully compartmentalized genome-scale metabolic model of Saccharomyces cerevisiae that accounts for 750 genes and their associated transcripts, proteins, and reactions has been reconstructed and validated. All of the 1149 reactions included in this in silico model are both elementally and charge balanced and have been assigned to one of eight cellular locations (extracellular space, cytosol, mitochondrion, peroxisome, nucleus, endoplasmic reticulum, Golgi apparatus, or vacuole). When in silico predictions of 4154 growth phenotypes were compared to two published large-scale gene deletion studies, an 83% agreement was found between iND750's predictions and the experimental studies. Analysis of the failure modes showed that false predictions were primarily caused by iND750's limited inclusion of cellular processes outside of metabolism. This study systematically identified inconsistencies in our knowledge of yeast metabolism that require specific further experimental investigation.
Affiliation(s)
- Natalie C Duarte
- Department of Bioengineering, University of California-San Diego, La Jolla, California 92093-0412, USA
|
33
|
Toronen P. Selection of informative clusters from hierarchical cluster tree with gene classes. BMC Bioinformatics 2004; 5:32. [PMID: 15043761 PMCID: PMC407846 DOI: 10.1186/1471-2105-5-32] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Received: 08/29/2003] [Accepted: 03/25/2004] [Indexed: 11/18/2022] Open
Abstract
Background: A common clustering method in the analysis of gene expression data has been hierarchical clustering. Usually the analysis involves selection of clusters by cutting the tree at a suitable level and/or analysis of a sorted gene list that is obtained with the tree. Cutting of the hierarchical tree requires the selection of a suitable level and it results in the loss of information on the other levels. Sorted gene lists depend on the sorting method of the joined clusters. The author proposes that the clusters should be selected using the gene classifications. Results: This article presents a simple method for searching for clusters with the strongest enrichment of gene classes from a cluster tree. The clusters found are presented in the estimated order of importance. The method is demonstrated with a yeast gene expression data set and with two database classifications. The obtained clusters demonstrated a very strong enrichment of functional classes. The obtained clusters are also able to present similar gene groups to those that were observed from the data set in the original analysis and also many gene groups that were not reported in the original analysis. Visualization of the results on top of a cluster tree shows that the method finds informative clusters from several levels of the cluster tree and indicates that the clusters found could not have been obtained by simply cutting the cluster tree. Results were also used in the comparison of cluster trees from different clustering methods. Conclusion: The presented method should facilitate the exploratory analysis of big data sets when the associated categorical data is available.
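The idea of scoring every cluster in a tree by its enrichment for a gene class, rather than cutting the tree at one level, can be sketched as follows. The abstract does not state which enrichment statistic the method uses; the one-sided hypergeometric tail probability below is a common choice and is an assumption here, as are the toy genes, clusters and the names `hypergeom_pvalue` and `best_cluster`.

```python
from math import comb


def hypergeom_pvalue(population: int, class_size: int, drawn: int, hits: int) -> float:
    """P(X >= hits) when `drawn` genes are sampled without replacement from a
    population containing `class_size` members of the gene class."""
    upper = min(class_size, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(class_size, i) * comb(population - class_size, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


def best_cluster(clusters, gene_class, all_genes):
    """Return (p_value, cluster) for the cluster most enriched for gene_class."""
    scored = [
        (hypergeom_pvalue(len(all_genes), len(gene_class), len(c), len(c & gene_class)), c)
        for c in clusters
    ]
    return min(scored, key=lambda pair: pair[0])


# Hypothetical toy data: eight genes, one class, and the nodes of a small cluster tree.
all_genes = frozenset("abcdefgh")
gene_class = frozenset("abc")
tree_nodes = [
    frozenset("ab"),
    frozenset("abc"),
    frozenset("abcd"),
    frozenset("ef"),
    all_genes,  # the root of the tree
]

p_value, cluster = best_cluster(tree_nodes, gene_class, all_genes)
print(sorted(cluster), p_value)
```

Because every node of the tree is scored, the selected cluster can come from any depth, which is the property the abstract contrasts with cutting the tree at a single level.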
Affiliation(s)
- Petri Toronen
- A. I. Virtanen Institute for Molecular Sciences, Neulaniementie 2, P.O. Box 1627, FIN-70211 Kuopio, Finland.
|
34
|
Krieger CJ, Zhang P, Mueller LA, Wang A, Paley S, Arnaud M, Pick J, Rhee SY, Karp PD. MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res 2004; 32:D438-42. [PMID: 14681452 PMCID: PMC308834 DOI: 10.1093/nar/gkh100] [Citation(s) in RCA: 193] [Impact Index Per Article: 9.7] [Indexed: 11/13/2022] Open
Abstract
The MetaCyc database (see URL http://MetaCyc.org) is a collection of metabolic pathways and enzymes from a wide variety of organisms, primarily microorganisms and plants. The goal of MetaCyc is to contain a representative sample of each experimentally elucidated pathway, and thereby to catalog the universe of metabolism. MetaCyc also describes reactions, chemical compounds and genes. Many of the pathways and enzymes in MetaCyc contain extensive information, including comments and literature citations. SRI's Pathway Tools software supports querying, visualization and curation of MetaCyc. With its wide breadth and depth of metabolic information, MetaCyc is a valuable resource for a variety of applications. MetaCyc is the reference database of pathways and enzymes that is used in conjunction with SRI's metabolic pathway prediction program to create Pathway/Genome Databases that can be augmented with curation from the scientific literature and published on the world wide web. MetaCyc also serves as a readily accessible comprehensive resource on microbial and plant pathways for genome analysis, basic research, education, metabolic engineering and systems biology. In the past 2 years the data content and the Pathway Tools software used to query, visualize and edit MetaCyc have been expanded significantly. These enhancements are described in this paper.
|
35
|
Abstract
RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes.
Affiliation(s)
- Akihiro Nakao
- Department of Biotechnology, Research Center for Frontier Bioscience, Miyazaki University, 5200 Kihara, Kiyotake, Miyazaki 889-1692, Japan
|
36
|
Hertz-Fowler C, Peacock CS, Wood V, Aslett M, Kerhornou A, Mooney P, Tivey A, Berriman M, Hall N, Rutherford K, Parkhill J, Ivens AC, Rajandream MA, Barrell B. GeneDB: a resource for prokaryotic and eukaryotic organisms. Nucleic Acids Res 2004; 32:D339-43. [PMID: 14681429 PMCID: PMC308742 DOI: 10.1093/nar/gkh007] [Citation(s) in RCA: 188] [Impact Index Per Article: 9.4] [Indexed: 11/12/2022] Open
Abstract
GeneDB (http://www.genedb.org/) is a genome database for prokaryotic and eukaryotic organisms. The resource provides a portal through which data generated by the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute and other collaborating sequencing centres can be made publicly available. It combines data from finished and ongoing genome and expressed sequence tag (EST) projects with curated annotation, which can be searched, sorted and downloaded using a single web-based resource. The current release stores 11 datasets, of which six are curated and maintained by biologists, who review and incorporate information from the scientific literature, public databases and the respective research communities.
Affiliation(s)
- Christiane Hertz-Fowler
- The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
|
37
|
Hermjakob H, Montecchi-Palazzi L, Lewington C, Mudali S, Kerrien S, Orchard S, Vingron M, Roechert B, Roepstorff P, Valencia A, Margalit H, Armstrong J, Bairoch A, Cesareni G, Sherman D, Apweiler R. IntAct: an open source molecular interaction database. Nucleic Acids Res 2004; 32:D452-5. [PMID: 14681455 PMCID: PMC308786 DOI: 10.1093/nar/gkh052] [Citation(s) in RCA: 624] [Impact Index Per Article: 31.2] [Indexed: 11/13/2022] Open
Abstract
IntAct provides an open source database and toolkit for the storage, presentation and analysis of protein interactions. The web interface provides both textual and graphical representations of protein interactions, and allows exploring interaction networks in the context of the GO annotations of the interacting proteins. A web service allows direct computational access to retrieve interaction networks in XML format. IntAct currently contains approximately 2200 binary and complex interactions imported from the literature and curated in collaboration with the Swiss-Prot team, making intensive use of controlled vocabularies to ensure data consistency. All IntAct software, data and controlled vocabularies are available at http://www.ebi.ac.uk/intact.
|
38
|
Wiederkehr C, Basavaraj R, Sarrauste de Menthière C, Hermida L, Koch R, Schlecht U, Amon A, Brachat S, Breitenbach M, Briza P, Caburet S, Cherry M, Davis R, Deutschbauer A, Dickinson HG, Dumitrescu T, Fellous M, Goldman A, Grootegoed JA, Hawley R, Ishii R, Jégou B, Kaufman RJ, Klein F, Lamb N, Maro B, Nasmyth K, Nicolas A, Orr-Weaver T, Philippsen P, Pineau C, Rabitsch KP, Reinke V, Roest H, Saunders W, Schröder M, Schedl T, Siep M, Villeneuve A, Wolgemuth DJ, Yamamoto M, Zickler D, Esposito RE, Primig M. GermOnline, a cross-species community knowledgebase on germ cell differentiation. Nucleic Acids Res 2004; 32:D560-7. [PMID: 14681481 PMCID: PMC308789 DOI: 10.1093/nar/gkh055] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022] Open
Abstract
GermOnline provides information and microarray expression data for genes involved in mitosis and meiosis, gamete formation and germ line development across species. The database has been developed, and is being curated and updated, by life scientists in cooperation with bioinformaticists. Information is contributed through an online form using free text, images and the controlled vocabulary developed by the Gene Ontology Consortium. Authors provide up to three references in support of their contribution. The database is governed by an international board of scientists to ensure a standardized data format and the highest quality of GermOnline's information content. Release 2.0 provides exclusive access to microarray expression data from Saccharomyces cerevisiae and Rattus norvegicus, as well as curated information on approximately 700 genes from various organisms. The locus report pages include links to external databases that contain relevant annotation, microarray expression and proteome data. Conversely, the Saccharomyces Genome Database (SGD), S. cerevisiae GeneDB and Swiss-Prot link to the budding yeast section of GermOnline from their respective locus pages. GermOnline, a fully operational prototype subject-oriented knowledgebase designed for community annotation and array data visualization, is accessible at http://www.germonline.org. The target audience includes researchers who work on mitotic cell division, meiosis, gametogenesis, germ line development, human reproductive health and comparative genomics.
Affiliation(s)
- C Wiederkehr
- Biozentrum and Swiss Institute of Bioinformatics, Basel, Switzerland
|
39
|
Spirin V, Mirny LA. Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci U S A 2003; 100:12123-8. [PMID: 14517352 PMCID: PMC218723 DOI: 10.1073/pnas.2032324100] [Citation(s) in RCA: 744] [Impact Index Per Article: 35.4] [Indexed: 11/18/2022] Open
Abstract
Proteins, nucleic acids, and small molecules form a dense network of molecular interactions in a cell. Molecules are nodes of this network, and the interactions between them are edges. The architecture of molecular networks can reveal important principles of cellular organization and function, similarly to the way that protein structure tells us about the function and organization of a protein. Computational analysis of molecular networks has been primarily concerned with node degree [Wagner, A. & Fell, D. A. (2001) Proc. R. Soc. London Ser. B 268, 1803-1810; Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabasi, A. L. (2000) Nature 407, 651-654] or degree correlation [Maslov, S. & Sneppen, K. (2002) Science 296, 910-913], and hence focused on single/two-body properties of these networks. Here, by analyzing the multibody structure of the network of protein-protein interactions, we discovered molecular modules that are densely connected within themselves but sparsely connected with the rest of the network. Comparison with experimental data and functional annotation of genes showed two types of modules: (i) protein complexes (splicing machinery, transcription factors, etc.) and (ii) dynamic functional units (signaling cascades, cell-cycle regulation, etc.). Discovered modules are highly statistically significant, as is evident from comparison with random graphs, and are robust to noise in the data. Our results provide strong support for the network modularity principle introduced by Hartwell et al. [Hartwell, L. H., Hopfield, J. J., Leibler, S. & Murray, A. W. (1999) Nature 402, C47-C52], suggesting that found modules constitute the "building blocks" of molecular networks.
Affiliation(s)
- Victor Spirin
- Harvard-MIT Division of Health Sciences and Technology, 16-343, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
|
40
|
|
41
|
Wang D, Harper JF, Gribskov M. Systematic trans-genomic comparison of protein kinases between Arabidopsis and Saccharomyces cerevisiae. PLANT PHYSIOLOGY 2003; 132:2152-65. [PMID: 12913170 PMCID: PMC181299 DOI: 10.1104/pp.103.021485] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.9] [Received: 02/03/2003] [Revised: 03/26/2003] [Accepted: 05/07/2003] [Indexed: 05/18/2023]
Abstract
The genome of the budding yeast (Saccharomyces cerevisiae) provides an important paradigm for transgenomic comparisons with other eukaryotic species. Here, we report a systematic comparison of the protein kinases of yeast (119 kinases) and a reference plant Arabidopsis (1,019 kinases). Using a whole-protein-based, hierarchical clustering approach, the complete set of protein kinases from both species was clustered. We validated our clustering by three observations: (a) clustering pattern of functional orthologs proven in genetic complementation experiments, (b) consistency with reported classifications of yeast kinases, and (c) consistency with the biochemical properties of those Arabidopsis kinases already experimentally characterized. The clustering pattern identified no overlap between yeast kinases and the receptor-like kinases (RLKs) of Arabidopsis. Ten more kinase families were found to be specific for one of the two species. Among them, the calcium-dependent protein kinase and phosphoenolpyruvate carboxylase kinase families are specific for plants, whereas the Ca2+/calmodulin-dependent protein kinase and provirus insertion in mouse-like kinase families were found only in yeast and animals. Three yeast kinase families, nitrogen permease reactivator/halotolerance-5, polyamine transport kinase, and negative regulator of sexual conjugation and meiosis, are absent in both plants and animals. The majority of yeast kinase families (21 of 26) display Arabidopsis counterparts, and all are mapped into Arabidopsis families of intracellular kinases that are not related to RLKs. Representatives from 11 of the common families (54 kinases from Arabidopsis and 17 from yeast) share an extremely high degree of similarity (BLAST E value < 10^-80), suggesting the likelihood of orthologous functions. Selective expansion of yeast kinase families was observed in Arabidopsis. This is most evident for yeast genes CBK1, HRR25, and SNF1 and the kinase family S6K. Reduction of kinase families was also observed, as in the case of the NEK-like family. The distinguishing features between the two sets of kinases are the selective expansion of yeast families and the generation of a limited number of new kinase families for new functionality in Arabidopsis, most notably, the Arabidopsis RLKs that constitute important components of plant intercellular communication apparatus.
Affiliation(s)
- Degeng Wang
- San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0537, USA.
|
42
|
Current awareness on yeast. Yeast 2003; 20:837-44. [PMID: 12886942 DOI: 10.1002/yea.946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
43
|
Zehetner G. OntoBlast function: From sequence similarities directly to potential functional annotations by ontology terms. Nucleic Acids Res 2003; 31:3799-803. [PMID: 12824422 PMCID: PMC168962 DOI: 10.1093/nar/gkg555] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.0] [Indexed: 11/13/2022] Open
Abstract
OntoBlast allows one to find information about potential functions of proteins by presenting a weighted list of ontology entries associated with similar sequences from completely sequenced genomes identified in a BLAST search. It combines, in a single analysis step, the search for sequence similarities in several species with the association of information stored in ontologies. From each identified ontology term a list of genes, which share the functional annotation, can be retrieved. The OntoBlast function is an integral part of the 'Ontologies TO GenomeMatrix' tool which provides an alternative entry point from ontology terms to the Genome-Matrix database. OntoBlast's web interface is accessible on the 'Ontologies TO GenomeMatrix Gate' page at http://functionalgenomics.de/ontogate/.
Affiliation(s)
- Günther Zehetner
- Max-Planck-Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany.
|
44
|
Brannetti B, Helmer-Citterich M. iSPOT: A web tool to infer the interaction specificity of families of protein modules. Nucleic Acids Res 2003; 31:3709-11. [PMID: 12824399 PMCID: PMC168998 DOI: 10.1093/nar/gkg592] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022] Open
Abstract
iSPOT (http://cbm.bio.uniroma2.it/ispot) is a web tool developed to infer the recognition specificity of protein module families; it is based on the SPOT procedure that utilizes information from position-specific contacts, derived from the available domain/ligand complexes of known structure, and experimental interaction data to build a database of residue-residue contact frequencies. iSPOT is available to infer the interaction specificity of PDZ, SH3 and WW domains. For each family of protein domains, iSPOT evaluates the probability of interaction between a query domain of the specified families and an input protein/peptide sequence and makes it possible to search for potential binding partners of a given domain within the SWISS-PROT database. The experimentally derived interaction data utilized to build the PDZ, SH3 and WW databases of residue-residue contact frequencies are also accessible. Here we describe the application to the WW family of protein modules.
Affiliation(s)
- Barbara Brannetti
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, 00133 Rome, Italy
|
45
|
Affiliation(s)
- Sandra J Jacobson
- Department of Biology, University of California, San Diego, La Jolla, California 92093-0347, USA
|