1
|
Engel SR, Wong ED, Nash RS, Aleksander S, Alexander M, Douglass E, Karra K, Miyasato SR, Simison M, Skrzypek MS, Weng S, Cherry JM. New data and collaborations at the Saccharomyces Genome Database: updated reference genome, alleles, and the Alliance of Genome Resources. Genetics 2022; 220:iyab224. [PMID: 34897464 PMCID: PMC9209811 DOI: 10.1093/genetics/iyab224] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/15/2021] [Accepted: 11/11/2021] [Indexed: 02/03/2023]
Abstract
Saccharomyces cerevisiae is used to provide fundamental understanding of eukaryotic genetics, gene product function, and cellular biological processes. Saccharomyces Genome Database (SGD) has been supporting the yeast research community since 1993, serving as its de facto hub. Over the years, SGD has maintained the genetic nomenclature, chromosome maps, and functional annotation, and developed various tools and methods for analysis and curation of a variety of emerging data types. More recently, SGD and six other model organism focused knowledgebases have come together to create the Alliance of Genome Resources to develop sustainable genome information resources that promote and support the use of various model organisms to understand the genetic and genomic bases of human biology and disease. Here we describe recent activities at SGD, including the latest reference genome annotation update, the development of a curation system for mutant alleles, and new pages addressing homology across model organisms as well as the use of yeast to study human disease.
Affiliation(s)
- Stacia R Engel
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Edith D Wong
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Robert S Nash
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Suzi Aleksander
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Micheal Alexander
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Eric Douglass
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Kalpana Karra
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Stuart R Miyasato
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Matt Simison
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Marek S Skrzypek
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Shuai Weng
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- J Michael Cherry
  - Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
|
2
|
Systematic Structure-Based Search for Ochratoxin-Degrading Enzymes in Proteomes from Filamentous Fungi. Biomolecules 2021; 11:biom11071040. [PMID: 34356666 PMCID: PMC8301969 DOI: 10.3390/biom11071040] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/19/2021] [Revised: 07/13/2021] [Accepted: 07/15/2021] [Indexed: 01/05/2023]
Abstract
(1) Background: ochratoxins are mycotoxins produced by filamentous fungi with important implications in the food manufacturing industry due to their toxicity. Decontamination by specific ochratoxin-degrading enzymes has become an interesting alternative for the treatment of contaminated food commodities. (2) Methods: using a structure-based approach that combines homology modeling, blind molecular docking of substrates, and characterization of low-frequency protein motions, we performed proteome mining in filamentous fungi to characterize new enzymes with potential ochratoxinase activity. (3) Results: the proteome mining results demonstrated the ubiquitous presence of fungal binuclear zinc-dependent amido-hydrolases with a high degree of structural homology to the already characterized ochratoxinase from Aspergillus niger. Ochratoxinase-like enzymes from ochratoxin-producing fungi showed more favorable substrate-binding pockets to accommodate ochratoxins A and B. (4) Conclusions: filamentous fungi are an interesting and rich source of hydrolases potentially capable of degrading ochratoxins, and could be used for the detoxification of diverse food commodities.
|
3
|
Pires RH, Cataldi TR, Franceschini LM, Labate MV, Fusco-Almeida AM, Labate CA, Palma MS, Soares Mendes-Giannini MJ. Metabolic profiles of planktonic and biofilm cells of Candida orthopsilosis. Future Microbiol 2016; 11:1299-1313. [PMID: 27662506 DOI: 10.2217/fmb-2016-0025] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023]
Abstract
AIM This study aims to understand which Candida orthopsilosis proteins aid the adaptation of the fungus when it switches from planktonic growth to biofilm. MATERIALS & METHODS Ion mobility separation combined with mass spectrometry analysis was used. RESULTS Proteins mapped for different biosynthetic pathways showed that selective ribosome autophagy might occur in biofilms. Glucose, used as a carbon source in the glycolytic flux, changed to glycogen and trehalose. CONCLUSION Candida orthopsilosis expresses proteins that combine a variety of mechanisms to provide yeasts with the means to adjust the catalytic properties of enzymes. Adjustment of the enzymes helps modulate the biosynthesis/degradation rates of the available nutrients, in order to control and coordinate the metabolic pathways that enable cells to express an adequate response to nutrient availability.
Affiliation(s)
- Regina Helena Pires
  - Department of Clinical Analysis, Clinical Mycology Laboratory, Faculdade de Ciências Farmacêuticas, UNESP - Univ Estadual Paulista Júlio de Mesquita Filho, FCFAr, Rodovia Araraquara-Jaú, km1, Araraquara 14801-902, SP, Brazil
- Thaís Regiani Cataldi
  - Department of Genetics, ESALQ/USP - Univ de São Paulo, Laboratory Max Feffer Plant Genetics, Av. Pádua Dias 11, Caixa Postal 83, Piracicaba 13400-970, SP, Brazil
- Livia Maria Franceschini
  - Department of Genetics, ESALQ/USP - Univ de São Paulo, Laboratory Max Feffer Plant Genetics, Av. Pádua Dias 11, Caixa Postal 83, Piracicaba 13400-970, SP, Brazil
- Mônica Veneziano Labate
  - Department of Genetics, ESALQ/USP - Univ de São Paulo, Laboratory Max Feffer Plant Genetics, Av. Pádua Dias 11, Caixa Postal 83, Piracicaba 13400-970, SP, Brazil
- Ana Marisa Fusco-Almeida
  - Department of Clinical Analysis, Clinical Mycology Laboratory, Faculdade de Ciências Farmacêuticas, UNESP - Univ Estadual Paulista Júlio de Mesquita Filho, FCFAr, Rodovia Araraquara-Jaú, km1, Araraquara 14801-902, SP, Brazil
- Carlos Alberto Labate
  - Department of Genetics, ESALQ/USP - Univ de São Paulo, Laboratory Max Feffer Plant Genetics, Av. Pádua Dias 11, Caixa Postal 83, Piracicaba 13400-970, SP, Brazil
- Mario Sérgio Palma
  - Department of Biology, Lab. Structural Biology & Zoochemistry, CEIS, Univ Estadual Paulista Júlio de Mesquita Filho, UNESP, Institute of Biosciences, Av. 24-A, 1515. Bela Vista, Rio Claro 13506-900, SP, Brazil
- Maria José Soares Mendes-Giannini
  - Department of Clinical Analysis, Clinical Mycology Laboratory, Faculdade de Ciências Farmacêuticas, UNESP - Univ Estadual Paulista Júlio de Mesquita Filho, FCFAr, Rodovia Araraquara-Jaú, km1, Araraquara 14801-902, SP, Brazil
|
4
|
Development and targeting of transcriptional regulatory network controlling FLU1 activation in Candida albicans for novel antifungals. J Mol Graph Model 2016; 69:1-7. [DOI: 10.1016/j.jmgm.2016.07.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/09/2016] [Revised: 06/24/2016] [Accepted: 07/25/2016] [Indexed: 11/19/2022]
|
5
|
Yin H, Nie L, Zhao F, Zhou H, Li H, Dong X, Zhang H, Wang Y, Shi Q, Li J. De novo assembly and characterization of the Chinese three-keeled pond turtle (Mauremys reevesii) transcriptome: presence of longevity-related genes. PeerJ 2016; 4:e2062. [PMID: 27257545 PMCID: PMC4888314 DOI: 10.7717/peerj.2062] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/14/2015] [Accepted: 05/01/2016] [Indexed: 12/20/2022]
Abstract
Mauremys reevesii (Geoemydidae) is one of the most common and widespread semi-aquatic turtles in East Asia. The unusually long lifespan of some individuals makes this turtle species a potentially useful model organism for studying the molecular basis of longevity. In this study, pooled total RNA extracted from the liver, spleen, and skeletal muscle of three adult individuals was sequenced using the Illumina HiSeq 2500 platform. A set of telomere-related genes was found in the transcriptome, including tert, tep1, and six genes encoding shelterin complex proteins (trf1, trf2, tpp1, pot1, tin2 and rap1). The products of these genes protect chromosome ends from deterioration and therefore significantly contribute to turtle longevity. The transcriptome data generated in this study provide a comprehensive reference for future molecular studies in the turtle.
Affiliation(s)
- Huazong Yin
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Liuwang Nie
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Feifei Zhao
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Huaxing Zhou
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Haifeng Li
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Xianmei Dong
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Huanhuan Zhang
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Yuqin Wang
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Qiong Shi
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
- Jun Li
  - College of Life Science, Anhui Normal University, Provincial Key Lab of the Conservation and Exploitation Research of Biological Resources in Anhui, Wuhu, Anhui, China
|
6
|
Zhao X, Tang J, Wang X, Yang R, Zhang X, Gu Y, Li X, Ma M. YNL134C from Saccharomyces cerevisiae encodes a novel protein with aldehyde reductase activity for detoxification of furfural derived from lignocellulosic biomass. Yeast 2015; 32:409-22. [PMID: 25656244 DOI: 10.1002/yea.3068] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 10/03/2014] [Revised: 12/28/2014] [Accepted: 01/28/2015] [Indexed: 02/03/2023]
Abstract
Furfural and 5-hydroxymethylfurfural (HMF) are the two main aldehyde compounds derived from pentoses and hexoses, respectively, during lignocellulosic biomass pretreatment. These two compounds inhibit microbial growth and interfere with subsequent alcohol fermentation. Saccharomyces cerevisiae has the in situ ability to detoxify furfural and HMF to the less toxic 2-furanmethanol (FM) and furan-2,5-dimethanol (FDM), respectively. Herein, we report that an uncharacterized gene, YNL134C, was highly up-regulated under furfural or HMF stress and Yap1p and Msn2/4p transcription factors likely controlled its up-regulated expression. Enzyme activity assays showed that YNL134C is an NADH-dependent aldehyde reductase, which plays a role in detoxification of furfural to FM. However, no NADH- or NADPH-dependent enzyme activity was observed for detoxification of HMF to FDM. This enzyme did not catalyse the reverse reaction of FM to furfural or FDM to HMF. Further studies showed that YNL134C is a broad-substrate aldehyde reductase, which can reduce multiple aldehydes to their corresponding alcohols. Although YNL134C is grouped into the quinone oxidoreductase family, no quinone reductase activity was observed using 1,2-naphthoquinone or 9,10-phenanthrenequinone as a substrate, and phylogenetic analysis indicates that it is genetically distant to quinone reductases. Proteins similar to YNL134C in sequence from S. cerevisiae and other microorganisms were phylogenetically analysed.
Affiliation(s)
- Xianxian Zhao
  - Institute of Ecological and Environmental Sciences, College of Resources and Environmental Sciences, Sichuan Agricultural University, Wenjiang, Sichuan, People's Republic of China
|
7
|
Chen SH, Shah AH, Segev N. Ypt31/32 GTPases and their F-Box effector Rcy1 regulate ubiquitination of recycling proteins. Cell Logist 2014; 1:21-31. [PMID: 21686101 DOI: 10.4161/cl.1.1.14695] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 09/13/2010] [Revised: 12/31/2010] [Accepted: 01/03/2011] [Indexed: 11/19/2022]
Abstract
Ypt/Rab GTPases are conserved molecular switches that regulate the different steps of intracellular trafficking pathways. In yeast, the Ypt31/32 GTPases are required for exit from the trans-Golgi and for recycling from the plasma membrane (PM), through early endosomes, to the Golgi. We have previously shown that the recycling function of Ypt31/32 is mediated by an effector called Rcy1. Specifically, both Ypt31/32 and Rcy1 are required for recycling the vSNARE Snc1. Rcy1 contains an F-box domain shared by proteins that act in substrate recognition of ubiquitin ligases. Here, we show that both Ypt31/32 and Rcy1 are important for Snc1 ubiquitination and that such ubiquitination plays a role in Snc1 recycling. Direct interaction between Rcy1 and Snc1 was demonstrated using two independent approaches. In vitro interaction was observed using co-precipitation of recombinant proteins, whereas interaction in yeast cells was observed using bimolecular fluorescence complementation. Ubiquitination of Snc1 in vivo at the K63 position was previously shown in a proteomic study. We show that the Snc1-K63R mutant protein is less ubiquitinated than wild-type Snc1 and is defective in endosome-to-Golgi transport. Additionally, wild-type Snc1 is ubiquitinated to a lesser extent in ypt31/32ts and rcy1Δ mutant cells and Snc1 recycling is also blocked in endosomes in these mutants. Therefore, ubiquitination plays a role in the recycling of Snc1 from the PM to the Golgi, and Ypt31/32 and Rcy1 regulate this ubiquitination. Together, these results suggest a new role for ubiquitination in cargo recycling. Moreover, we propose that Ypt/Rabs integrate intracellular trafficking with ubiquitination.
Affiliation(s)
- Shu H Chen
  - Department of Biological Sciences; Laboratory for Molecular Biology; University of Illinois at Chicago; Chicago, IL USA
|
8
|
Candidate target genes for the Saccharomyces cerevisiae transcription factor, Yap2. Folia Microbiol (Praha) 2013; 58:403-8. [DOI: 10.1007/s12223-013-0224-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/01/2012] [Accepted: 01/09/2013] [Indexed: 11/28/2022]
|
9
|
Subtil T, Boles E. Competition between pentoses and glucose during uptake and catabolism in recombinant Saccharomyces cerevisiae. Biotechnol Biofuels 2012; 5:14. [PMID: 22424089 PMCID: PMC3364893 DOI: 10.1186/1754-6834-5-14] [Citation(s) in RCA: 104] [Impact Index Per Article: 8.7] [Received: 09/21/2011] [Accepted: 03/16/2012] [Indexed: 05/21/2023]
Abstract
BACKGROUND In mixed sugar fermentations with recombinant Saccharomyces cerevisiae strains able to ferment D-xylose and L-arabinose, the pentose sugars are normally only utilized after depletion of D-glucose. This has been attributed to competitive inhibition of pentose uptake by D-glucose, as pentose sugars are taken up into yeast cells by individual members of the yeast hexose transporter family. We wanted to investigate whether D-glucose inhibits pentose utilization only by blocking its uptake or also by interfering with its further metabolism. RESULTS To distinguish between inhibitory effects of D-glucose on pentose uptake and pentose catabolism, maltose was used as an alternative carbon source in maltose-pentose co-consumption experiments. Maltose is taken up by a specific maltose transport system and hydrolyzed only intracellularly into two D-glucose molecules. Pentose consumption decreased by about 20-30% during the simultaneous utilization of maltose, indicating that hexose catabolism can impede pentose utilization. To test whether intracellular D-glucose might impair pentose utilization, hexo-/glucokinase deletion mutants were constructed. Those mutants are known to accumulate intracellular D-glucose when incubated with maltose. However, pentose utilization was not affected in the presence of maltose. Addition of increasing concentrations of D-glucose to the hexo-/glucokinase mutants finally completely blocked D-xylose as well as L-arabinose consumption, indicating a pronounced inhibitory effect of D-glucose on pentose uptake. Nevertheless, constitutive overexpression of pentose-transporting hexose transporters like Hxt7 and Gal2 could improve pentose consumption in the presence of D-glucose. CONCLUSION Our results confirm that D-glucose impairs the simultaneous utilization of pentoses mainly due to inhibition of pentose uptake.
Whereas intracellular D-glucose does not seem to have an inhibitory effect on pentose utilization, further catabolism of D-glucose can also impede pentose utilization. Nevertheless, the results suggest that co-fermentation of pentoses in the presence of D-glucose can significantly be improved by the overexpression of pentose transporters, especially if they are not inhibited by D-glucose.
Affiliation(s)
- Thorsten Subtil
  - Institute of Molecular Biosciences, Goethe-University Frankfurt am Main, Max-von-Laue-Str. 9, D-60438 Frankfurt am Main, Germany
- Eckhard Boles
  - Institute of Molecular Biosciences, Goethe-University Frankfurt am Main, Max-von-Laue-Str. 9, D-60438 Frankfurt am Main, Germany
|
10
|
Drosophila Inducer of MEiosis 4 (IME4) is required for Notch signaling during oogenesis. Proc Natl Acad Sci U S A 2011; 108:14855-60. [PMID: 21873203 DOI: 10.1073/pnas.1111577108] [Citation(s) in RCA: 132] [Impact Index Per Article: 10.2] [Indexed: 11/18/2022]
Abstract
N(6)-methyladenosine is a nonediting RNA modification found in mRNA of all eukaryotes, from yeast to humans. Although the functional significance of N(6)-methyladenosine is unknown, the Inducer of MEiosis 4 (IME4) gene of Saccharomyces cerevisiae, which encodes the enzyme that catalyzes this modification, is required for gametogenesis. Here we find that the Drosophila IME4 homolog, Dm ime4, is expressed in ovaries and testes, indicating an evolutionarily conserved function for this enzyme in gametogenesis. In contrast to yeast, but as in Arabidopsis, Dm ime4 is essential for viability. Lethality is rescued fully by a wild-type transgenic copy of Dm ime4 but not by introducing mutations shown to abrogate the catalytic activity of yeast Ime4, indicating functional conservation of the catalytic domain. The phenotypes of hypomorphic alleles of Dm ime4 that allow recovery of viable adults reveal critical functions for this gene in oogenesis. Ovarioles from Dm ime4 mutants have fused egg chambers with follicle-cell defects similar to those observed when Notch signaling is defective. Indeed, using a reporter for Notch activation, we find markedly reduced levels of Notch signaling in follicle cells of Dm ime4 mutants. This phenotype of Dm ime4 mutants is rescued by inducing expression of a constitutively activated form of Notch. Our study reveals the function of IME4 in a metazoan. In yeast, this enzyme is responsible for a crucial developmental decision, whereas in Drosophila it appears to target the conserved Notch signaling pathway, which regulates many vital aspects of metazoan development.
|
11
|
Zomorrodi AR, Maranas CD. Improving the iMM904 S. cerevisiae metabolic model using essentiality and synthetic lethality data. BMC Syst Biol 2010; 4:178. [PMID: 21190580 PMCID: PMC3023687 DOI: 10.1186/1752-0509-4-178] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Received: 09/16/2010] [Accepted: 12/29/2010] [Indexed: 11/29/2022]
Abstract
BACKGROUND Saccharomyces cerevisiae is the first eukaryotic organism for which a multi-compartment genome-scale metabolic model was constructed. Since then a sequence of improved metabolic reconstructions for yeast has been introduced. These metabolic models have been extensively used to elucidate the organizational principles of yeast metabolism and drive yeast strain engineering strategies for targeted overproductions. They have also served as a starting point and a benchmark for the reconstruction of genome-scale metabolic models for other eukaryotic organisms. In spite of the successive improvements in the details of the described metabolic processes, even the recent yeast model (i.e., iMM904) remains significantly less predictive than the latest E. coli model (i.e., iAF1260). This is manifested by its significantly lower specificity in predicting the outcome of grow/no grow experiments in comparison to the E. coli model. RESULTS In this paper we make use of the automated GrowMatch procedure for restoring consistency with single gene deletion experiments in yeast and extend the procedure to make use of synthetic lethality data using the genome-scale model iMM904 as a basis. We identified and vetted using literature sources 120 distinct model modifications including various regulatory constraints for minimal and YP media. The incorporation of the suggested modifications led to a substantial increase in the fraction of correctly predicted lethal knockouts (i.e., specificity) from 38.84% (87 out of 224) to 53.57% (120 out of 224) for the minimal medium and from 24.73% (45 out of 182) to 40.11% (73 out of 182) for the YP medium. Synthetic lethality predictions improved from 12.03% (16 out of 133) to 23.31% (31 out of 133) for the minimal medium and from 6.96% (8 out of 115) to 13.04% (15 out of 115) for the YP medium. 
CONCLUSIONS Overall, this study provides a roadmap for the computationally driven correction of multi-compartment genome-scale metabolic models and demonstrates the value of synthetic lethals as curation agents.
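The percentages quoted in the RESULTS paragraph can be recomputed from the counts given alongside them. The short Python sketch below does only that arithmetic; the names `REPORTED` and `percent` are illustrative and are not taken from the paper or from GrowMatch.

```python
# Recompute the percentages quoted in the abstract from its own counts.
# Each entry: (correct predictions, total cases, percentage as printed).
REPORTED = {
    "lethal knockouts, minimal medium, before": (87, 224, 38.84),
    "lethal knockouts, minimal medium, after": (120, 224, 53.57),
    "lethal knockouts, YP medium, before": (45, 182, 24.73),
    "lethal knockouts, YP medium, after": (73, 182, 40.11),
    "synthetic lethals, minimal medium, before": (16, 133, 12.03),
    "synthetic lethals, minimal medium, after": (31, 133, 23.31),
    "synthetic lethals, YP medium, before": (8, 115, 6.96),
    "synthetic lethals, YP medium, after": (15, 115, 13.04),
}


def percent(correct: int, total: int) -> float:
    """Return correct/total as a percentage rounded to two decimals."""
    return round(100 * correct / total, 2)


if __name__ == "__main__":
    for label, (correct, total, printed) in REPORTED.items():
        computed = percent(correct, total)
        status = "matches" if abs(computed - printed) < 1e-9 else "differs"
        print(f"{label}: {correct}/{total} = {computed}% ({status})")
```

Every recomputed value agrees with the figure printed in the abstract, so the stated gains follow directly from the counts.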
Affiliation(s)
- Ali R Zomorrodi
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA
- Costas D Maranas
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA
|
12
|
González-Díaz H, Dea-Ayuela MA, Pérez-Montoto LG, Prado-Prado FJ, Agüero-Chapín G, Bolas-Fernández F, Vazquez-Padrón RI, Ubeira FM. QSAR for RNases and theoretic-experimental study of molecular diversity on peptide mass fingerprints of a new Leishmania infantum protein. Mol Divers 2009; 14:349-69. [PMID: 19578942 PMCID: PMC7088557 DOI: 10.1007/s11030-009-9178-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/03/2009] [Accepted: 06/13/2009] [Indexed: 11/29/2022]
Abstract
The toxicity and low success rate of current treatments for leishmaniosis motivate the search for new peptide drugs and/or molecular targets in pathogenic Leishmania species (L. infantum and L. major). For example, ribonucleases (RNases) are enzymes relevant to several biological processes; theoretical and experimental study of the molecular diversity of Peptide Mass Fingerprints (PMFs) of RNases is therefore useful for drug design. This study introduces a methodology that combines QSAR models, 2D-Electrophoresis (2D-E), MALDI-TOF Mass Spectroscopy (MS), BLAST alignment, and Molecular Dynamics (MD) to explore PMFs of RNases. We illustrate this approach by investigating for the first time the PMFs of a new protein of L. infantum. Here we report and compare new versus old predictive models for RNases based on Topological Indices (TIs) of Markov Pseudo-Folding Lattices. This group of indices, called Pseudo-folding Lattice 2D-TIs, includes: spectral moments pi(k)(x,y), mean electrostatic potentials xi(k)(x,y), and entropy measures theta(k)(x,y). The accuracy of the models (training/cross-validation) was as follows: xi(k)(x,y)-model (96.0%/91.7%) > pi(k)(x,y)-model (84.7%/83.3%) > theta(k)(x,y)-model (66.0%/66.7%). We also carried out a 2D-E analysis of biological samples of L. infantum promastigotes, focusing on a 2D-E gel spot of one unknown protein with M < 20,100 and pI < 7. A MASCOT search identified 20 proteins with Mowse scores >30 but none >52 (the threshold value); the highest score, 42, was for a probable DNA-directed RNA polymerase. However, we determined experimentally the sequence of more than 140 peptides. We used QSAR models to predict RNase scores for these peptides and BLAST alignment to confirm some results. We also calculated 3D-folding TIs based on MD experiments and compared 2D versus 3D-TIs on molecular phylogenetic analysis of the molecular diversity of these peptides. This combined strategy may be of interest in drug development or target identification.
Affiliation(s)
- Humberto González-Díaz
  - Department of Microbiology and Parasitology, and Department of Organic Chemistry, Faculty of Pharmacy, USC, 15782, Santiago de Compostela, Spain.
|
13
|
Vilar S, González-Díaz H, Santana L, Uriarte E. QSAR model for alignment-free prediction of human breast cancer biomarkers based on electrostatic potentials of protein pseudofolding HP-lattice networks. J Comput Chem 2008; 29:2613-22. [PMID: 18478581 DOI: 10.1002/jcc.21016] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/10/2022]
Abstract
Network theory allows relationships to be established between numerical parameters that describe the molecular structure of genes and proteins and their biological properties. These models can be considered as quantitative structure-activity relationships (QSAR) for biopolymers. The work described here concerns the first QSAR model for 122 proteins that are associated with human breast cancer (HBC), as identified experimentally by Sjöblom et al. (Science 2006, 314, 268) from over 10,000 human proteins. In this study, the 122 proteins related to HBC (HBCp) and a control group of 200 proteins that are not related to HBC (non-HBCp) were forced to fold in an HP lattice network. From these networks a series of electrostatic potential parameters (xi(k)) was calculated to describe each protein numerically. The use of xi(k) as an entry point to linear discriminant analysis led to a QSAR model to discriminate between HBCp and non-HBCp, and this model could help to predict the involvement of a certain gene and/or protein in HBC. In addition, validation procedures were carried out on the model and these included an external prediction series and evaluation of an additional series of 1000 non-HBCp. In all cases good levels of classification were obtained with values above 80%. This study represents the first example of a QSAR model for the computational chemistry inspired search of potential HBC protein biomarkers.
Affiliation(s)
- Santiago Vilar
  - Unit of Bioinformatics and Connectivity Analysis, Institute of Industrial Pharmacy, and Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela 15782, Spain
|
14
|
Ozyurt AS, Selby TL. Computational active site analysis of molecular pathways to improve functional classification of enzymes. Proteins 2008; 72:184-96. [DOI: 10.1002/prot.21907] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
|
15
|
Flavour formation in fungi: characterisation of KlAtf, the Kluyveromyces lactis orthologue of the Saccharomyces cerevisiae alcohol acetyltransferases Atf1 and Atf2. Appl Microbiol Biotechnol 2008; 78:783-92. [DOI: 10.1007/s00253-008-1366-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 09/05/2007] [Revised: 01/10/2008] [Accepted: 01/12/2008] [Indexed: 10/22/2022]
|
16
|
Yang J, Chen L, Wang L, Zhang W, Liu T, Jin Q. TrED: the Trichophyton rubrum Expression Database. BMC Genomics 2007; 8:250. [PMID: 17650345 PMCID: PMC1940010 DOI: 10.1186/1471-2164-8-250] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 04/27/2007] [Accepted: 07/25/2007] [Indexed: 02/07/2023]
Abstract
Background Trichophyton rubrum is the most common dermatophyte species and the most frequent cause of fungal skin infections in humans worldwide. It is a major concern because foot and nail infections caused by this organism are extremely difficult to cure. A large set of expression data including expressed sequence tags (ESTs) and transcriptional profiles of this important fungal pathogen is now available. Careful analysis of these data can give valuable information about potential virulence factors, antigens and novel metabolic pathways. We intend to create an integrated database, TrED, to facilitate the study of dermatophytes and enhance the development of effective diagnostic and treatment strategies. Description All publicly available ESTs and expression profiles of T. rubrum during conidial germination in time-course experiments and challenged with antifungal agents are deposited in the database. In addition, comparative genomic hybridization results of 22 dermatophytic fungal strains from three genera, Trichophyton, Microsporum and Epidermophyton, are also included. ESTs are clustered and assembled to extend sequence length and reduce redundancy. TrED provides functional analysis based on GenBank, Pfam, and KOG databases, along with KEGG pathway and GO vocabulary. It is integrated with a suite of custom web-based tools that facilitate querying and retrieving various EST properties, visualization and comparison of transcriptional profiles, and sequence-similarity searching by BLAST. Conclusion TrED is built upon a relational database, with a web interface offering analytic functions, to provide integrated access to various expression data of T. rubrum and comparative results of dermatophytes. It is intended to be a comprehensive resource and platform to assist functional genomic studies in dermatophytes. TrED is available from URL: .
Affiliation(s)
- Jian Yang
- State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100176, China
| | - Lihong Chen
- State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100176, China
| | - Lingling Wang
- State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100176, China
- Institute of Pathogen Biology, Chinese Academy of Medical Sciences, Beijing 100730, China
| | - Wenliang Zhang
- State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100176, China
| | - Tao Liu
- State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100176, China
| | - Qi Jin
- State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100176, China
- Institute of Pathogen Biology, Chinese Academy of Medical Sciences, Beijing 100730, China
| |
|
17
|
Chi A, Huttenhower C, Geer LY, Coon JJ, Syka JEP, Bai DL, Shabanowitz J, Burke DJ, Troyanskaya OG, Hunt DF. Analysis of phosphorylation sites on proteins from Saccharomyces cerevisiae by electron transfer dissociation (ETD) mass spectrometry. Proc Natl Acad Sci U S A 2007; 104:2193-8. [PMID: 17287358 PMCID: PMC1892997 DOI: 10.1073/pnas.0607084104] [Citation(s) in RCA: 455] [Impact Index Per Article: 26.8] [Indexed: 11/18/2022] Open
Abstract
We present a strategy for the analysis of the yeast phosphoproteome that uses endo-Lys C as the proteolytic enzyme, immobilized metal affinity chromatography for phosphopeptide enrichment, a 90-min nanoflow-HPLC/electrospray-ionization MS/MS experiment for phosphopeptide fractionation and detection, gas phase ion/ion chemistry, electron transfer dissociation for peptide fragmentation, and the Open Mass Spectrometry Search Algorithm for phosphoprotein identification and assignment of phosphorylation sites. From a 30-µg (approximately 600 pmol) sample of total yeast protein, we identify 1,252 phosphorylation sites on 629 proteins. Identified phosphoproteins have expression levels that range from <50 to 1,200,000 copies per cell and are encoded by genes involved in a wide variety of cellular processes. We identify a consensus site that likely represents a motif for one or more uncharacterized kinases and show that yeast kinases themselves contain a disproportionately large number of phosphorylation sites. Detection of a pHis-containing peptide from the yeast protein Cdc10 suggests an unexpected role for histidine phosphorylation in septin biology. From diverse functional genomics data, we show that phosphoproteins have a higher number of interactions than an average protein and interact with each other more than with a random protein. They are also likely to be conserved across large evolutionary distances.
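The sample figures in this abstract can be checked with a little arithmetic. The sketch below uses only the numbers quoted above; the average protein mass it derives is an inference from those two figures (mass divided by amount), not a value the authors state.

```python
# Figures quoted in the abstract.
sample_mass_g = 30e-6        # 30 µg of total yeast protein
sample_amount_mol = 600e-12  # approximately 600 pmol
sites, proteins = 1252, 629  # phosphorylation sites and phosphoproteins identified

# Assumption: dividing mass by amount gives the average molar mass
# implied by the "approximately 600 pmol" conversion (g/mol equals Da).
mean_mass_da = sample_mass_g / sample_amount_mol

# Mean number of identified sites per identified phosphoprotein.
sites_per_protein = sites / proteins

print(f"implied mean protein mass: {mean_mass_da / 1000:.0f} kDa")
print(f"mean sites per phosphoprotein: {sites_per_protein:.2f}")
```

Under these assumptions the conversion corresponds to an average protein of about 50 kDa, and the identified phosphoproteins carry roughly two detected sites each.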
|
18
|
Castrillo JI, Zeef LA, Hoyle DC, Zhang N, Hayes A, Gardner DCJ, Cornell MJ, Petty J, Hakes L, Wardleworth L, Rash B, Brown M, Dunn WB, Broadhurst D, O'Donoghue K, Hester SS, Dunkley TPJ, Hart SR, Swainston N, Li P, Gaskell SJ, Paton NW, Lilley KS, Kell DB, Oliver SG. Growth control of the eukaryote cell: a systems biology study in yeast. J Biol 2007; 6:4. [PMID: 17439666 PMCID: PMC2373899 DOI: 10.1186/jbiol54] [Citation(s) in RCA: 215] [Impact Index Per Article: 12.6] [Received: 07/21/2006] [Revised: 11/20/2006] [Accepted: 02/07/2007] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Cell growth underlies many key cellular and developmental processes, yet a limited number of studies have been carried out on cell-growth regulation. Comprehensive studies at the transcriptional, proteomic and metabolic levels under defined controlled conditions are currently lacking. RESULTS Metabolic control analysis is being exploited in a systems biology study of the eukaryotic cell. Using chemostat culture, we have measured the impact of changes in flux (growth rate) on the transcriptome, proteome, endometabolome and exometabolome of the yeast Saccharomyces cerevisiae. Each functional genomic level shows clear growth-rate-associated trends and discriminates between carbon-sufficient and carbon-limited conditions. Genes consistently and significantly upregulated with increasing growth rate are frequently essential and encode evolutionarily conserved proteins of known function that participate in many protein-protein interactions. In contrast, more unknown, and fewer essential, genes are downregulated with increasing growth rate; their protein products rarely interact with one another. A large proportion of yeast genes under positive growth-rate control share orthologs with other eukaryotes, including humans. Significantly, transcription of genes encoding components of the TOR complex (a major controller of eukaryotic cell growth) is not subject to growth-rate regulation. Moreover, integrative studies reveal the extent and importance of post-transcriptional control, patterns of control of metabolic fluxes at the level of enzyme synthesis, and the relevance of specific enzymatic reactions in the control of metabolic fluxes during cell growth. CONCLUSION This work constitutes a first comprehensive systems biology study on growth-rate control in the eukaryotic cell. 
The results have direct implications for advanced studies on cell growth, in vivo regulation of metabolic fluxes for comprehensive metabolic engineering, and for the design of genome-scale systems biology models of the eukaryotic cell.
Affiliation(s)
- Juan I Castrillo
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Leo A Zeef
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - David C Hoyle
- Northwest Institute for Bio-Health Informatics (NIBHI), School of Medicine, Stopford Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Nianshu Zhang
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Andrew Hayes
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - David CJ Gardner
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Michael J Cornell
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
- School of Computer Science, Kilburn Building, University of Manchester, Oxford Road, Manchester M13 9PL, UK
| | - June Petty
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Luke Hakes
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Leanne Wardleworth
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Bharat Rash
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
| | - Marie Brown
- School of Chemistry, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Warwick B Dunn
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - David Broadhurst
- School of Chemistry, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Kerry O'Donoghue
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Downing Site, Cambridge CB2 1QW, UK
| | - Svenja S Hester
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Downing Site, Cambridge CB2 1QW, UK
| | - Tom PJ Dunkley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Downing Site, Cambridge CB2 1QW, UK
| | - Sarah R Hart
- School of Chemistry, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Neil Swainston
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Peter Li
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Simon J Gaskell
- School of Chemistry, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Norman W Paton
- School of Computer Science, Kilburn Building, University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Downing Site, Cambridge CB2 1QW, UK
| | - Douglas B Kell
- School of Chemistry, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| | - Stephen G Oliver
- Faculty of Life Sciences, Michael Smith Building, University of Manchester, Oxford Road, Manchester M13 9PT, UK
- Manchester Centre for Integrative Systems Biology, Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess St, Manchester M1 7DN, UK
| |
|
19
|
Arnaud MB, Costanzo MC, Skrzypek MS, Shah P, Binkley G, Lane C, Miyasato SR, Sherlock G. Sequence resources at the Candida Genome Database. Nucleic Acids Res 2006; 35:D452-6. [PMID: 17090582 PMCID: PMC1669745 DOI: 10.1093/nar/gkl899] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Indexed: 01/04/2023] Open
Abstract
The Candida Genome Database (CGD, ) contains a curated collection of genomic information and community resources for researchers who are interested in the molecular biology of the opportunistic pathogen Candida albicans. With the recent release of a new assembly of the C. albicans genome, Assembly 20, C. albicans genomics has entered a new era. Although the C. albicans genome assembly continues to undergo refinement, multiple assemblies and gene nomenclatures will remain in widespread use by the research community. CGD has now taken on the responsibility of maintaining the most up-to-date version of the genome sequence by providing the data from this new assembly alongside the data from the previous assemblies, as well as any future corrections and refinements. In this database update, we describe the sequence information available for C. albicans, the sequence information contained in CGD, and the tools for sequence retrieval, analysis and comparison that CGD provides. CGD is freely accessible at and CGD curators may be contacted by email at candida-curator@genome.stanford.edu.
Affiliation(s)
- Martha B Arnaud
- Department of Genetics, Stanford University Medical School, Stanford, CA 94305-5120, USA.
| |
|
20
|
Tringe SG, Willis J, Liberatore KL, Ruby SW. The WTM genes in budding yeast amplify expression of the stress-inducible gene RNR3. Genetics 2006; 174:1215-28. [PMID: 16980392 PMCID: PMC1667055 DOI: 10.1534/genetics.106.062042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022] Open
Abstract
Cellular responses to DNA damage and inhibited replication are evolutionarily conserved sets of pathways that are critical to preserving genome stability. To identify new participants in these responses, we undertook a screen for regulators that, when present on a high-copy vector, alter expression of a DNA damage-inducible RNR3-lacZ reporter construct in Saccharomyces cerevisiae. From this screen we isolated a plasmid encoding two closely related paralogs, WTM1 and WTM2, that greatly increases constitutive expression of RNR3-lacZ. Moderate overexpression of both genes together, or high-level expression of WTM2 alone from a constitutive promoter, upregulates RNR3-lacZ in the absence of DNA damage. Overexpressed, tagged Wtm2p is associated with the RNR3 promoter, indicating that this effect is likely direct. Further investigation reveals that Wtm2p and Wtm1p, previously described as regulators of meiotic gene expression and transcriptional silencing, amplify transcriptional induction of RNR3 in response to replication stress and modulate expression of genes encoding other RNR subunits.
Affiliation(s)
- Susannah Green Tringe
- Department of Molecular Genetics and Microbiology and Cancer Research and Treatment Center, University of New Mexico Health Sciences Center, Albuquerque, New Mexico 87131, USA
| |
|
21
|
Fundel K, Zimmer R. Gene and protein nomenclature in public databases. BMC Bioinformatics 2006; 7:372. [PMID: 16899134 PMCID: PMC1560172 DOI: 10.1186/1471-2105-7-372] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 01/27/2006] [Accepted: 08/09/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Frequently, several alternative names are in use for biological objects such as genes and proteins. Applications like manual literature search, automated text-mining, named entity identification, gene/protein annotation, and linking of knowledge from different information sources require the knowledge of all used names referring to a given gene or protein. Various organism-specific or general public databases aim at organizing knowledge about genes and proteins. These databases can be used for deriving gene and protein name dictionaries. So far, little is known about the differences between databases in terms of size, ambiguities and overlap. RESULTS We compiled five gene and protein name dictionaries for each of the five model organisms (yeast, fly, mouse, rat, and human) from different organism-specific and general public databases. We analyzed the degree of ambiguity of gene and protein names within and between dictionaries, to a lexicon of common English words and domain-related non-gene terms, and we compared different data sources in terms of size of extracted dictionaries and overlap of synonyms between those. The study shows that the number of genes/proteins and synonyms covered in individual databases varies significantly for a given organism, and that the degree of ambiguity of synonyms varies significantly between different organisms. Furthermore, it shows that, despite considerable efforts of co-curation, the overlap of synonyms in different data sources is rather moderate and that the degree of ambiguity of gene names with common English words and domain-related non-gene terms varies depending on the considered organism. CONCLUSION In conclusion, these results indicate that the combination of data contained in different databases allows the generation of gene and protein name dictionaries that contain significantly more used names than dictionaries obtained from individual data sources. 
Furthermore, curation of combined dictionaries considerably increases size and decreases ambiguity. The entries of the curated synonym dictionary are available for manual querying, editing, and PubMed- or Google-search via the ProThesaurus-wiki. For automated querying via custom software, we offer a web service and an exemplary client application.
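The comparisons described in this abstract (overlap of synonyms between sources, ambiguity with common English words, and the gain from combining sources) can be illustrated with a minimal sketch. Everything below is hypothetical: the toy dictionaries, the tiny word list, and the choice of a shared-pairs-over-union ratio as the overlap measure are assumptions for illustration, not the authors' data or method.

```python
# Hypothetical toy dictionaries: gene ID -> set of names (not real database content).
DICT_A = {"g1": {"abc1", "white", "tf7"}, "g2": {"kin2", "was"}}
DICT_B = {"g1": {"abc1", "zip3"}, "g2": {"kin2", "was", "pk9"}}
COMMON_WORDS = {"white", "was", "for"}  # stand-in for an English lexicon


def pairs(dictionary):
    """Flatten a dictionary into a set of (gene, name) pairs."""
    return {(gene, name) for gene, names in dictionary.items() for name in names}


def synonym_overlap(a, b):
    """Shared (gene, name) pairs divided by all pairs in either source (assumed measure)."""
    pa, pb = pairs(a), pairs(b)
    return len(pa & pb) / len(pa | pb)


def merge(*dictionaries):
    """Combine several sources into one dictionary by taking the union of names per gene."""
    merged = {}
    for dictionary in dictionaries:
        for gene, names in dictionary.items():
            merged.setdefault(gene, set()).update(names)
    return merged


def english_ambiguity(dictionary):
    """Fraction of distinct names that collide with the common-word list."""
    names = {name for ns in dictionary.values() for name in ns}
    return len({n for n in names if n.lower() in COMMON_WORDS}) / len(names)


merged = merge(DICT_A, DICT_B)
print(synonym_overlap(DICT_A, DICT_B))       # 3 shared pairs out of 7
print(sum(len(v) for v in merged.values()))  # 7 names after merging, versus 5 per source
print(english_ambiguity(merged))             # 2 of 7 names are common words
```

In this toy case the two sources share fewer than half of their synonym pairs, and the merged dictionary holds more names than either source alone, which mirrors the qualitative pattern the abstract reports.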
Affiliation(s)
- Katrin Fundel
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany
| | - Ralf Zimmer
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany
| |
|
22
|
Schwarz EM, Sternberg PW. Searching WormBase for information about Caenorhabditis elegans. CURRENT PROTOCOLS IN BIOINFORMATICS 2006; Chapter 1:Unit 1.8. [PMID: 18428757 DOI: 10.1002/0471250953.bi0108s14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/26/2023]
Abstract
WormBase is the major public biological database for the nematode Caenorhabditis elegans. It is meant to be useful to any biologist who wants to use C. elegans, whatever his or her specialty. WormBase contains information about the genomic sequence of C. elegans, its genes and their products, and its higher-level traits such as gene expression patterns and neuronal connectivity. WormBase also contains genomic sequences and gene structures of C. briggsae and C. remanei, two closely related worms. These data are interconnected, so that a search beginning with one object (such as a gene) can be directed to related objects of a different type (e.g., the DNA sequence of the gene or the cells in which the gene is active). One can also perform searches for complex data sets. The WormBase developers group actively invites suggestions for improvements from the database users. WormBase's source code and underlying database are freely available for local installation and modification.
Affiliation(s)
- Erich M Schwarz
- California Institute of Technology, Pasadena, California, USA
| |
|
23
|
Liu H, Hu ZZ, Torii M, Wu C, Friedman C. Quantitative assessment of dictionary-based protein named entity tagging. J Am Med Inform Assoc 2006; 13:497-507. [PMID: 16799122 PMCID: PMC1561801 DOI: 10.1197/jamia.m2085] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE Natural language processing (NLP) approaches have been explored to manage and mine information recorded in biological literature. A critical step for biological literature mining is biological named entity tagging (BNET) that identifies names mentioned in text and normalizes them with entries in biological databases. The aim of this study was to provide quantitative assessment of the complexity of BNET on protein entities through BioThesaurus, a thesaurus of gene/protein names for UniProt knowledgebase (UniProtKB) entries that was acquired using online resources. METHODS We evaluated the complexity through several perspectives: ambiguity (i.e., the number of genes/proteins represented by one name), synonymy (i.e., the number of names associated with the same gene/protein), and coverage (i.e., the percentage of gene/protein names in text included in the thesaurus). We also normalized names in BioThesaurus and measures were obtained twice, once before normalization and once after. RESULTS The current version of BioThesaurus has over 2.6 million names or 2.1 million normalized names covering more than 1.8 million UniProtKB entries. The average synonymy is 3.53 (2.86 after normalization), ambiguity is 2.31 before normalization and 2.32 after, while the coverage is 94.0% based on the BioCreAtive data set comprising MEDLINE abstracts containing genes/proteins. CONCLUSION The study indicated that names for genes/proteins are highly ambiguous and there are usually multiple names for the same gene or protein. It also demonstrated that most gene/protein names appearing in text can be found in BioThesaurus.
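The three measures defined in this abstract (ambiguity, synonymy, coverage) can be sketched on a toy thesaurus. The name/entry pairs, the normalization rule, and the decision to average over distinct names and entries are all assumptions made for illustration; the abstract does not specify how BioThesaurus names were normalized or how the averages were taken.

```python
from collections import defaultdict

# Hypothetical toy thesaurus of (name, entry ID) pairs; not real BioThesaurus data.
PAIRS = [
    ("CDC28", "P1"),
    ("cdc28", "P1"),
    ("CDK1", "P1"),
    ("CDK1", "P2"),
    ("p34", "P2"),
]


def normalize(name):
    """Illustrative normalization: lowercase and drop non-alphanumeric characters."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def metrics(pairs, mentions):
    """Return (synonymy, ambiguity, coverage) for a thesaurus and a list of text mentions."""
    names_of = defaultdict(set)    # entry -> names associated with it
    entries_of = defaultdict(set)  # name -> entries it may refer to
    for name, entry in pairs:
        names_of[entry].add(name)
        entries_of[name].add(entry)
    # Synonymy: average number of names per entry.
    synonymy = sum(len(v) for v in names_of.values()) / len(names_of)
    # Ambiguity: average number of entries per name.
    ambiguity = sum(len(v) for v in entries_of.values()) / len(entries_of)
    # Coverage: fraction of mentions found in the thesaurus.
    coverage = sum(m in entries_of for m in mentions) / len(mentions)
    return synonymy, ambiguity, coverage


raw = metrics(PAIRS, ["CDK1", "p34", "Cln3"])
normalized_pairs = [(normalize(n), e) for n, e in PAIRS]
norm = metrics(normalized_pairs, [normalize(m) for m in ["CDK1", "p34", "Cln3"]])
print(raw)   # synonymy 2.5, ambiguity 1.25, coverage 2/3
print(norm)  # synonymy 2.0, ambiguity 4/3, coverage 2/3
```

In this toy case normalization merges case variants, so synonymy falls while ambiguity rises slightly, the same direction as the before/after figures reported in the abstract.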
Affiliation(s)
- Hongfang Liu
- Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, DC 20007, USA.
| |
|
24
|
Lemmens K, Dhollander T, De Bie T, Monsieurs P, Engelen K, Smets B, Winderickx J, De Moor B, Marchal K. Inferring transcriptional modules from ChIP-chip, motif and microarray data. Genome Biol 2006; 7:R37. [PMID: 16677396 PMCID: PMC1779513 DOI: 10.1186/gb-2006-7-5-r37] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.8] [Received: 09/15/2005] [Revised: 12/21/2005] [Accepted: 04/10/2006] [Indexed: 12/29/2022] Open
Abstract
'ReMoDiscovery' is an intuitive algorithm to correlate regulatory programs with regulators and corresponding motifs to a set of co-expressed genes. It exploits in a concurrent way three independent data sources: ChIP-chip data, motif information and gene expression profiles. When compared to published module discovery algorithms, ReMoDiscovery is fast and easily tunable. We evaluated our method on yeast data, where it was shown to generate biologically meaningful findings and allowed the prediction of potential novel roles of transcriptional regulators.
Affiliation(s)
- Karen Lemmens
- BIOI@SCD, Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| | - Thomas Dhollander
- BIOI@SCD, Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| | - Tijl De Bie
- Research Group on Quantitative Psychology, Department of Psychology, KU Leuven, Tiensestraat, B-3000 Leuven, Belgium
| | - Pieter Monsieurs
- BIOI@SCD, Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| | - Kristof Engelen
- BIOI@SCD, Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| | - Bart Smets
- Molecular Physiology of Plants and Micro-organisms Section, Biology Department, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| | - Joris Winderickx
- Molecular Physiology of Plants and Micro-organisms Section, Biology Department, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| | - Bart De Moor
- BIOI@SCD, Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| | - Kathleen Marchal
- BIOI@SCD, Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
- CMPG, Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg, B-3001 Heverlee, Belgium
| |
|
25
|
Chiang DY, Nix DA, Shultzaberger RK, Gasch AP, Eisen MB. Flexible promoter architecture requirements for coactivator recruitment. BMC Mol Biol 2006; 7:16. [PMID: 16646957 PMCID: PMC1488866 DOI: 10.1186/1471-2199-7-16] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 07/22/2005] [Accepted: 04/28/2006] [Indexed: 11/16/2022] Open
Abstract
Background The spatial organization of transcription factor binding sites in regulatory DNA and the composition of intersite sequences influence the assembly of the multiprotein complexes that regulate RNA polymerase recruitment and thereby affect transcription. We have developed a genetic approach to investigate how reporter gene transcription is affected by varying the spacing between transcription factor binding sites. We characterized the components of promoter architecture that govern the yeast transcription factors Cbf1 and Met31/32, which bind independently but collaboratively recruit the coactivator Met4. Results A Cbf1 binding site was required upstream of a Met31/32 binding site for full reporter gene expression. Distance constraints on coactivator recruitment were more flexible than those for cooperatively binding transcription factors. Distances from 18 to 50 bp between binding sites support efficient recruitment of Met4, with only slight modulation by helical phasing. Intriguingly, we found that certain sequences located between the binding sites abolished gene expression. Conclusion These results yield insight into the influence of both binding site architecture and local DNA flexibility on gene expression, and can be used to refine computational predictions of gene expression from promoter sequences. In addition, our approach can be applied to survey promoter architecture requirements for arbitrary combinations of transcription factor binding sites.
Affiliation(s)
- Derek Y Chiang
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02141, USA
| | - David A Nix
- Department of Genome Sciences, Life Sciences Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, CA 94720, USA
- Affymetrix, Santa Clara, CA 95051, USA
| | - Ryan K Shultzaberger
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
| | - Audrey P Gasch
- Department of Genetics, University of Wisconsin, Madison, WI 53706, USA
| | - Michael B Eisen
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Department of Genome Sciences, Life Sciences Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, CA 94720, USA
| |
|
26
|
Li H, Coghlan A, Ruan J, Coin LJ, Hériché JK, Osmotherly L, Li R, Liu T, Zhang Z, Bolund L, Wong GKS, Zheng W, Dehal P, Wang J, Durbin R. TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res 2006; 34:D572-80. [PMID: 16381935 PMCID: PMC1347480 DOI: 10.1093/nar/gkj118] [Citation(s) in RCA: 386] [Impact Index Per Article: 21.4] [Indexed: 11/30/2022] Open
Abstract
TreeFam is a database of phylogenetic trees of gene families found in animals. It aims to develop a curated resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments. Curated families are being added progressively, based on seed alignments and trees in a similar fashion to Pfam. Release 1.1 of TreeFam contains curated trees for 690 families and automatically generated trees for another 11 646 families. These represent over 128 000 genes from nine fully sequenced animal genomes and over 45 000 other animal proteins from UniProt; ∼40–85% of proteins encoded in the fully sequenced animal genomes are included in TreeFam. TreeFam is freely available at and .
Affiliation(s)
- Heng Li
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100080, China
- Institute of Human Genetics, University of Aarhus, DK-8000 Aarhus C, Denmark
| | - Avril Coghlan
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Jue Ruan
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
| | - Lachlan James Coin
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Jean-Karim Hériché
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Lara Osmotherly
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Ruiqiang Li
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
| | - Tao Liu
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
| | - Zhang Zhang
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Lars Bolund
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
- Institute of Human Genetics, University of Aarhus, DK-8000 Aarhus C, Denmark
| | - Gane Ka-Shu Wong
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
- University of Washington Genome Center, Department of Medicine, University of Washington, Seattle, WA 98195, USA
| | - Weimou Zheng
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100080, China
| | - Paramvir Dehal
- Evolutionary Genomics Department, Department of Energy Joint Genome Institute and Lawrence Berkeley National Laboratory, Walnut Creek, California, USA
| | - Jun Wang
- Beijing Institute of Genomics of the Chinese Academy of Sciences, Beijing Genomics Institute, Beijing 101300, China
- Institute of Human Genetics, University of Aarhus, DK-8000 Aarhus C, Denmark
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
| | - Richard Durbin
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- To whom correspondence should be addressed. Tel: +44 1223 834244; Fax: +44 1223 494919
| |
|
27
|
Wu T, Wang J, Liu C, Zhang Y, Shi B, Zhu X, Zhang Z, Skogerbø G, Chen L, Lu H, Zhao Y, Chen R. NPInter: the noncoding RNAs and protein related biomacromolecules interaction database. Nucleic Acids Res 2006; 34:D150-2. [PMID: 16381834 PMCID: PMC1347388 DOI: 10.1093/nar/gkj025] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Indexed: 11/13/2022] Open
Abstract
The noncoding RNAs and protein related biomacromolecules interaction database (NPInter; http://bioinfo.ibp.ac.cn/NPInter or http://www.bioinfo.org.cn/NPInter) is a database that documents experimentally determined functional interactions between noncoding RNAs (ncRNAs) and protein related biomacromolecules (PRMs) (proteins, mRNAs or genomic DNAs). NPInter intends to provide the scientific community with a comprehensive and integrated tool for efficient browsing and extraction of information on interactions between ncRNAs and PRMs. Beyond cataloguing details of these interactions, the NPInter will be useful for understanding ncRNA function, as it adds a very important functional element, ncRNAs, to the biomolecule interaction network and sets up a bridge between the coding and the noncoding kingdoms.
Affiliation(s)
- Tao Wu
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Jie Wang
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Changning Liu
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Beijing 100080, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Yong Zhang
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Baochen Shi
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Xiaopeng Zhu
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Zhihua Zhang
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Geir Skogerbø
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Lan Chen
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Beijing 100080, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Hongchao Lu
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Beijing 100080, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
| | - Yi Zhao
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Beijing 100080, China
| | - Runsheng Chen
- Bioinformatics Laboratory, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Beijing 100080, China
- To whom correspondence should be addressed. Tel: +86 10 6488 8543; Fax: +86 10 6487 7837;
|
28
|
Wu CH, Apweiler R, Bairoch A, Natale DA, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Mazumder R, O'Donovan C, Redaschi N, Suzek B. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res 2006; 34:D187-91. [PMID: 16381842 PMCID: PMC1347523 DOI: 10.1093/nar/gkj161] [Citation(s) in RCA: 765] [Impact Index Per Article: 42.5] [Indexed: 11/14/2022] Open
Abstract
The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at or downloaded at .
Affiliation(s)
| | - Rolf Apweiler
- The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- To whom correspondence should be addressed. Tel: +44 1223 494435; Fax: +44 1223 494468;
| | - Amos Bairoch
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland
| | | | - Winona C. Barker
- National Biomedical Research Foundation, 3900 Reservoir Road, NW, Washington, DC 20057-1414, USA
| | - Brigitte Boeckmann
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland
| | - Serenella Ferro
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland
| | - Elisabeth Gasteiger
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland
| | | | - Rodrigo Lopez
- The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Michele Magrane
- The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Maria J. Martin
- The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | | | - Claire O'Donovan
- The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nicole Redaschi
- Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland
|
29
|
Stover NA, Krieger CJ, Binkley G, Dong Q, Fisk DG, Nash R, Sethuraman A, Weng S, Cherry JM. Tetrahymena Genome Database (TGD): a new genomic resource for Tetrahymena thermophila research. Nucleic Acids Res 2006; 34:D500-3. [PMID: 16381920 PMCID: PMC1347417 DOI: 10.1093/nar/gkj054] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.7] [Indexed: 01/19/2023] Open
Abstract
We have developed a web-based resource (available at ) for researchers studying the model ciliate organism Tetrahymena thermophila. Employing the underlying database structure and programming of the Saccharomyces Genome Database, the Tetrahymena Genome Database (TGD) integrates the wealth of knowledge generated by the Tetrahymena research community about genome structure, genes and gene products with the newly sequenced macronuclear genome determined by The Institute for Genomic Research (TIGR). TGD provides information curated from the literature about each published gene, including a standardized gene name, a link to the genomic locus in our graphical genome browser, gene product annotations utilizing the Gene Ontology, links to published literature about the gene and more. TGD also displays automatic annotations generated for the gene models predicted by TIGR. A variety of tools are available at TGD for searching the Tetrahymena genome, its literature and information about members of the research community.
Affiliation(s)
- Nicholas A Stover
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA.
|
30
|
Dolinski K, Botstein D. Changing perspectives in yeast research nearly a decade after the genome sequence. Genome Res 2006; 15:1611-9. [PMID: 16339358 DOI: 10.1101/gr.3727505] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022]
Abstract
Research with budding yeast (Saccharomyces cerevisiae) has been transformed by the publication, nearly a decade ago, of the entire genome DNA sequence. The introduction of this first eukaryotic genomic sequence changed the yeast research environment significantly, not just because of dramatic progress in technical means but also because the sequence made accessible a new class of scientific questions. A central goal of yeast research remains the determination of the biological role of every sequence feature in the yeast genome. The most remarkable change has been the shift in perspective from focus on individual genes and functionalities to a more global view of how the cellular networks and systems interact and function together to produce the highly evolved organism we see today.
Affiliation(s)
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544 USA
|
31
|
Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Geer LY, Helmberg W, Kapustin Y, Kenton DL, Khovayko O, Lipman DJ, Madden TL, Maglott DR, Ostell J, Pruitt KD, Schuler GD, Schriml LM, Sequeira E, Sherry ST, Sirotkin K, Souvorov A, Starchenko G, Suzek TO, Tatusov R, Tatusova TA, Wagner L, Yaschenko E. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2006; 34:D173-80. [PMID: 16381840 PMCID: PMC1347520 DOI: 10.1093/nar/gkj158] [Citation(s) in RCA: 396] [Impact Index Per Article: 22.0] [Received: 09/15/2005] [Revised: 10/03/2005] [Accepted: 10/31/2005] [Indexed: 12/31/2022] Open
Abstract
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1, Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
Affiliation(s)
- David L Wheeler
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.
|
32
|
Tanabe L, Thom LH, Matten W, Comeau DC, Wilbur WJ. SemCat: semantically categorized entities for genomics. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2006; 2006:754-8. [PMID: 17238442 PMCID: PMC1839293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023]
Abstract
We describe the construction of a semantic database called SemCat consisting of a large number of semantically categorized names relevant to genomics. SemCat can be used to facilitate natural language processing in MEDLINE. We present suitable application areas including biomedical name classification and named entity recognition.
Affiliation(s)
- Lorraine Tanabe
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA
|
33
|
Ruepp A, Mewes HW. Prediction and classification of protein functions. DRUG DISCOVERY TODAY. TECHNOLOGIES 2006; 3:145-151. [PMID: 24980401 DOI: 10.1016/j.ddtec.2006.06.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 06/03/2023]
Abstract
Data from large-scale genome projects, transcriptomics and proteomics experiments have provided scientists with a wealth of information establishing the basis for the investigation of cellular processes. To understand biological function beyond the single gene by the discovery and characterization of functional protein networks, bioinformatics analysis requires information about two additional attributes associated with the gene products: (i) high-level protein function prediction of experimentally uncharacterized proteins and (ii) systematic classification of protein function. This article describes the basic properties of protein classification systems and discusses examples of their implementation.
Affiliation(s)
- Andreas Ruepp
- Institute for Bioinformatics (MIPS), GSF National Research Center for Environment and Health, Ingolstaedter Landstraße 1, D-85764 Neuherberg, Germany
| | - H Werner Mewes
- Technische Universität München, Chair of Genome Oriented Bioinformatics, Center of Life and Food Science, D-85350 Freising-Weihenstephan, Germany.
|
34
|
Abstract
Genetic interactions provide information about genes and processes with overlapping functions in biological systems. For Saccharomyces cerevisiae, computational integration of multiple types of functional genomic data is used to generate genome-wide predictions of genetic interactions. However, this methodology cannot be applied to the vastly more complex genome of metazoans, and only recently has the first metazoan genome-wide prediction of genetic interactions been reported. The prediction for Caenorhabditis elegans was generated by computationally integrating functional genomic data from S. cerevisiae, C. elegans and Drosophila melanogaster. This achievement is an important step toward system-level understanding of biological systems and human diseases.
Affiliation(s)
- Shuichi Onami
- Computational and Experimental Systems Biology Group, RIKEN Genomic Sciences Center, Tsurumi, Yokohama 230-0045, Japan.
|
35
|
Saerens SMG, Verstrepen KJ, Van Laere SDM, Voet ARD, Van Dijck P, Delvaux FR, Thevelein JM. The Saccharomyces cerevisiae EHT1 and EEB1 genes encode novel enzymes with medium-chain fatty acid ethyl ester synthesis and hydrolysis capacity. J Biol Chem 2005; 281:4446-56. [PMID: 16361250 DOI: 10.1074/jbc.m512028200] [Citation(s) in RCA: 200] [Impact Index Per Article: 10.5] [Indexed: 11/06/2022] Open
Abstract
Fatty acid ethyl esters are secondary metabolites produced by Saccharomyces cerevisiae and many other fungi. Their natural physiological role is not known but in fermentations of alcoholic beverages and other food products they play a key role as flavor compounds. Information about the metabolic pathways and enzymology of fatty acid ethyl ester biosynthesis, however, is very limited. In this work, we have investigated the role of a three-member S. cerevisiae gene family with moderately divergent sequences (YBR177c/EHT1, YPL095c/EEB1, and YMR210w). We demonstrate that two family members encode an acyl-coenzymeA:ethanol O-acyltransferase, an enzyme required for the synthesis of medium-chain fatty acid ethyl esters. Deletion of either one or both of these genes resulted in severely reduced medium-chain fatty acid ethyl ester production. Purified glutathione S-transferase-tagged Eht1 and Eeb1 proteins both exhibited acyl-coenzymeA:ethanol O-acyltransferase activity in vitro, as well as esterase activity. Overexpression of Eht1 and Eeb1 did not enhance medium-chain fatty acid ethyl ester content, which is probably due to the bifunctional synthesis and hydrolysis activity. Molecular modeling of Eht1 and Eeb1 revealed the presence of an alpha/beta-hydrolase fold, which is generally present in the substrate-binding site of esterase enzymes. Hence, our results identify Eht1 and Eeb1 as novel acyl-coenzymeA:ethanol O-acyltransferases/esterases, whereas the third family member, Ymr210w, does not seem to play an important role in medium-chain fatty acid ethyl ester formation.
Affiliation(s)
- Sofie M G Saerens
- Centre for Food and Microbial Technology, Department of Microbial and Molecular Systems, Katholieke Universiteit Leuven, Heverlee, Belgium.
|
36
|
Abstract
Telomeres are multifunctional genetic elements that cap chromosome ends, playing essential roles in genome stability, chromosome higher-order organization and proliferation control. The telomere field has largely benefited from the study of unicellular eukaryotic organisms such as yeasts. Easy cultivation in laboratory conditions and powerful genetics have placed mainly Saccharomyces cerevisiae, Kluyveromyces lactis and Schizosaccharomyces pombe as crucial model organisms for telomere biology research. Studies in these species have made it possible to elucidate the basic mechanisms of telomere maintenance, function and evolution. Moreover, comparative genomic analyses show that telomeres have evolved rapidly among yeast species and functional plasticity emerges as one of the driving forces of this evolution. This provides a precious opportunity to further our understanding of telomere biology.
Affiliation(s)
- M T Teixeira
- Laboratoire de Biologie Moléculaire de la Cellule of Ecole Normale Supérieure de Lyon, UMR CNRS/INRA/ENS, IFR 128 BioSciences Lyon Gerland, 46 Allée d'Italie, 69364 Lyon cedex 07, France.
|
37
|
Schuldiner M, Collins SR, Thompson NJ, Denic V, Bhamidipati A, Punna T, Ihmels J, Andrews B, Boone C, Greenblatt JF, Weissman JS, Krogan NJ. Exploration of the Function and Organization of the Yeast Early Secretory Pathway through an Epistatic Miniarray Profile. Cell 2005; 123:507-19. [PMID: 16269340 DOI: 10.1016/j.cell.2005.08.031] [Citation(s) in RCA: 669] [Impact Index Per Article: 35.2] [Received: 07/12/2005] [Revised: 08/12/2005] [Accepted: 08/22/2005] [Indexed: 10/25/2022]
Abstract
We present a strategy for generating and analyzing comprehensive genetic-interaction maps, termed E-MAPs (epistatic miniarray profiles), comprising quantitative measures of aggravating or alleviating interactions between gene pairs. Crucial to the interpretation of E-MAPs is their high-density nature made possible by focusing on logically connected gene subsets and including essential genes. Described here is the analysis of an E-MAP of genes acting in the yeast early secretory pathway. Hierarchical clustering, together with novel analytical strategies and experimental verification, revealed or clarified the role of many proteins involved in extensively studied processes such as sphingolipid metabolism and retention of HDEL proteins. At a broader level, analysis of the E-MAP delineated pathway organization and components of physical complexes and illustrated the interconnection between the various secretory processes. Extension of this strategy to other logically connected gene subsets in yeast and higher eukaryotes should provide critical insights into the functional/organizational principles of biological systems.
Affiliation(s)
- Maya Schuldiner
- Howard Hughes Medical Institute, Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California 94143, USA
|
38
|
Lee W, St.Onge RP, Proctor M, Flaherty P, Jordan MI, Arkin AP, Davis RW, Nislow C, Giaever G. Genome-wide requirements for resistance to functionally distinct DNA-damaging agents. PLoS Genet 2005; 1:e24. [PMID: 16121259 PMCID: PMC1189734 DOI: 10.1371/journal.pgen.0010024] [Citation(s) in RCA: 131] [Impact Index Per Article: 6.9] [Received: 04/25/2005] [Accepted: 07/01/2005] [Indexed: 11/18/2022] Open
Abstract
The mechanistic and therapeutic differences in the cellular response to DNA-damaging compounds are not completely understood, despite intense study. To expand our knowledge of DNA damage, we assayed the effects of 12 closely related DNA-damaging agents on the complete pool of approximately 4,700 barcoded homozygous deletion strains of Saccharomyces cerevisiae. In our protocol, deletion strains are pooled together and grown competitively in the presence of compound. Relative strain sensitivity is determined by hybridization of PCR-amplified barcodes to an oligonucleotide array carrying the barcode complements. These screens identified genes in well-characterized DNA-damage-response pathways as well as genes whose role in the DNA-damage response had not been previously established. High-throughput individual growth analysis was used to independently confirm microarray results. Each compound produced a unique genome-wide profile. Analysis of these data allowed us to determine the relative importance of DNA-repair modules for resistance to each of the 12 profiled compounds. Clustering the data for 12 distinct compounds uncovered both known and novel functional interactions that comprise the DNA-damage response and allowed us to define the genetic determinants required for repair of interstrand cross-links. Further genetic analysis allowed determination of epistasis for one of these functional groups.
Affiliation(s)
- William Lee
- Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
| | | | - Michael Proctor
- Department of Biochemistry, Stanford University School of Medicine, Stanford Genome Technology Center, Palo Alto, California, United States of America
| | - Patrick Flaherty
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, California, United States of America
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Michael I Jordan
- Division of Computer Science, Department of Statistics, University of California, Berkeley, California, United States of America
| | - Adam P Arkin
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Howard Hughes Medical Institute, Department of Bioengineering, University of California, Berkeley, California, United States of America
| | - Ronald W Davis
- Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
- Department of Biochemistry, Stanford University School of Medicine, Stanford Genome Technology Center, Palo Alto, California, United States of America
| | - Corey Nislow
- Department of Biochemistry, Stanford University School of Medicine, Stanford Genome Technology Center, Palo Alto, California, United States of America
| | - Guri Giaever
- Department of Biochemistry, Stanford University School of Medicine, Stanford Genome Technology Center, Palo Alto, California, United States of America
- *To whom correspondence should be addressed. E-mail:
|
39
|
Saito TL, Sese J, Nakatani Y, Sano F, Yukawa M, Ohya Y, Morishita S. Data mining tools for the Saccharomyces cerevisiae morphological database. Nucleic Acids Res 2005; 33:W753-7. [PMID: 15980577 PMCID: PMC1160212 DOI: 10.1093/nar/gki451] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022] Open
Abstract
For comprehensive understanding of precise morphological changes resulting from loss-of-function mutagenesis, a large collection of 1 899 247 cell images was assembled from 91 271 micrographs of 4782 budding yeast disruptants of non-lethal genes. All the cell images were processed computationally to measure ∼500 morphological parameters in individual mutants. We have recently made this morphological quantitative data available to the public through the Saccharomyces cerevisiae Morphological Database (SCMD). Inspecting the significance of morphological discrepancies between the wild type and the mutants is expected to provide clues to uncover genes that are relevant to the biological processes producing a particular morphology. To facilitate such intensive data mining, a suite of new software tools for visualizing parameter value distributions was developed to present mutants with significant changes in easily understandable forms. In addition, for a given group of mutants associated with a particular function, the system automatically identifies a combination of multiple morphological parameters that discriminates a mutant group from others significantly, thereby characterizing the function effectively. These data mining functions are available through the World Wide Web at .
Affiliation(s)
- Taro L. Saito
- Department of Computer Science, Graduate School of Information Science and Technology, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Institute for Bioinformatics and Research and Development, Japan Science and Technology Corporation, Science Plaza, 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-8666, Japan
| | - Jun Sese
- Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Building FSB-101, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
| | - Yoichiro Nakatani
- Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Building FSB-101, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Institute for Bioinformatics and Research and Development, Japan Science and Technology Corporation, Science Plaza, 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-8666, Japan
| | - Fumi Sano
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, Building FSB-101, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Institute for Bioinformatics and Research and Development, Japan Science and Technology Corporation, Science Plaza, 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-8666, Japan
| | - Masashi Yukawa
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, Building FSB-101, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Institute for Bioinformatics and Research and Development, Japan Science and Technology Corporation, Science Plaza, 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-8666, Japan
| | - Yoshikazu Ohya
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, Building FSB-101, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Institute for Bioinformatics and Research and Development, Japan Science and Technology Corporation, Science Plaza, 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-8666, Japan
| | - Shinichi Morishita
- Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Building FSB-101, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Institute for Bioinformatics and Research and Development, Japan Science and Technology Corporation, Science Plaza, 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-8666, Japan
- To whom correspondence should be addressed. Tel: +81 4 7136 3985; Fax: +81 4 7136 3977;
|
40
|
Nieduszynski CA, Blow JJ, Donaldson AD. The requirement of yeast replication origins for pre-replication complex proteins is modulated by transcription. Nucleic Acids Res 2005; 33:2410-20. [PMID: 15860777 PMCID: PMC1087785 DOI: 10.1093/nar/gki539] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022] Open
Abstract
The mini-chromosome maintenance proteins Mcm2–7 are essential for DNA replication. They are loaded onto replication origins during G1 phase of the cell cycle to form a pre-replication complex (pre-RC) that licenses each origin for subsequent initiation. We have investigated the DNA elements that determine the dependence of yeast replication origins on Mcm2–7 activity, i.e. the sensitivity of an origin to mcm mutations. Using chimaeric constructs from mcm sensitive and mcm insensitive origins, we have identified two main elements affecting the requirement for Mcm2–7 function. First, transcription into an origin increases its dependence on Mcm2–7 function, revealing a conflict between pre-RC assembly and transcription. Second, sequence elements within the minimal origin influence its mcm sensitivity. Replication origins show similar differences in sensitivity to mutations in other pre-RC proteins (such as Origin Recognition Complex and Cdc6), but not to mutations in initiation and elongation factors, demonstrating that the mcm sensitivity of an origin is determined by its ability to establish a pre-RC. We propose that there is a hierarchy of replication origins with respect to the range of pre-RC protein concentrations under which they will function. This hierarchy is both ‘hard-wired’ by the minimal origin sequences and ‘soft-wired’ by local transcriptional context.
Affiliation(s)
| | - J. Julian Blow
- Cancer Research UK Chromosome Replication Research Group, Wellcome Trust Biocentre, University of Dundee, Dow Street, Dundee DD1 5EH, Scotland, UK
| | - Anne D. Donaldson
- To whom correspondence should be addressed. Tel: +44 0 1224 550975; Fax: +44 0 1224 555844;
|