1
Sun X, Xiao CL, Ge R, Yin X, Li H, Li N, Yang X, Zhu Y, He X, He QY. Putative copper- and zinc-binding motifs in Streptococcus pneumoniae identified by immobilized metal affinity chromatography and mass spectrometry. Proteomics 2011; 11:3288-98. [PMID: 21751346 DOI: 10.1002/pmic.201000396] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Received: 07/01/2010] [Revised: 05/04/2011] [Accepted: 05/11/2011] [Indexed: 11/09/2022]
Abstract
The aim of metalloproteomics is to identify and characterize putative metal-binding proteins and metal-binding motifs. In this study, we performed a systematical metalloproteomic analysis on Streptococcus pneumoniae through the combined use of efficient immobilized metal affinity chromatography enrichment and high-accuracy linear ion trap-Orbitrap MS to identify metal-binding proteins and metal-binding peptides. In total, 232 and 166 putative metal-binding proteins were respectively isolated by Cu- and Zn-immobilized metal affinity chromatography columns, in which 133 proteins were present in both preparations. The putative metalloproteins are mainly involved in protein, nucleotide and carbon metabolisms, oxidation and cell cycle regulation. Based on the sequence of the putative Cu- and Zn-binding peptides, putative Cu-binding motifs were identified: H(X)mH (m=0-11), C(X)2C, C(X)nH (n=2-4, 6, 9), H(X)iM (i=0-10) and M(X)tM (t=8 or 12), while putative Zn-binding motifs were identified as follows: H(X)mH (m=1-12), H(X)iM (i=0-12), M(X)tM (t=0, 3 and 4), C(X)nH (n=1, 2, 7, 10 and 11). Equilibrium dialysis and inductively coupled plasma-MS experiments confirmed that the artificially synthesized peptides harboring differential identified metal-binding motifs interacted directly with the metal ions. The metalloproteomic study presented here suggests that the comparably large size and diverse functions of the S. pneumoniae metalloproteome may play important roles in various biological processes and thus contribute to the bacterial pathologies.
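As an illustration only, the putative Cu-binding motif list in the abstract above can be read as spacing patterns and written as regular expressions. The sketch below is not the authors' pipeline: it assumes that X stands for any single residue, that sequences use one-letter amino acid codes, and that a motif counts as present if it occurs anywhere in the peptide. The names `CU_MOTIFS` and `find_cu_motifs` are hypothetical.

```python
import re

# Assumption: X means "any one residue" in one-letter amino acid code.
X = "[A-Z]"

# Spacer lengths copied from the Cu-binding motif list in the abstract.
CU_MOTIFS = {
    "H(X)mH, m=0-11": f"H{X}{{0,11}}H",
    "C(X)2C": f"C{X}{{2}}C",
    "C(X)nH, n=2-4,6,9": f"C(?:{X}{{2,4}}|{X}{{6}}|{X}{{9}})H",
    "H(X)iM, i=0-10": f"H{X}{{0,10}}M",
    "M(X)tM, t=8 or 12": f"M(?:{X}{{8}}|{X}{{12}})M",
}


def find_cu_motifs(peptide: str) -> list[str]:
    """Return the names of the motif patterns that occur in the peptide."""
    seq = peptide.upper()
    return [name for name, pattern in CU_MOTIFS.items() if re.search(pattern, seq)]


if __name__ == "__main__":
    # Made-up peptide strings, used only to exercise the patterns.
    for peptide in ("AHGGH", "ACAACA", "MAAAM"):
        print(peptide, find_cu_motifs(peptide))
```

The Zn-binding list from the same abstract could be encoded the same way by changing the spacer ranges. A pattern match of this kind only flags candidate sites; the abstract reports that binding was confirmed experimentally by equilibrium dialysis and ICP-MS.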
Affiliation(s)
- Xuesong Sun
- Institute of Life and Health Engineering/National Engineering and Research Center of Genetic Medicine, Jinan University, Guangzhou, P R China.
2
Stark C, Breitkreutz BJ, Chatr-Aryamontri A, Boucher L, Oughtred R, Livstone MS, Nixon J, Van Auken K, Wang X, Shi X, Reguly T, Rust JM, Winter A, Dolinski K, Tyers M. The BioGRID Interaction Database: 2011 update. Nucleic Acids Res 2010; 39:D698-704. [PMID: 21071413 PMCID: PMC3013707 DOI: 10.1093/nar/gkq1116] [Citation(s) in RCA: 627] [Impact Index Per Article: 41.8] [Indexed: 12/26/2022]
Abstract
The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions.
Affiliation(s)
- Chris Stark
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto M5G 1X5, Canada
3
Sun X, Ge F, Xiao CL, Yin XF, Ge R, Zhang LH, He QY. Phosphoproteomic analysis reveals the multiple roles of phosphorylation in pathogenic bacterium Streptococcus pneumoniae. J Proteome Res 2010; 9:275-82. [PMID: 19894762 DOI: 10.1021/pr900612v] [Citation(s) in RCA: 139] [Impact Index Per Article: 9.3] [Indexed: 12/28/2022]
Abstract
Recent phosphoproteomic characterizations of Bacillus subtilis, Escherichia coli, Lactococcus lactis, Pseudomonas putida, and Pseudomonas aeruginosa have suggested that protein phosphorylation on serine, threonine, and tyrosine residues is a major regulatory post-translational modification in bacteria. In this study, we carried out a global and site-specific phosphoproteomic analysis on the Gram-positive pathogenic bacterium Streptococcus pneumoniae. One hundred and two unique phosphopeptides and 163 phosphorylation sites with distributions of 47%/44%/9% for Ser/Thr/Tyr phosphorylations from 84 S. pneumoniae proteins were identified through the combined use of TiO2 enrichment and LC-MS/MS determination. The identified phosphoproteins were found to be involved in various biological processes including carbon/protein/nucleotide metabolisms, cell cycle and division regulation. A striking characteristic of S. pneumoniae phosphoproteome is the large number of multiple species-specific phosphorylated sites, indicating that high level of protein phosphorylation may play important roles in regulating many metabolic pathways and bacterial virulence.
Affiliation(s)
- Xuesong Sun
- Institute of Life and Health Engineering and National Engineering Research Center for Genetic Medicine, Jinan University, Guangzhou 510632, China
4
Leach SM, Tipney H, Feng W, Baumgartner WA, Kasliwal P, Schuyler RP, Williams T, Spritz RA, Hunter L. Biomedical discovery acceleration, with applications to craniofacial development. PLoS Comput Biol 2009; 5:e1000215. [PMID: 19325874 PMCID: PMC2653649 DOI: 10.1371/journal.pcbi.1000215] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Received: 09/25/2008] [Accepted: 02/12/2009] [Indexed: 01/17/2023]
Abstract
The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.
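One of the reasoning steps named in the abstract above, noting two genes that are annotated to the same ontology term or database entry, can be sketched roughly as follows. This is an illustrative sketch, not the Hanalyzer's implementation: the function name `shared_annotation_edges`, the input layout (a mapping from gene to a set of terms), and the gene and term labels are all hypothetical, and the combined reliability score mentioned in the abstract is deliberately not modeled because the abstract does not define it.

```python
from collections import defaultdict
from itertools import combinations


def shared_annotation_edges(annotations):
    """Map each unordered gene pair to the set of terms both genes carry.

    `annotations` is assumed to be a dict of gene name -> set of term IDs.
    """
    # Invert the mapping: term -> genes annotated to that term.
    genes_by_term = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            genes_by_term[term].add(gene)

    # Every pair of genes under the same term gets an edge labeled with it.
    edges = defaultdict(set)
    for term, genes in genes_by_term.items():
        for gene_a, gene_b in combinations(sorted(genes), 2):
            edges[(gene_a, gene_b)].add(term)
    return dict(edges)


if __name__ == "__main__":
    # Hypothetical genes and term IDs, for illustration only.
    example = {
        "geneA": {"T1", "T2"},
        "geneB": {"T1"},
        "geneC": {"T2", "T3"},
    }
    for pair, terms in shared_annotation_edges(example).items():
        print(pair, sorted(terms))
```

Keeping the shared terms on each edge, rather than a bare yes/no link, preserves a simple form of the provenance that the abstract says the reading methods track.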
Affiliation(s)
- Sonia M. Leach
- Center for Computational Pharmacology, University of Colorado at Denver, Denver, Colorado, United States of America
- Hannah Tipney
- Center for Computational Pharmacology, University of Colorado at Denver, Denver, Colorado, United States of America
- Weiguo Feng
- Department of Craniofacial Biology, University of Colorado at Denver, Denver, Colorado, United States of America
- William A. Baumgartner
- Center for Computational Pharmacology, University of Colorado at Denver, Denver, Colorado, United States of America
- Priyanka Kasliwal
- Center for Computational Pharmacology, University of Colorado at Denver, Denver, Colorado, United States of America
- Ronald P. Schuyler
- Center for Computational Pharmacology, University of Colorado at Denver, Denver, Colorado, United States of America
- Trevor Williams
- Department of Craniofacial Biology, University of Colorado at Denver, Denver, Colorado, United States of America
- Richard A. Spritz
- Human Medical Genetics Program, University of Colorado at Denver, Denver, Colorado, United States of America
- Lawrence Hunter
- Center for Computational Pharmacology, University of Colorado at Denver, Denver, Colorado, United States of America
- * E-mail:
5
Breitkreutz BJ, Stark C, Reguly T, Boucher L, Breitkreutz A, Livstone M, Oughtred R, Lackner DH, Bähler J, Wood V, Dolinski K, Tyers M. The BioGRID Interaction Database: 2008 update. Nucleic Acids Res 2007; 36:D637-40. [PMID: 18000002 PMCID: PMC2238873 DOI: 10.1093/nar/gkm1001] [Citation(s) in RCA: 463] [Impact Index Per Article: 25.7] [Indexed: 11/14/2022]
Abstract
The Biological General Repository for Interaction Datasets (BioGRID) database (http://www.thebiogrid.org) was developed to house and distribute collections of protein and genetic interactions from major model organism species. BioGRID currently contains over 198 000 interactions from six different species, as derived from both high-throughput studies and conventional focused studies. Through comprehensive curation efforts, BioGRID now includes a virtually complete set of interactions reported to date in the primary literature for both the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe. A number of new features have been added to the BioGRID including an improved user interface to display interactions based on different attributes, a mirror site and a dedicated interaction management system to coordinate curation across different locations. The BioGRID provides interaction data with monthly updates to Saccharomyces Genome Database, Flybase and Entrez Gene. Source code for the BioGRID and the linked Osprey network visualization system is now freely available without restriction.
6
Liu D, Brockman JM, Dass B, Hutchins LN, Singh P, McCarrey JR, MacDonald CC, Graber JH. Systematic variation in mRNA 3'-processing signals during mouse spermatogenesis. Nucleic Acids Res 2006; 35:234-46. [PMID: 17158511 PMCID: PMC1802579 DOI: 10.1093/nar/gkl919] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.4] [Indexed: 01/16/2023]
Abstract
Gene expression and processing during mouse male germ cell maturation (spermatogenesis) is highly specialized. Previous reports have suggested that there is a high incidence of alternative 3′-processing in male germ cell mRNAs, including reduced usage of the canonical polyadenylation signal, AAUAAA. We used EST libraries generated from mouse testicular cells to identify 3′-processing sites used at various stages of spermatogenesis (spermatogonia, spermatocytes and round spermatids) and testicular somatic Sertoli cells. We assessed differences in 3′-processing characteristics in the testicular samples, compared to control sets of widely used 3′-processing sites. Using a new method for comparison of degenerate regulatory elements between sequence samples, we identified significant changes in the use of putative 3′-processing regulatory sequence elements in all spermatogenic cell types. In addition, we observed a trend towards truncated 3′-untranslated regions (3′-UTRs), with the most significant differences apparent in round spermatids. In contrast, Sertoli cells displayed a much smaller trend towards 3′-UTR truncation and no significant difference in 3′-processing regulatory sequences. Finally, we identified a number of genes encoding mRNAs that were specifically subject to alternative 3′-processing during meiosis and postmeiotic development. Our results highlight developmental differences in polyadenylation site choice and in the elements that likely control them during spermatogenesis.
Affiliation(s)
- Donglin Liu
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- J. Michael Brockman
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Bioinformatics Program, Boston University, 24 Cummington Street, Boston, MA 02215, USA
- Brinda Dass
- Department of Cell Biology and Biochemistry, Texas Tech University Health Sciences Center, Lubbock, TX 79430, USA
- Priyam Singh
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Bioinformatics Program, Boston University, 24 Cummington Street, Boston, MA 02215, USA
- John R. McCarrey
- Department of Biology, University of Texas at San Antonio, San Antonio, TX 78249, USA
- Clinton C. MacDonald
- Department of Cell Biology and Biochemistry, Texas Tech University Health Sciences Center, Lubbock, TX 79430, USA
- Joel H. Graber
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Bioinformatics Program, Boston University, 24 Cummington Street, Boston, MA 02215, USA
- To whom correspondence should be addressed. Tel: +1 207 288 6847; Fax: +1 207 288 6073;
7
Chiang JH, Shin JW, Liu HH, Chin CL. GeneLibrarian: an effective gene-information summarization and visualization system. BMC Bioinformatics 2006; 7:392. [PMID: 16939640 PMCID: PMC1564044 DOI: 10.1186/1471-2105-7-392] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 04/08/2006] [Accepted: 08/29/2006] [Indexed: 11/12/2022]
Abstract
Background Abundant information about gene products is stored in online searchable databases such as annotation or literature. To efficiently obtain and digest such information, there is a pressing need for automated information-summarization and functional-similarity clustering of genes. Results We have developed a novel method for semantic measurement of annotation and integrated it with a biomedical literature summarization system to establish a platform, GeneLibrarian, to provide users well-organized information about any specific group of genes (e.g. one cluster of genes from a microarray chip) they might be interested in. The GeneLibrarian generates a summarized viewgraph of candidate genes for a user based on his/her preference and delivers the desired background information effectively to the user. The summarization technique involves optimizing the text mining algorithm and Gene Ontology-based clustering method to enable the discovery of gene relations. Conclusion GeneLibrarian is a Java-based web application that automates the process of retrieving critical information from the literature and expanding the number of potential genes for further analysis. This study concentrates on providing well organized information to users and we believe that will be useful in their researches. GeneLibrarian is available on
Affiliation(s)
- Jung-Hsien Chiang
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Jyh-Wei Shin
- Department of Parasitology, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Heng-Hui Liu
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Chong-Liang Chin
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
8
Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hon GC, Myers CL, Parsons A, Friesen H, Oughtred R, Tong A, Stark C, Ho Y, Botstein D, Andrews B, Boone C, Troyanskya OG, Ideker T, Dolinski K, Batada NN, Tyers M. Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae. J Biol 2006; 5:11. [PMID: 16762047 PMCID: PMC1561585 DOI: 10.1186/jbiol36] [Citation(s) in RCA: 222] [Impact Index Per Article: 11.7] [Received: 10/18/2005] [Revised: 03/17/2006] [Accepted: 03/30/2006] [Indexed: 01/17/2023]
Abstract
BACKGROUND The study of complex biological networks and prediction of gene function has been enabled by high-throughput (HTP) methods for detection of genetic and protein interactions. Sparse coverage in HTP datasets may, however, distort network properties and confound predictions. Although a vast number of well substantiated interactions are recorded in the scientific literature, these data have not yet been distilled into networks that enable system-level inference. RESULTS We describe here a comprehensive database of genetic and protein interactions, and associated experimental evidence, for the budding yeast Saccharomyces cerevisiae, as manually curated from over 31,793 abstracts and online publications. This literature-curated (LC) dataset contains 33,311 interactions, on the order of all extant HTP datasets combined. Surprisingly, HTP protein-interaction datasets currently achieve only around 14% coverage of the interactions in the literature. The LC network nevertheless shares attributes with HTP networks, including scale-free connectivity and correlations between interactions, abundance, localization, and expression. We find that essential genes or proteins are enriched for interactions with other essential genes or proteins, suggesting that the global network may be functionally unified. This interconnectivity is supported by a substantial overlap of protein and genetic interactions in the LC dataset. We show that the LC dataset considerably improves the predictive power of network-analysis approaches. The full LC dataset is available at the BioGRID (http://www.thebiogrid.org) and SGD (http://www.yeastgenome.org/) databases. CONCLUSION Comprehensive datasets of biological interactions derived from the primary literature provide critical benchmarks for HTP methods, augment functional prediction, and reveal system-level attributes of biological networks.
Affiliation(s)
- Teresa Reguly
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto ON M5G 1X5, Canada
- Ashton Breitkreutz
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto ON M5G 1X5, Canada
- Lorrie Boucher
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto ON M5G 1X5, Canada
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto ON M5S 1A8, Canada
- Bobby-Joe Breitkreutz
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto ON M5G 1X5, Canada
- Gary C Hon
- Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0412, USA
- Chad L Myers
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Department of Computer Science, Princeton University, NJ 08544, USA
- Ainslie Parsons
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto ON M5S 1A8, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto ON M5G 1L6, Canada
- Helena Friesen
- Banting and Best Department of Medical Research, University of Toronto, Toronto ON M5G 1L6, Canada
- Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Amy Tong
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto ON M5S 1A8, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto ON M5G 1L6, Canada
- Chris Stark
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto ON M5G 1X5, Canada
- Yuen Ho
- Banting and Best Department of Medical Research, University of Toronto, Toronto ON M5G 1L6, Canada
- David Botstein
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Brenda Andrews
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto ON M5S 1A8, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto ON M5G 1L6, Canada
- Charles Boone
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto ON M5S 1A8, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto ON M5G 1L6, Canada
- Olga G Troyanskya
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Department of Computer Science, Princeton University, NJ 08544, USA
- Trey Ideker
- Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0412, USA
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Nizar N Batada
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto ON M5G 1X5, Canada
- Mike Tyers
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto ON M5G 1X5, Canada
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto ON M5S 1A8, Canada