101
Qian WJ, Camp DG, Smith RD. High-throughput proteomics using Fourier transform ion cyclotron resonance mass spectrometry. Expert Rev Proteomics 2006; 1:87-95. [PMID: 15966802 DOI: 10.1586/14789450.1.1.87] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Indexed: 11/08/2022]
Abstract
The advent of high-throughput proteomic technologies for global detection and quantitation of proteins creates new opportunities and challenges for those seeking to gain greater understanding of the cellular machinery. Here, recent advances in high-resolution capillary liquid chromatography coupled to Fourier transform ion cyclotron resonance mass spectrometry are reviewed along with its potential application to high-throughput proteomics. These technological advances combined with quantitative stable isotope labeling methodologies provide powerful tools for expanding our understanding of biology at the system level.
Affiliation(s)
- Wei-Jun Qian
- Biological Science Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA.
102
Kislinger T, Emili A. Multidimensional protein identification technology: current status and future prospects. Expert Rev Proteomics 2006; 2:27-39. [PMID: 15966850 DOI: 10.1586/14789450.2.1.27] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 11/08/2022]
Abstract
Protein profiling using high-throughput tandem mass spectrometry has become a powerful method for analyzing changes in global protein expression patterns in cells and tissues as a function of developmental, physiologic and disease processes. This review summarizes the utility and practical application of multidimensional protein identification technology as a platform for comprehensive proteomic profiling of complex biologic samples. The strengths and potential problems and limitations associated with this powerful technology are discussed, with an emphasis placed on one of the biggest challenges currently facing large-scale expression profiling projects -- namely, data analysis. Complementary bioinformatic computational data mining strategies, such as clustering, functional annotation and statistical inference, are also discussed as these are increasingly necessary for interpreting the results of global proteomic profiling studies.
Affiliation(s)
- Thomas Kislinger
- Banting & Best Department of Medical Research, University of Toronto, Toronto, ON, Canada.
103
Roberts TM, Kobor MS, Bastin-Shanower SA, Ii M, Horte SA, Gin JW, Emili A, Rine J, Brill SJ, Brown GW. Slx4 regulates DNA damage checkpoint-dependent phosphorylation of the BRCT domain protein Rtt107/Esc4. Mol Biol Cell 2005; 17:539-48. [PMID: 16267268 PMCID: PMC1345688 DOI: 10.1091/mbc.e05-08-0785] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.6] [Indexed: 11/11/2022] Open
Abstract
RTT107 (ESC4, YHR154W) encodes a BRCA1 C-terminal-domain protein that is important for recovery from DNA damage during S phase. Rtt107 is a substrate of the checkpoint protein kinase Mec1, although the mechanism by which Rtt107 is targeted by Mec1 after checkpoint activation is currently unclear. Slx4, a component of the Slx1-Slx4 structure-specific nuclease, formed a complex with Rtt107. Deletion of SLX4 conferred many of the same DNA-repair defects observed in rtt107delta, including DNA damage sensitivity, prolonged DNA damage checkpoint activation, and increased spontaneous DNA damage. These phenotypes were not shared by the Slx4 binding partner Slx1, suggesting that the functions of the Slx4 and Slx1 proteins in the DNA damage response were not identical. Of particular interest, Slx4, but not Slx1, was required for phosphorylation of Rtt107 by Mec1 in vivo, indicating that Slx4 was a mediator of DNA damage-dependent phosphorylation of the checkpoint effector Rtt107. We propose that Slx4 has roles in the DNA damage response that are distinct from the function of Slx1-Slx4 in maintaining rDNA structure and that Slx4-dependent phosphorylation of Rtt107 by Mec1 is critical for replication restart after alkylation damage.
Affiliation(s)
- Tania M Roberts
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8
104
Cagney G, Park S, Chung C, Tong B, O'Dushlaine C, Shields DC, Emili A. Human Tissue Profiling with Multidimensional Protein Identification Technology. J Proteome Res 2005; 4:1757-67. [PMID: 16212430 DOI: 10.1021/pr0500354] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.2] [Indexed: 11/28/2022]
Abstract
Profiling of tissues and cell types through systematic characterization of expressed genes or proteins shows promise as a basic research tool, and has potential applications in disease diagnosis and classification. We used multidimensional protein identification technology (MudPIT) to analyze proteomes for enriched nuclear extracts of eight human tissues: brain, heart, liver, lung, muscle, pancreas, spleen, and testis. We show that the method is approximately 80% reproducible. We address issues of relative abundance, tissue-specificity, and selectivity, and the significance of proteins whose expression does not correlate with that of the corresponding mRNA. Surprisingly, most proteins are detected in a single tissue. These proteins tend to fulfill specialist (and potentially tissue-specific) functions compared to proteins expressed in two or more tissues.
Affiliation(s)
- Gerard Cagney
- Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland.
105
Abstract
The shotgun proteomic strategy based on digesting proteins into peptides and sequencing them using tandem mass spectrometry and automated database searching has become the method of choice for identifying proteins in most large scale studies. However, the peptide-centric nature of shotgun proteomics complicates the analysis and biological interpretation of the data especially in the case of higher eukaryote organisms. The same peptide sequence can be present in multiple different proteins or protein isoforms. Such shared peptides therefore can lead to ambiguities in determining the identities of sample proteins. In this article we illustrate the difficulties of interpreting shotgun proteomic data and discuss the need for common nomenclature and transparent informatic approaches. We also discuss related issues such as the state of protein sequence databases and their role in shotgun proteomic analysis, interpretation of relative peptide quantification data in the presence of multiple protein isoforms, the integration of proteomic and transcriptional data, and the development of a computational infrastructure for the integration of multiple diverse datasets.
106
Kanaeva IP, Petushkova NA, Lisitsa AV, Lokhov PG, Zgoda VG, Karuzina II, Archakov AI. Proteomic and biochemical analysis of the mouse liver microsomes. Toxicol In Vitro 2005; 19:805-12. [PMID: 15908171 DOI: 10.1016/j.tiv.2005.03.016] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 04/27/2004] [Revised: 02/11/2005] [Accepted: 03/18/2005] [Indexed: 10/25/2022]
Abstract
The efficiency of the proteomic approach for the revelation of proteins, including components of the liver microsomal monooxygenase system (cytochromes b5 and P450), was demonstrated. The liver microsomes and their ghosts (i.e. membranes devoid of "ballast" proteins) were prepared from the control and phenobarbital-treated mice. Microsomes and their ghosts were characterized using the conventional biochemical assay and analysed by one- and two-dimensional electrophoresis (1-DE and 2-DE, respectively) coupled with MALDI-TOF peptide mass fingerprinting procedure. Catalytic activity of cytochromes P450 was measured using specific fluorogenic substrates for CYP1A, CYP2A, CYP2B and CYP2C families. The protein composition of control and phenobarbital-induced ghosts was analysed. The proteomic 2D-based protein separation method enabled us to reveal up to 1005 proteins, the majority of them being soluble. Among the 34 identified proteins, the cytochrome b5-like protein was revealed; however, cytochromes P450 appeared to be undetectable under 2-DE separation conditions. The separation of microsomal ghost proteins by 1-DE, followed by mass-spectrometric analysis of bands from the 45 to 66 kDa gel range, made it possible to identify hydrophobic proteins including cytochromes P450 (CYP2A4 and CYP2A5) and dimethylaniline monooxygenase. The high O-deethylation rate of 7-ethoxycoumarin (a substrate for rodent CYPs 2A and 2B, in particular for CYP2A5) was observed, in agreement with the results of mass-spectrometric identification. Collectively, the data obtained indicate that a combination of enzyme activity assays and various protein separation techniques coupled with mass-spectrometric protein identification allows a more comprehensive insight into the machinery of the cellular detoxifying system.
Affiliation(s)
- I P Kanaeva
- V.N. Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, 119121, Pogodinskaya St., 10, Moscow, Russia
107
Kislinger T, Gramolini AO, MacLennan DH, Emili A. Multidimensional protein identification technology (MudPIT): technical overview of a profiling method optimized for the comprehensive proteomic investigation of normal and diseased heart tissue. J Am Soc Mass Spectrom 2005; 16:1207-20. [PMID: 15979338 DOI: 10.1016/j.jasms.2005.02.015] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.7] [Received: 12/22/2004] [Revised: 12/22/2004] [Accepted: 02/23/2005] [Indexed: 05/03/2023]
Abstract
An optimized analytical expression profiling strategy based on gel-free multidimensional protein identification technology (MudPIT) is reported for the systematic investigation of biochemical (mal)-adaptations associated with healthy and diseased heart tissue. Enhanced shotgun proteomic detection coverage and improved biological inference is achieved by pre-fractionation of excised mouse cardiac muscle into subcellular components, with each organellar fraction investigated exhaustively using multiple repeat MudPIT analyses. Functional-enrichment, high-confidence identification, and relative quantification of hundreds of organelle- and tissue-specific proteins are achieved readily, including detection of low abundance transcriptional regulators, signaling factors, and proteins linked to cardiac disease. Important technical issues relating to data validation, including minimization of artifacts stemming from biased under-sampling and spurious false discovery, together with suggestions for further fine-tuning of sample preparation, are discussed. A framework for follow-up bioinformatic examination, pattern recognition, and data mining is also presented in the context of a stringent application of MudPIT for probing fundamental aspects of heart muscle physiology as well as the discovery of perturbations associated with heart failure.
Affiliation(s)
- Thomas Kislinger
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada
108
Rudnick PA, Wang Y, Evans E, Lee CS, Balgley BM. Large Scale Analysis of MASCOT Results Using a Mass Accuracy-Based THreshold (MATH) Effectively Improves Data Interpretation. J Proteome Res 2005; 4:1353-60. [PMID: 16083287 DOI: 10.1021/pr0500509] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
In this report, we take a heuristic approach to studying the effects of mass tolerance settings and database size on the sensitivity and specificity of MASCOT. We also examine the efficacy of the MASCOT Identity Threshold as a discriminator when applied to QqTOF data with an average mass accuracy of 10 ppm or better. As predicted, arbitrarily large mass tolerance settings negatively affect MASCOT's specificity, and to a lesser degree, sensitivity. Increased mass tolerances also render the generation of a significance threshold less effective. To study these effects, we used Bayes' Law to calculate MASCOT's predictive values. With a relatively small search database (Human IPI), MASCOT had a mean positive predictive value of 0.993 when combined with MASCOT's Identity Threshold. However, the corresponding average negative predictive value, or the probability that an ion was not present given no score or a score below threshold, was reduced as mass tolerances were tightened, and had an average value of 0.717. This value was improved upon by extrapolating an empirical threshold using a reversed database search and a new algorithm to rapidly identify false positive identifications. Using the empirical threshold reduced false negative identifications on the average 17% while limiting the false positive rate to below 5%; even larger reductions were obtained using mass tolerances approaching two times the actual error of the experimental data. A simple application of this strategy to the analysis of a microdissected glioblastoma multiforme sample analyzed by IEF/LC-MS/MS is reported, as is a description of the tools required to implement a large scale analysis using this alternative approach.
Affiliation(s)
- Paul A Rudnick
- Calibrant Biosystems, 7507 Standish Pl., Rockville, MD 20855, USA.
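The positive and negative predictive values mentioned in the Rudnick et al. abstract follow from Bayes' law applied to sensitivity, specificity, and the prior fraction of correct matches. The sketch below only illustrates that relationship; the function name, parameter names, and input numbers are assumptions made for illustration and are not taken from the paper.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' law.

    sensitivity: P(score above threshold | peptide present)
    specificity: P(score below threshold | peptide absent)
    prevalence:  P(peptide present), the prior (an assumed input)
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence

    # PPV = P(present | above threshold); NPV = P(absent | below threshold).
    # Both are undefined if the conditioning event has zero probability.
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs chosen only to show the calculation.
ppv, npv = predictive_values(sensitivity=0.8, specificity=0.9, prevalence=0.5)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With a fixed prior, a loss of sensitivity lowers the NPV while leaving the PPV largely unchanged, which is the kind of trade-off the abstract describes when tolerances are tightened.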
109
Ihling C, Sinz A. Proteome analysis of Escherichia coli using high-performance liquid chromatography and Fourier transform ion cyclotron resonance mass spectrometry. Proteomics 2005; 5:2029-42. [PMID: 15852340 DOI: 10.1002/pmic.200401122] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 01/23/2023]
Abstract
The basic problem of complexity poses a significant challenge for proteomic studies. To date two-dimensional gel electrophoresis (2-DE) followed by enzymatic in-gel digestion of the peptides, and subsequent identification by mass spectrometry (MS) is the most commonly used method to analyze complex protein mixtures. However, 2-DE is a slow and labor-intensive technique, which is not able to resolve all proteins of a proteome. To overcome these limitations gel-free approaches are developed based on high performance liquid chromatography (HPLC) and Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS). The high resolution and excellent mass accuracy of FT-ICR MS provides a basis for simultaneous analysis of numerous compounds. In the present study, a small protein subfraction of an Escherichia coli cell lysate was prepared by size-exclusion chromatography and proteins were analyzed using C4 reversed phase (RP)-HPLC for pre-separation followed by C18 RP nanoHPLC/nanoESI FT-ICR MS for analysis of the peptide mixtures after tryptic digestion of the protein fractions. We identified 231 proteins and thus demonstrated that a combination of two RP separation steps - one on the protein and one on the peptide level - in combination with high-resolution FT-ICR MS has the potential to become a powerful method for global proteomics studies.
Affiliation(s)
- Christian Ihling
- Biotechnological-Biomedical Center, Faculty of Chemistry and Mineralogy, University of Leipzig, Germany
110
Kislinger T, Gramolini AO, Pan Y, Rahman K, MacLennan DH, Emili A. Proteome Dynamics during C2C12 Myoblast Differentiation. Mol Cell Proteomics 2005; 4:887-901. [PMID: 15824125 DOI: 10.1074/mcp.m400182-mcp200] [Citation(s) in RCA: 107] [Impact Index Per Article: 5.4] [Indexed: 11/06/2022] Open
Abstract
Mouse-derived C2C12 myoblasts serve as an experimentally tractable model system for investigating the molecular basis of skeletal muscle cell specification and development. To examine the biochemical adaptations associated with myocyte formation comprehensively, we used large scale gel-free tandem mass spectrometry to monitor global proteome alterations throughout a time course analysis of the myogenic C2C12 differentiation program. The relative abundance of approximately 1,800 high confidence proteins was tracked across multiple time points using capillary scale multidimensional liquid chromatography coupled to high throughput shotgun sequencing. Hierarchical clustering of the resulting profiles revealed differential waves of expression of proteins linked to intracellular signaling, transcription, cytoarchitecture, adhesion, metabolism, and muscle contraction across the early, mid, and late stages of differentiation. Several hundred previously uncharacterized proteins were likewise detected in a stage-specific manner, suggesting novel roles in myogenesis and/or muscle function. These proteomic data are complementary to recent microarray-based studies of gene expression patterns in developing myotubes and provide a holistic framework for understanding how diverse biochemical processes are coordinated at the cellular level during skeletal muscle development.
Affiliation(s)
- Thomas Kislinger
- Program in Proteomics and Bioinformatics, University of Toronto, Toronto, Ontario M5S 3E2, Canada
111
Bhogal N, Grindon C, Combes R, Balls M. Toxicity testing: creating a revolution based on new technologies. Trends Biotechnol 2005; 23:299-307. [PMID: 15922082 DOI: 10.1016/j.tibtech.2005.04.006] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.8] [Received: 10/26/2004] [Revised: 10/26/2004] [Accepted: 04/05/2005] [Indexed: 11/16/2022]
Abstract
Biotechnology is evolving at a tremendous rate. Although drug discovery is now heavily focused on high throughput and miniaturized screening, the application of these advances to the toxicological assessment of chemicals and chemical products has been slow. Nevertheless, the impending surge in demands for the regulatory toxicity testing of chemicals provides the impetus for the incorporation of novel methodologies into hazard identification and risk assessment. Here, we review the current and likely future value of these new technologies in relation to toxicological evaluation and the protection of human health.
Affiliation(s)
- Nirmala Bhogal
- FRAME (Fund for the Replacement of Animals in Medical Experiments), Russell and Burch House, 96-98 North Sherwood Street, Nottingham NG1 4EE, UK
112
Camon EB, Barrell DG, Dimmer EC, Lee V, Magrane M, Maslen J, Binns D, Apweiler R. An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics 2005; 6 Suppl 1:S17. [PMID: 15960829 PMCID: PMC1869009 DOI: 10.1186/1471-2105-6-s1-s17] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.8] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The Gene Ontology Annotation (GOA) database http://www.ebi.ac.uk/GOA aims to provide high-quality supplementary GO annotation to proteins in the UniProt Knowledgebase. Like many other biological databases, GOA gathers much of its content from the careful manual curation of literature. However, as both the volume of literature and of proteins requiring characterization increases, the manual processing capability can become overloaded. Consequently, semi-automated aids are often employed to expedite the curation process. Traditionally, electronic techniques in GOA depend largely on exploiting the knowledge in existing resources such as InterPro. However, in recent years, text mining has been hailed as a potentially useful tool to aid the curation process. To encourage the development of such tools, the GOA team at EBI agreed to take part in the functional annotation task of the BioCreAtIvE (Critical Assessment of Information Extraction systems in Biology) challenge. BioCreAtIvE task 2 was an experiment to test if automatically derived classification using information retrieval and extraction could assist expert biologists in the annotation of the GO vocabulary to the proteins in the UniProt Knowledgebase. GOA provided the training corpus of over 9000 manual GO annotations extracted from the literature. For the test set, we provided a corpus of 200 new Journal of Biological Chemistry articles used to annotate 286 human proteins with GO terms. A team of experts manually evaluated the results of 9 participating groups, each of which provided highlighted sentences to support their GO and protein annotation predictions. Here, we give a biological perspective on the evaluation, explain how we annotate GO using literature and offer some suggestions to improve the precision of future text-retrieval and extraction techniques. 
Finally, we provide the results of the first inter-annotator agreement study for manual GO curation, as well as an assessment of our current electronic GO annotation strategies. RESULTS The GOA database currently extracts GO annotation from the literature with 91 to 100% precision, and at least 72% recall. This creates a particularly high threshold for text mining systems which in BioCreAtIvE task 2 (GO annotation extraction and retrieval) initial results precisely predicted GO terms only 10 to 20% of the time. CONCLUSION Improvements in the performance and accuracy of text mining for GO terms should be expected in the next BioCreAtIvE challenge. In the meantime the manual and electronic GO annotation strategies already employed by GOA will provide high quality annotations.
Affiliation(s)
- Evelyn B Camon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
113
Sadygov RG, Cociorva D, Yates JR. Large-scale database searching using tandem mass spectra: looking up the answer in the back of the book. Nat Methods 2005; 1:195-202. [PMID: 15789030 DOI: 10.1038/nmeth725] [Citation(s) in RCA: 274] [Impact Index Per Article: 13.7] [Indexed: 12/30/2022]
Abstract
Database searching is an essential element of large-scale proteomics. Because these methods are widely used, it is important to understand the rationale of the algorithms. Most algorithms are based on concepts first developed in SEQUEST and PeptideSearch. Four basic approaches are used to determine a match between a spectrum and sequence: descriptive, interpretative, stochastic and probability-based matching. We review the basic concepts used by most search algorithms, the computational modeling of peptide identification and current challenges and limitations of this approach for protein identification.
Affiliation(s)
- Rovshan G Sadygov
- Department of Cell Biology, The Scripps Research Institute, La Jolla, California 92037, USA
114
Sandhu C, Connor M, Kislinger T, Slingerland J, Emili A. Global Protein Shotgun Expression Profiling of Proliferating MCF-7 Breast Cancer Cells. J Proteome Res 2005; 4:674-89. [PMID: 15952714 DOI: 10.1021/pr0498842] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Indexed: 02/08/2023]
Abstract
Protein expression becomes altered in breast epithelium during malignant transformation. Knowledge of these perturbations should provide insight into the molecular basis of breast cancer, as well as reveal possible new therapeutic targets. To this end, we have performed an extensive comparative proteomic survey of global protein expression patterns in proliferating MCF-7 breast cancer cells and normal human mammary epithelial cells using gel-free shotgun tandem mass spectrometry. Pathophysiological alterations associated with the malignant breast cancer phenotype were detected, including differences in the apparent levels of key regulators of the cell cycle, signal transduction, apoptosis, transcriptional regulation, and cell metabolism.
Affiliation(s)
- Charanjit Sandhu
- Program in Proteomics and Bioinformatics, Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada
115
Nagano K, Taoka M, Yamauchi Y, Itagaki C, Shinkawa T, Nunomura K, Okamura N, Takahashi N, Izumi T, Isobe T. Large-scale identification of proteins expressed in mouse embryonic stem cells. Proteomics 2005; 5:1346-61. [PMID: 15742316 DOI: 10.1002/pmic.200400990] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.8] [Indexed: 11/05/2022]
Abstract
A protein subset expressed in the mouse embryonic stem (ES) cell line, E14-1, was characterized by mass spectrometry-based protein identification technology and data analysis. In total, 1790 proteins including 365 potential nuclear and 260 membrane proteins were identified from tryptic digests of total cell lysates. The subset contained a variety of proteins in terms of physicochemical characteristics, subcellular localization, and biological function as defined by Gene Ontology annotation groups. In addition to many housekeeping proteins found in common with other cell types, the subset contained a group of regulatory proteins that may determine unique ES cell functions. We identified 39 transcription factors including Oct-3/4, Sox-2, and undifferentiated embryonic cell transcription factor I, which are characteristic of ES cells, 88 plasma membrane proteins including cell surface markers such as CD9 and CD81, 44 potential proteinaceous ligands for cell surface receptors including growth factors, cytokines, and hormones, and 100 cell signaling molecules. The subset also contained the products of 60 ES-specific and 41 stemness genes defined previously by the DNA microarray analysis of Ramalho-Santos et al. (Ramalho-Santos et al., Science 2002, 298, 597-600), as well as a number of components characteristic of differentiated cell types such as hematopoietic and neural cells. We also identified potential post-translational modifications in a number of ES cell proteins including five Lys acetylation sites and a single phosphorylation site. To our knowledge, this study provides the largest proteomic dataset characterized to date for a single mammalian cell species, and serves as a basic catalogue of a major proteomic subset that is expressed in mouse ES cells.
Affiliation(s)
- Kohji Nagano
- Division of Proteomics Research, Institute of Medical Science, University of Tokyo, Tokyo, Japan
116
Maziarz M, Chung C, Drucker DJ, Emili A. Integrating global proteomic and genomic expression profiles generated from islet alpha cells: opportunities and challenges to deriving reliable biological inferences. Mol Cell Proteomics 2005; 4:458-74. [PMID: 15741311 DOI: 10.1074/mcp.r500011-mcp200] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Indexed: 11/06/2022] Open
Abstract
Systematic profiling of expressed gene products represents a promising research strategy for elucidating the molecular phenotypes of islet cells. To this end, we have combined complementary genomic and proteomic methods to better assess the molecular composition of murine pancreatic islet glucagon-producing alphaTC-1 cells as a model system, with the expectation of bypassing limitations inherent to either technology alone. Gene expression was measured with an Affymetrix MG_U74Av2 oligonucleotide array, while protein expression was examined by performing high-resolution gel-free shotgun MS/MS on a nuclear-enriched cell extract. Both analyses were carried out in triplicate to control for experimental variability. Using a stringent detection p value cutoff of 0.04, 48% of all potential mRNA transcripts were predicted to be expressed (probes classified as present in at least two of three replicates), while 1,651 proteins were identified with high-confidence using rigorous database searching. Although 762 of 888 cross-referenced cognate mRNA-protein pairs were jointly detected by both platforms, a sizeable number (126) of gene products was detected exclusively by MS alone. Conversely, marginal protein identifications often had convincing microarray support. Based on these findings, we present an operational framework for both interpreting and integrating dual genomic and proteomic datasets so as to obtain a more reliable perspective into islet alpha cell function.
Affiliation(s)
- Marlena Maziarz
- Banting and Best Diabetes Centre, University of Toronto, Toronto, Ontario, Canada
117
Weatherly DB, Atwood JA, Minning TA, Cavola C, Tarleton RL, Orlando R. A Heuristic method for assigning a false-discovery rate for protein identifications from Mascot database search results. Mol Cell Proteomics 2005; 4:762-72. [PMID: 15703444 DOI: 10.1074/mcp.m400215-mcp200] [Citation(s) in RCA: 165] [Impact Index Per Article: 8.3] [Indexed: 11/06/2022] Open
Abstract
MS/MS and database searching has emerged as a valuable technology for rapidly analyzing protein expression, localization, and post-translational modifications. The probability-based search engine Mascot has found widespread use as a tool to correlate tandem mass spectra with peptides in a sequence database. Although the Mascot scoring algorithm provides a probability-based model for peptide identification, the independent peptide scores do not correlate with the significance of the proteins to which they match. Herein, we describe a heuristic method for organizing proteins identified at a specified false-discovery rate using Mascot-matched peptides. We call this method PROVALT, and it uses peptide matches from a random database to calculate false-discovery rates for protein identifications and reduces a complex list of peptide matches to a nonredundant list of homologous protein groups. This method was evaluated using Mascot-identified peptides from a Trypanosoma cruzi epimastigote whole-cell lysate, which was separated by multidimensional LC and analyzed by MS/MS. PROVALT was then compared with the two traditional methods of protein identification when using Mascot, the single peptide score and cumulative protein score methods, and was shown to be superior to both in regards to the number of proteins identified and the inclusion of lower scoring nonrandom peptide matches.
Affiliation(s)
- D Brent Weatherly
- Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, Georgia 30602, USA
|
118
|
Qian WJ, Jacobs JM, Camp DG, Monroe ME, Moore RJ, Gritsenko MA, Calvano SE, Lowry SF, Xiao W, Moldawer LL, Davis RW, Tompkins RG, Smith RD. Comparative proteome analyses of human plasma following in vivo lipopolysaccharide administration using multidimensional separations coupled with tandem mass spectrometry. Proteomics 2005; 5:572-84. [PMID: 15627965 PMCID: PMC1781926 DOI: 10.1002/pmic.200400942] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.2] [Indexed: 11/05/2022]
Abstract
There is significant interest in characterization of the human plasma proteome due to its potential for providing biomarkers applicable to clinical diagnosis and treatment and for gaining a better understanding of human diseases. We describe here a strategy for comparative proteome analyses of human plasma, which is applicable to biomarker identifications for various disease states. Multidimensional liquid chromatography-mass spectrometry (LC-MS/MS) has been applied to make comparative proteome analyses of plasma samples from an individual prior to and 9 h after lipopolysaccharide (LPS) administration. Peptide peak areas and the number of peptide identifications for each protein were used to evaluate the reproducibility of LC-MS/MS and to compare relative changes in protein concentration between the samples following LPS treatment. A total of 804 distinct plasma proteins (not including immunoglobulins) were confidently identified with 32 proteins observed to be significantly increased in concentration following LPS administration, including several known inflammatory response or acute-phase mediators such as C-reactive protein, serum amyloid A and A2, LPS-binding protein, LPS-responsive and beige-like anchor protein, hepatocyte growth factor activator, and von Willebrand factor, and thus, constituting potential biomarkers for inflammatory response.
Affiliation(s)
- Wei-Jun Qian
- Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, P.O. Box 999, MSIN: K8-98, Richland WA 99352, USA
|
119
|
Qian WJ, Liu T, Monroe ME, Strittmatter EF, Jacobs JM, Kangas LJ, Petritis K, Camp DG, Smith RD. Probability-Based Evaluation of Peptide and Protein Identifications from Tandem Mass Spectrometry and SEQUEST Analysis: The Human Proteome. J Proteome Res 2005; 4:53-62. [PMID: 15707357 DOI: 10.1021/pr0498638] [Citation(s) in RCA: 261] [Impact Index Per Article: 13.1] [Indexed: 11/30/2022]
Abstract
Large-scale protein identifications from highly complex protein mixtures have recently been achieved using multidimensional liquid chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) and subsequent database searching with algorithms such as SEQUEST. Here, we describe a probability-based evaluation of false positive rates associated with peptide identifications from three different human proteome samples. Peptides from human plasma, human mammary epithelial cell (HMEC) lysate, and human hepatocyte (Huh)-7.5 cell lysate were separated by strong cation exchange (SCX) chromatography coupled offline with reversed-phase capillary LC-MS/MS analyses. The MS/MS spectra were first analyzed by SEQUEST, searching independently against both normal and sequence-reversed human protein databases, and the false positive rates of peptide identifications for the three proteome samples were then analyzed and compared. The observed false positive rates of peptide identifications for human plasma were significantly higher than those for the human cell lines when identical filtering criteria were used, suggesting that the false positive rates are significantly dependent on sample characteristics, particularly the number of proteins found within the detectable dynamic range. Two new sets of filtering criteria are proposed for human plasma and human cell lines, respectively, to provide an overall confidence of >95% for peptide identifications. The new criteria were compared, using a normalized elution time (NET) criterion (Petritis et al. Anal. Chem. 2003, 75, 1039-1048), with previously published criteria (Washburn et al. Nat. Biotechnol. 2001, 19, 242-247). The results demonstrate that the present criteria provide significantly higher levels of confidence for peptide identifications from mammalian proteomes without greatly decreasing the number of identifications.
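The reversed-database comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes each peptide match has already been reduced to a single score (higher is better) and that the false positive rate at a cutoff is estimated as reversed-database hits divided by normal-database hits; the function names and this exact estimator are illustrative assumptions, not the authors' published filtering criteria.

```python
def false_positive_rate(forward_scores, reversed_scores, threshold):
    """Estimate the false positive rate at a score cutoff.

    Assumption: matches to the sequence-reversed database are all random,
    so their count approximates the number of random matches among the
    normal-database hits that pass the same cutoff.
    """
    n_forward = sum(score >= threshold for score in forward_scores)
    n_reversed = sum(score >= threshold for score in reversed_scores)
    if n_forward == 0:
        return 0.0  # nothing passes the cutoff, so nothing can be false
    return min(1.0, n_reversed / n_forward)


def lowest_threshold_for_confidence(forward_scores, reversed_scores,
                                    confidence=0.95):
    """Return the lowest observed cutoff whose estimated confidence
    (1 - false positive rate) reaches the requested level, or None."""
    for threshold in sorted(set(forward_scores)):
        fpr = false_positive_rate(forward_scores, reversed_scores, threshold)
        if fpr <= 1.0 - confidence:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    forward = [1.0, 2.0, 3.0, 4.0, 5.0]
    reverse = [1.0, 2.0]
    print(false_positive_rate(forward, reverse, 2.0))
    print(lowest_threshold_for_confidence(forward, reverse))
```

Running the same calculation separately per sample type would show the effect the abstract reports: identical cutoffs can yield different estimated false positive rates for plasma and for cell lysates.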
Affiliation(s)
- Wei-Jun Qian
- Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
|
120
|
Schmidt A, Kellermann J, Lottspeich F. A novel strategy for quantitative proteomics using isotope-coded protein labels. Proteomics 2005; 5:4-15. [PMID: 15602776 DOI: 10.1002/pmic.200400873] [Citation(s) in RCA: 366] [Impact Index Per Article: 18.3] [Indexed: 11/08/2022]
Abstract
Stable isotope labelling in combination with mass spectrometry has emerged as a powerful tool to identify and relatively quantify thousands of proteins within complex protein mixtures. Here we describe a novel method, termed isotope-coded protein label (ICPL), which is capable of high-throughput quantitative proteome profiling on a global scale. Since ICPL is based on stable isotope tagging at the frequent free amino groups of isolated intact proteins, it is applicable to any protein sample, including extracts from tissues or body fluids, and compatible to all separation methods currently employed in proteome studies. The method showed highly accurate and reproducible quantification of proteins and yielded high sequence coverage, indispensable for the detection of post-translational modifications and protein isoforms. The efficiency (e.g. accuracy, dynamic range, sensitivity, speed) of the approach is demonstrated by comparative analysis of two differentially spiked proteomes.
|
121
|
Russell SA, Old W, Resing KA, Hunter L. Proteomic informatics. Int Rev Neurobiol 2004; 61:127-57. [PMID: 15482814 DOI: 10.1016/s0074-7742(04)61006-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Affiliation(s)
- Steven A Russell
- Center for Computational Pharmacology, University of Colorado Health Sciences Center, Aurora, CO 80045, USA
|
122
|
Hannich JT, Lewis A, Kroetz MB, Li SJ, Heide H, Emili A, Hochstrasser M. Defining the SUMO-modified proteome by multiple approaches in Saccharomyces cerevisiae. J Biol Chem 2004; 280:4102-10. [PMID: 15590687 DOI: 10.1074/jbc.m413209200] [Citation(s) in RCA: 319] [Impact Index Per Article: 15.2] [Indexed: 11/06/2022]
Abstract
SUMO, or Smt3 in Saccharomyces cerevisiae, is a ubiquitin-like protein that is post-translationally attached to multiple proteins in vivo. Many of these substrate modifications are cell cycle-regulated, and SUMO conjugation is essential for viability in most eukaryotes. However, only a limited number of SUMO-modified proteins have been definitively identified to date, and this has hampered study of the mechanisms by which SUMO ligation regulates specific cellular pathways. Here we use a combination of yeast two-hybrid screening, a high copy suppressor selection with a SUMO isopeptidase mutant, and tandem mass spectrometry to define a large set of proteins (>150) that can be modified by SUMO in budding yeast. These three approaches yielded overlapping sets of proteins with the most extensive set by far being those identified by mass spectrometry. The two-hybrid data also yielded a potential SUMO-binding motif. Functional categories of SUMO-modified proteins include SUMO conjugation system enzymes, chromatin- and gene silencing-related factors, DNA repair and genome stability proteins, stress-related proteins, transcription factors, proteins involved in translation and RNA metabolism, and a variety of metabolic enzymes. The results point to a surprisingly broad array of cellular processes regulated by SUMO conjugation and provide a starting point for detailed studies of how SUMO ligation contributes to these different regulatory mechanisms.
Affiliation(s)
- J Thomas Hannich
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA
|
123
|
Zhang JW, Butland G, Greenblatt JF, Emili A, Zamble DB. A role for SlyD in the Escherichia coli hydrogenase biosynthetic pathway. J Biol Chem 2004; 280:4360-6. [PMID: 15569666 DOI: 10.1074/jbc.m411799200] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.0] [Indexed: 11/06/2022]
Abstract
The [NiFe] centers at the active sites of the Escherichia coli hydrogenase enzymes are assembled by a team of accessory proteins that includes the products of the hyp genes. To determine whether any other proteins are involved in this process, the sequential peptide affinity system was used. The analysis of the proteins in a complex with HypB revealed the peptidyl-prolyl cis/trans-isomerase SlyD, a metal-binding protein that has not been previously linked to the hydrogenase biosynthetic pathway. The association between HypB and SlyD was confirmed by chemical cross-linking of purified proteins. Deletion of the slyD gene resulted in a marked reduction of the hydrogenase activity in cell extracts prepared from anaerobic cultures, and an in-gel assay was used to demonstrate diminished activities of both hydrogenase 1 and 2. Western analysis revealed a decrease in the final proteolytic processing of the hydrogenase 3 HycE protein, indicating that the metal center was not assembled properly. These deficiencies were all rescued by growth in medium containing excess nickel, but zinc did not have any phenotypic effect. Experiments with radioactive nickel demonstrated that less nickel accumulated in ΔslyD cells compared with wild type, and overexpression of SlyD from an inducible promoter doubled the level of cellular nickel. These experiments demonstrate that SlyD has a role in the nickel insertion step of the hydrogenase maturation pathway, and the possible functions of SlyD are discussed.
Affiliation(s)
- Jie Wei Zhang
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
|
124
|
Gramolini AO, Kislinger T, Asahi M, Li W, Emili A, MacLennan DH. Sarcolipin retention in the endoplasmic reticulum depends on its C-terminal RSYQY sequence and its interaction with sarco(endo)plasmic Ca(2+)-ATPases. Proc Natl Acad Sci U S A 2004; 101:16807-12. [PMID: 15556994 PMCID: PMC534750 DOI: 10.1073/pnas.0407815101] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.9] [Indexed: 11/18/2022]
Abstract
Sarcolipin (SLN) and phospholamban (PLN) are effective inhibitors of the sarco(endo)plasmic reticulum Ca(2+)-ATPase (SERCA). These homologous proteins differ at their N and C termini: the C-terminal Met-Leu-Leu in PLN is replaced by Arg-Ser-Tyr-Gln-Tyr in SLN. The role of the C-terminal sequence of SLN tagged N-terminally with the FLAG epitope (NF-SLN) in endoplasmic reticulum (ER) retention was investigated by transfecting human embryonic kidney-293 cells with cDNAs encoding NF-SLN or a series of NF-SLN mutants in which C-terminal amino acids were deleted progressively. Immunofluorescence and immunoblotting of transfected cells by using anti-FLAG antibodies indicated that NF-SLN and PLN tagged at its N terminus with the FLAG epitope, even when overexpressed, were restricted to the ER. However, C-terminal truncation deletions of SLN, which lacked RSYQY, were not localized to ER and did not inhibit Ca(2+)-dependent Ca2+ uptake by SERCA. The shortest deletion constructs, NF-SLN 1-22 and NF-SLN 1-23, did not express stable protein products. However, all NF-SLN cDNA constructs, including NF-SLN 1-22 and NF-SLN 1-23, were expressed stably and localized to the ER when they were coexpressed with SERCA2a. These results show that NF-SLN subcellular distribution depends on SERCA coexpression and on its luminal, C-terminal RSYQY sequence. By using immunoprecipitation and MS, glucose-regulated protein 78/BiP and glucose-regulated protein 94 were identified as proteins that interact with NF-SLN through the RSYQY sequence. Thus, in the absence of SERCA, retention of NF-SLN in the ER is mediated through its association with other components through the C-terminal RSYQY sequence.
Affiliation(s)
- Anthony O Gramolini
- Banting and Best Department of Medical Research, University of Toronto, Charles H. Best Institute, 112 College Street, Toronto, ON, Canada M5G 1L6
|
125
|
Jacobs JM, Mottaz HM, Yu LR, Anderson DJ, Moore RJ, Chen WNU, Auberry KJ, Strittmatter EF, Monroe ME, Thrall BD, Camp DG, Smith RD. Multidimensional proteome analysis of human mammary epithelial cells. J Proteome Res 2004; 3:68-75. [PMID: 14998165 DOI: 10.1021/pr034062a] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.9] [Indexed: 11/28/2022]
Abstract
Recent multidimensional liquid chromatography MS/MS studies have contributed to the identification of large numbers of expressed proteins for numerous species. The present study couples size exclusion chromatography of intact proteins with the separation of tryptically digested peptides using a combination of strong cation exchange and high resolution, reversed phase capillary chromatography to identify proteins extracted from human mammary epithelial cells (HMECs). In addition to conventional conservative criteria for protein identifications, the confidence levels were additionally increased through the use of peptide normalized elution times (NET) for the liquid chromatographic separation step. The combined approach resulted in a total of 5838 unique peptides identified covering 1574 different proteins with an estimated 4% gene coverage of the human genome, as annotated by the National Center for Biotechnology Information (NCBI). This database provides a baseline for comparison against variations in other genetically and environmentally perturbed systems. Proteins identified were categorized based upon intracellular location and biological process with the identification of numerous receptors, regulatory proteins, and extracellular proteins, demonstrating the usefulness of this application in the global analysis of human cells for future comparative studies.
Affiliation(s)
- Jon M Jacobs
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
|
126
|
Radulovic D, Jelveh S, Ryu S, Hamilton TG, Foss E, Mao Y, Emili A. Informatics platform for global proteomic profiling and biomarker discovery using liquid chromatography-tandem mass spectrometry. Mol Cell Proteomics 2004; 3:984-97. [PMID: 15269249 DOI: 10.1074/mcp.m400061-mcp200] [Citation(s) in RCA: 171] [Impact Index Per Article: 8.1] [Indexed: 11/06/2022]
Abstract
We have developed an integrated suite of algorithms, statistical methods, and computer applications to support large-scale LC-MS-based gel-free shotgun profiling of complex protein mixtures using basic experimental procedures. The programs automatically detect and quantify large numbers of peptide peaks in feature-rich ion mass chromatograms, compensate for spurious fluctuations in peptide signal intensities and retention times, and reliably match related peaks across many different datasets. Application of this toolkit markedly facilitates pattern recognition and biomarker discovery in global comparative proteomic studies, simplifying mechanistic investigation of physiological responses and the detection of proteomic signatures of disease.
|
127
|
Carr S, Aebersold R, Baldwin M, Burlingame A, Clauser K, Nesvizhskii A. The Need for Guidelines in Publication of Peptide and Protein Identification Data. Mol Cell Proteomics 2004; 3:531-3. [PMID: 15075378 DOI: 10.1074/mcp.t400006-mcp200] [Citation(s) in RCA: 396] [Impact Index Per Article: 18.9] [Indexed: 01/07/2023]
|
128
|
Pan Y, Kislinger T, Gramolini AO, Zvaritch E, Kranias EG, MacLennan DH, Emili A. Identification of biochemical adaptations in hyper- or hypocontractile hearts from phospholamban mutant mice by expression proteomics. Proc Natl Acad Sci U S A 2004; 101:2241-6. [PMID: 14982994 PMCID: PMC356935 DOI: 10.1073/pnas.0308174101] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
Phospholamban (PLN) is a critical regulator of cardiac contractility through its binding to and regulation of the activity of the sarco(endo)plasmic reticulum Ca2+ ATPase. To uncover biochemical adaptations associated with extremes of cardiac muscle contractility, we used high-throughput gel-free tandem MS to monitor differences in the relative abundance of membrane proteins in standard microsomal fractions isolated from the hearts of PLN-null mice (PLN-KO) with high contractility and from transgenic mice overexpressing a superinhibitory PLN mutant in a PLN-null background (I40A-KO) with diminished contractility. Significant differential expression was detected for a subset of the 782 proteins identified, including known membrane-associated biomarkers, components of signaling pathways, and previously uninvestigated proteins. Proteins involved in fat and carbohydrate metabolism and proteins linked to G protein-signaling pathways activating protein kinase C were enriched in I40A-KO cardiac muscle, whereas proteins linked to enhanced contractile function were enriched in PLN-KO mutant hearts. These data demonstrate that Ca2+ dysregulation, leading to elevated or depressed cardiac contractility, induces compensatory biochemical responses.
Affiliation(s)
- Yan Pan
- Banting and Best Department of Medical Research, University of Toronto, Toronto, ON, Canada M5G 1L6
|
129
|
Nisar S, Lane CS, Wilderspin AF, Welham KJ, Griffiths WJ, Patterson LH. A proteomic approach to the identification of cytochrome P450 isoforms in male and female rat liver by nanoscale liquid chromatography-electrospray ionization-tandem mass spectrometry. Drug Metab Dispos 2004; 32:382-6. [PMID: 15039290 DOI: 10.1124/dmd.32.4.382] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
Nanoscale reversed-phase liquid chromatography (LC) combined with electrospray ionization-tandem mass spectrometry (ESI-MS/MS) has been used as a method for the direct identification of multiple cytochrome P450 (P450) isoforms found in male and female rat liver. In this targeted proteomic approach, rat liver microsomes were subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis followed by in-gel tryptic digestion of the proteins present in the 48- to 62-kDa bands. The resultant peptides were extracted and analyzed by LC-ESI-MS/MS. P450 identifications were made by searching the MS/MS data against a rat protein database containing 21,576 entries including 47 P450s using Sequest software (Thermo Electron, Hemel Hempstead, UK). Twenty-four P450 isoforms from the subfamilies 1A, 2A, 2B, 2C, 2D, 2E, 3A, 4A, 4F, CYP17, and CYP19 were positively identified in rat liver.
Affiliation(s)
- S Nisar
- Department of Pharmaceutical & Biological Chemistry, The School of Pharmacy, University of London, 29-39 Brunswick Square, London, United Kingdom
|
130
|
Arnold RJ, Hrncirova P, Annaiah K, Novotny MV. Fast Proteolytic Digestion Coupled with Organelle Enrichment for Proteomic Analysis of Rat Liver. J Proteome Res 2004; 3:653-7. [PMID: 15253449 DOI: 10.1021/pr034110r] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 01/17/2023]
Abstract
The use of an acid-labile surfactant as an alternative to urea denaturation allows for same-day proteolytic digestion and fast cleanup of cellular lysate samples. Homogenized rat liver tissue was separated into four fractions enriched in nuclei, mitochondria, microsomes (remaining organelles), and cytosol. Each subcellular fraction was then subjected to proteolytic digestion with trypsin for 2 h after denaturing with an acid-labile surfactant (ALS), separated by nanoflow reversed phase HPLC, and mass analyzed by tandem mass spectrometry in a 3-D ion trap. The results obtained from ALS denaturation for both organelle enrichment and whole cell lysate samples were comparable to those obtained from aliquots of the same samples treated by reduction, alkylation, and urea denaturation. Each method resulted in a similar number of peptides (694 for urea, 674 for ALS) and proteins (225 for urea, 229 for ALS) identified, with generally the same proteins (47% overlap) identified. As expected, organelle enrichment enabled the identification of more proteins (66% more with urea, 60% more with ALS) compared to a whole cell lysate. With organelle enrichment, the number of proteins with equal or increased sequence coverage went up by 73% with urea and 67% with ALS compared to the whole cell lysate. Additional information regarding the subcellular location of many proteins is obtained by organelle enrichment. While organelle enrichment is demonstrated with a bottom-up proteomics approach, it should be easily amenable to top-down proteomics approaches.
Affiliation(s)
- Randy J Arnold
- Proteomics R&D Facility, Department of Chemistry, Indiana University, Bloomington, Indiana 47405, USA
|
131
|
Sadygov RG, Liu H, Yates JR. Statistical Models for Protein Validation Using Tandem Mass Spectral Data and Protein Amino Acid Sequence Databases. Anal Chem 2004; 76:1664-71. [PMID: 15018565 DOI: 10.1021/ac035112y] [Citation(s) in RCA: 102] [Impact Index Per Article: 4.9] [Indexed: 11/29/2022]
Abstract
The purpose of this work is to develop and verify statistical models for protein identification using peptide identifications derived from the results of tandem mass spectral database searches. Recently we have presented a probabilistic model for peptide identification that uses hypergeometric distribution to approximate fragment ion matches of database peptide sequences to experimental tandem mass spectra. Here we apply statistical models to the database search results to validate protein identifications. For this we formulate the protein identification problem in terms of two independent models, two-hypothesis binomial and multinomial models, which use the hypergeometric probabilities and cross-correlation scores, respectively. Each database search result is assumed to be a probabilistic event. The Bernoulli event has two outcomes: a protein is either identified or not. The probability of identifying a protein at each Bernoulli event is determined from relative length of the protein in the database (the null hypothesis) or the hypergeometric probability scores of the protein's peptides (the alternative hypothesis). We then calculate the binomial probability that the protein will be observed a certain number of times (number of database matches to its peptides) given the size of the data set (number of spectra) and the probability of protein identification at each Bernoulli event. The ratio of the probabilities from these two hypotheses (maximum likelihood ratio) is used as a test statistic to discriminate between true and false identifications. The significance and confidence levels of protein identifications are calculated from the model distributions. The multinomial model combines the database search results and generates an observed frequency distribution of cross-correlation scores (grouped into bins) between experimental spectra and identified amino acid sequences. The frequency distribution is used to generate p-value probabilities of each score bin. 
The probabilities are then normalized with respect to score bins to generate normalized probabilities of all score bins. A protein identification probability is the multinomial probability of observing the given set of peptide scores. To reduce the effect of random matches, we employ a marginalized multinomial model for small values of cross-correlation scores. We demonstrate that the combination of the two independent methods provides a useful tool for protein identification from results of database search using tandem mass spectra. A receiver operating characteristic curve demonstrates the sensitivity and accuracy level of the approach. The shortcomings of the models are related to the cases when protein assignment is based on unusual peptide fragmentation patterns that dominate over the model encoded in the peptide identification process. We have implemented the approach in a program called PROT_PROBE.
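The two-hypothesis binomial model in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the null per-event probability is taken as the protein's share of total database length, the alternative probability `p_alt` is a hypothetical input (the abstract does not give the formula that derives it from hypergeometric peptide scores), and the ratio is written as alternative over null. It is not the PROT_PROBE implementation.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli events."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def null_probability(protein_length, database_length):
    """Null hypothesis: a random match hits a protein in proportion to its
    relative length in the database."""
    return protein_length / database_length


def likelihood_ratio(k, n, p_null, p_alt):
    """Ratio of binomial probabilities under the alternative and null
    hypotheses for a protein matched k times among n spectra.

    p_alt is a hypothetical stand-in for a probability derived from the
    peptides' hypergeometric scores.
    """
    null_likelihood = binomial_pmf(k, n, p_null)
    if null_likelihood == 0.0:
        return float("inf")  # the null cannot explain the observation
    return binomial_pmf(k, n, p_alt) / null_likelihood


if __name__ == "__main__":
    # Hypothetical numbers: a 500-residue protein in a 1,000,000-residue
    # database, matched 4 times among 1000 spectra.
    p0 = null_probability(500, 1_000_000)
    print(likelihood_ratio(k=4, n=1000, p_null=p0, p_alt=0.004))
```

A large ratio indicates that the observed number of matches is far better explained by a genuine identification than by length-proportional random matching.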
Affiliation(s)
- Rovshan G Sadygov
- Department of Cell Biology, The Scripps Research Institute, La Jolla, California 92037, USA
|
132
|
Nesvizhskii AI, Aebersold R. Analysis, statistical validation and dissemination of large-scale proteomics datasets generated by tandem MS. Drug Discov Today 2004; 9:173-81. [PMID: 14960397 DOI: 10.1016/s1359-6446(03)02978-7] [Citation(s) in RCA: 123] [Impact Index Per Article: 5.9] [Indexed: 11/27/2022]
Abstract
Tandem mass spectrometry has been used increasingly for high-throughput analysis of complex protein samples. A major challenge lies in the consistent, objective and transparent analysis of the large amounts of data generated by such experiments and in their dissemination and publication. Here, we review currently available computational tools and discuss the need for statistical criteria in the analysis of large proteomics datasets.
|
133
|
Coppinger JA, Cagney G, Toomey S, Kislinger T, Belton O, McRedmond JP, Cahill DJ, Emili A, Fitzgerald DJ, Maguire PB. Characterization of the proteins released from activated platelets leads to localization of novel platelet proteins in human atherosclerotic lesions. Blood 2003; 103:2096-104. [PMID: 14630798 DOI: 10.1182/blood-2003-08-2804] [Citation(s) in RCA: 594] [Impact Index Per Article: 27.0] [Indexed: 02/06/2023]
Abstract
Proteins secreted by activated platelets can adhere to the vessel wall and promote the development of atherosclerosis and thrombosis. Despite this biologic significance, however, the complement of proteins comprising the platelet releasate is largely unknown. Using a proteomics approach, we have identified more than 300 proteins released by human platelets following thrombin activation. Many of the proteins identified were not previously attributed to platelets, including secretogranin III, a potential monocyte chemoattractant precursor; cyclophilin A, a vascular smooth muscle cell growth factor; calumenin, an inhibitor of the vitamin K epoxide reductase-warfarin interaction, as well as proteins of unknown function that map to expressed sequence tags. Secretogranin III, cyclophilin A, and calumenin were confirmed to localize in platelets and to be released upon activation. Furthermore, while absent in normal vasculature, they were identified in human atherosclerotic lesions. Therefore, these and other proteins released from platelets may contribute to atherosclerosis and to the thrombosis that complicates the disease. Moreover, as soluble extracellular proteins, they may prove suitable as novel therapeutic targets.
Affiliation(s)
- Judith A Coppinger
- Department of Clinical Pharmacology, Royal College of Surgeons in Ireland, 123 St Stephen's Green, Dublin 2, Ireland
|
134
|
Cagney G, Amiri S, Premawaradena T, Lindo M, Emili A. In silico proteome analysis to facilitate proteomics experiments using mass spectrometry. Proteome Sci 2003; 1:5. [PMID: 12946274 PMCID: PMC194173 DOI: 10.1186/1477-5956-1-5] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.5] [Received: 04/23/2003] [Accepted: 08/13/2003] [Indexed: 11/10/2022]
Abstract
Proteomics experiments typically involve protein or peptide separation steps coupled to the identification of many hundreds to thousands of peptides by mass spectrometry. Development of methodology and instrumentation in this field is proceeding rapidly, and effective software is needed to link the different stages of proteomic analysis. We have developed an application, proteogest, written in Perl that generates descriptive and statistical analyses of the biophysical properties of multiple (e.g. thousands) protein sequences submitted by the user, for instance protein sequences inferred from the complete genome sequence of a model organism. The application also carries out in silico proteolytic digestion of the submitted proteomes, or subsets thereof, and the distribution of biophysical properties of the resulting peptides is presented. proteogest is customizable, the user being able to select many options, for instance the cleavage pattern of the digestion treatment or the presence of modifications to specific amino acid residues. We show how proteogest can be used to compare the proteomes and digested proteome products of model organisms, to examine the added complexity generated by modification of residues, and to facilitate the design of proteomics experiments for optimal representation of component proteins.
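The in silico digestion step described in this abstract can be illustrated with a short sketch. proteogest itself is written in Perl and its options are not reproduced here; the Python below is a hypothetical stand-in that assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P) and reports only peptide length as a simple example of a peptide property.

```python
from collections import Counter


def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in silico tryptic digest.

    Assumed rule: cleave C-terminal to K or R, except when the following
    residue is P. Up to `missed_cleavages` sites may be skipped.
    """
    if not sequence:
        return []
    # Positions where the chain is cut, including both termini.
    sites = [0]
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        for skipped in range(missed_cleavages + 1):
            end = start + 1 + skipped
            if end < len(sites):
                peptides.append(sequence[sites[start]:sites[end]])
    return peptides


def length_distribution(peptides):
    """Count peptides by length, one simple descriptive statistic."""
    return Counter(len(peptide) for peptide in peptides)


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    peptides = tryptic_digest("AKRPCKD", missed_cleavages=1)
    print(peptides)
    print(length_distribution(peptides))
```

Changing the residue set or the proline exception models a different cleavage pattern, which is the kind of customization the abstract attributes to the tool.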
Affiliation(s)
- Gerard Cagney
- Program in Proteomics and Bioinformatics, Banting and Best Department of Medical Research, University of Toronto, Toronto, Canada
- Present Address: Department of Clinical Pharmacology, Royal College of Surgeons, 123 Saint Stephen's Green, Dublin 2, Ireland
- Shiva Amiri
- Program in Proteomics and Bioinformatics, Banting and Best Department of Medical Research, University of Toronto, Toronto, Canada
- Thanuja Premawaradena
- Program in Proteomics and Bioinformatics, Banting and Best Department of Medical Research, University of Toronto, Toronto, Canada
- Micheal Lindo
- Program in Proteomics and Bioinformatics, Banting and Best Department of Medical Research, University of Toronto, Toronto, Canada
- Andrew Emili
- Program in Proteomics and Bioinformatics, Banting and Best Department of Medical Research, University of Toronto, Toronto, Canada
- Department of Molecular and Medical Genetics, University of Toronto, Toronto, Canada
|