Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 177 (from Reference Citation Analysis)
Article PDFs: 35
Cited by ≥ 1: 91
Searched Name: Proteomics/statistics & numerical data

Journal Articles

Number	Citation Analysis
51	Hugo A, Baxter DJ, Cannon WR, Kalyanaraman A, Kulkarni G, Callister SJ. Proteotyping of microbial communities by optimization of tandem mass spectrometry data interpretation. Pac Symp Biocomput 2012:225-234. [PMID: 22174278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract: We report the development of a novel high performance computing method for the identification of proteins from unknown (environmental) samples. The method uses computational optimization to provide an effective way to control the false discovery rate for environmental samples and complements de novo peptide sequencing. Furthermore, the method provides information based on the expressed protein in a microbial community, and thus complements DNA-based identification methods. Testing on blind samples demonstrates that the method provides 79-95% overlap with analogous results from searches involving only the correct genomes. We provide scaling and performance evaluations for the software that demonstrate the ability to carry out large-scale optimizations on 1258 genomes containing 4.2M proteins.
MeSH Headings: Computational Biology; Computing Methodologies; Data Interpretation, Statistical; Likelihood Functions; Microbiota/genetics; Proteins/genetics; Proteins/isolation & purification; Proteome/genetics; Proteome/isolation & purification; Proteomics/statistics & numerical data; Software; Tandem Mass Spectrometry/statistics & numerical data
52	Schäfer M, Lkhagvasuren O, Klein HU, Elling C, Wüstefeld T, Müller-Tidow C, Zender L, Koschmieder S, Dugas M, Ickstadt K. Integrative analyses for omics data: a Bayesian mixture model to assess the concordance of ChIP-chip and ChIP-seq measurements. J Toxicol Environ Health A 2012;75:461-470. [PMID: 22686305 DOI: 10.1080/15287394.2012.674914] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 06/01/2023]
Abstract: The analysis of different variations in genomics, transcriptomics, epigenomics, and proteomics has increased considerably in recent years. This is especially due to the success of microarray and, more recently, sequencing technology. Apart from understanding mechanisms of disease pathogenesis on a molecular basis, for example in cancer research, the challenge of analyzing such different data types in an integrated way has become increasingly important also for the validation of new sequencing technologies with maximum resolution. For this purpose, a methodological framework for their comparison with microarray techniques in the context of smallest sample sizes, which result from the high costs of experiments, is proposed in this contribution. Based on an adaptation of the externally centered correlation coefficient (Schäfer et al. 2009), it is demonstrated how a Bayesian mixture model can be applied to compare and classify measurements of histone acetylation that stem from chromatin immunoprecipitation combined with either microarray (ChIP-chip) or sequencing techniques (ChIP-seq) for the identification of DNA fragments. Here, the murine hematopoietic cell line 32D, which was transduced with the oncogene BCR-ABL, the hallmark of chronic myeloid leukemia, was characterized. Cells were compared to mock-transduced cells as control. Activation or inhibition of other genes by histone modifications induced by the oncogene is considered critical in such a context for the understanding of the disease.
MeSH Headings: Algorithms; Animals; Bayes Theorem; Capillary Electrochromatography; Chromatin Immunoprecipitation; DNA/chemistry; DNA/genetics; Data Interpretation, Statistical; Epigenomics/methods; Epigenomics/statistics & numerical data; Fusion Proteins, bcr-abl/genetics; Genomics/methods; Genomics/statistics & numerical data; Hematopoietic Stem Cells/metabolism; Histones/genetics; Histones/metabolism; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics; Markov Chains; Mice; Microarray Analysis; Models, Statistical; Monte Carlo Method; Oncogenes/genetics; Proteomics/methods; Proteomics/statistics & numerical data; Sample Size; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/statistics & numerical data; Transduction, Genetic
53	Oeltze S, Freiler W, Hillert R, Doleisch H, Preim B, Schubert W. Interactive, graph-based visual analysis of high-dimensional, multi-parameter fluorescence microscopy data in toponomics. IEEE Trans Vis Comput Graph 2011;17:1882-1891. [PMID: 22034305 DOI: 10.1109/tvcg.2011.217] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract: In Toponomics, the function protein pattern in cells or tissue (the toponome) is imaged and analyzed for applications in toxicology, new drug development and patient-drug-interaction. The most advanced imaging technique is robot-driven multi-parameter fluorescence microscopy. This technique is capable of co-mapping hundreds of proteins and their distribution and assembly in protein clusters across a cell or tissue sample by running cycles of fluorescence tagging with monoclonal antibodies or other affinity reagents, imaging, and bleaching in situ. The imaging results in complex multi-parameter data composed of one slice or a 3D volume per affinity reagent. Biologists are particularly interested in the localization of co-occurring proteins, the frequency of co-occurrence and the distribution of co-occurring proteins across the cell. We present an interactive visual analysis approach for the evaluation of multi-parameter fluorescence microscopy data in toponomics. Multiple, linked views facilitate the definition of features by brushing multiple dimensions. The feature specification result is linked to all views establishing a focus+context visualization in 3D. In a new attribute view, we integrate techniques from graph visualization. Each node in the graph represents an affinity reagent while each edge represents two co-occurring affinity reagent bindings. The graph visualization is enhanced by glyphs which encode specific properties of the binding. The graph view is equipped with brushing facilities. By brushing in the spatial and attribute domain, the biologist achieves a better understanding of the function protein patterns of a cell. Furthermore, an interactive table view is integrated which summarizes unique fluorescence patterns. We discuss our approach with respect to a cell probe containing lymphocytes and a prostate tissue section.
MeSH Headings: Computer Graphics; Data Interpretation, Statistical; Humans; Imaging, Three-Dimensional/statistics & numerical data; Lymphocytes/metabolism; Male; Microscopy, Fluorescence/statistics & numerical data; Neoplasm Proteins/metabolism; Prostatic Neoplasms/metabolism; Proteomics/statistics & numerical data
54	Ng SK, Tan SH. Discovering protein–protein interactions. J Bioinform Comput Biol 2011;1:711-41. [PMID: 15290761 DOI: 10.1142/s0219720004000600] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 10/10/2003] [Revised: 12/12/2003] [Accepted: 12/13/2003] [Indexed: 11/18/2022]
Abstract: The ongoing genomics and proteomics efforts have helped identify many new genes and proteins in living organisms. However, simply knowing the existence of genes and proteins does not tell us much about the biological processes in which they participate. Many major biological processes are controlled by protein interaction networks. A comprehensive description of protein–protein interactions is therefore necessary to understand the genetic program of life. In this tutorial, we provide an overview of the various current high-throughput methods for discovering protein–protein interactions, covering both the conventional experimental methods and new computational approaches.
MeSH Headings: Artificial Gene Fusion; Chromatography, Affinity; Computational Biology; Databases, Protein; Gene Expression Profiling/statistics & numerical data; Macromolecular Substances; Mass Spectrometry; Models, Biological; Peptide Library; Phylogeny; Protein Array Analysis/statistics & numerical data; Protein Binding; Proteins/chemistry; Proteins/metabolism; Proteomics/statistics & numerical data; RNA, Messenger/genetics; Two-Hybrid System Techniques/statistics & numerical data
55	Berg D, Wolff C, Langer R, Schuster T, Feith M, Slotta-Huspenina J, Malinowsky K, Becker KF. Discovery of new molecular subtypes in oesophageal adenocarcinoma. PLoS One 2011;6:e23985. [PMID: 21966358 PMCID: PMC3179464 DOI: 10.1371/journal.pone.0023985] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 03/10/2011] [Accepted: 07/28/2011] [Indexed: 12/22/2022]
Abstract: A large number of patients suffering from oesophageal adenocarcinomas do not respond to conventional chemotherapy; therefore, it is necessary to identify new predictive biomarkers and patient signatures to improve patient outcomes and therapy selections. We analysed 87 formalin-fixed and paraffin-embedded (FFPE) oesophageal adenocarcinoma tissue samples with a reverse phase protein array (RPPA) to examine the expression of 17 cancer-related signalling molecules. Protein expression levels were analysed by unsupervised hierarchical clustering and correlated with clinicopathological parameters and overall patient survival. Proteomic analyses revealed a new, very promising molecular subtype of oesophageal adenocarcinoma patients characterised by low levels of the HSP27 family proteins and high expression of those of the HER family with positive lymph nodes, distant metastases and short overall survival. After confirmation in other independent studies, our results could be the foundation for the development of a Her2-targeted treatment option for this new patient subgroup of oesophageal adenocarcinoma.
MeSH Headings: Adenocarcinoma/metabolism; Adenocarcinoma/pathology; Adult; Aged; Aged, 80 and over; Biomarkers, Tumor/analysis; Cluster Analysis; Esophageal Neoplasms/metabolism; Esophageal Neoplasms/pathology; Female; HSP27 Heat-Shock Proteins/analysis; Humans; Kaplan-Meier Estimate; Lymphatic Metastasis; Male; Middle Aged; Prognosis; Proportional Hazards Models; Proteome/analysis; Proteome/classification; Proteomics/methods; Proteomics/statistics & numerical data; Receptor, ErbB-2/analysis; Tissue Array Analysis
56	Dunn MJ. Best practice in statistical reporting. Proteomics 2011;11:2361. [PMID: 21648086 DOI: 10.1002/pmic.201190051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
MeSH Headings: Data Interpretation, Statistical; Proteomics/methods; Proteomics/statistics & numerical data
57	Arabnia HR, Tran QN. Improved prediction of MHC class I binders/non-binders peptides through artificial neural network using variable learning rate: SARS corona virus, a case study. Adv Exp Med Biol 2011;696:223-9. [PMID: 21431562 PMCID: PMC7123181 DOI: 10.1007/978-1-4419-7046-6_22] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 04/09/2023]
Abstract: Fundamental step of an adaptive immune response to pathogen or vaccine is the binding of short peptides (also called epitopes) to major histocompatibility complex (MHC) molecules. The various prediction algorithms are being used to capture the MHC peptide binding preference, allowing the rapid scan of entire pathogen proteomes for peptide likely to bind MHC, saving the cost, effort, and time. However, the number of known binders/non-binders (BNB) to a specific MHC molecule is limited in many cases, which still poses a computational challenge for prediction. The training data should be adequate to predict BNB using any machine learning approach. In this study, variable learning rate has been demonstrated for training artificial neural network and predicting BNB for small datasets. The approach can be used for large datasets as well. The dataset for different MHC class I alleles for SARS Corona virus (Tor2 Replicase polyprotein 1ab) has been used for training and prediction of BNB. A total of 90 datasets (nine different MHC class I alleles with tenfold cross validation) have been retrieved from IEDB database for BNB. For fixed learning rate approach, the best value of AROC is 0.65, and in most of the cases it is 0.5, which shows the poor predictions. In case of variable learning rate, of the 90 datasets the value of AROC for 76 datasets is between 0.806 and 1.0 and for 7 datasets the value is between 0.7 and 0.8 and for rest of 7 datasets it is between 0.5 and 0.7, which indicates very good performance in most of the cases.
MeSH Headings: Algorithms; Alleles; Artificial Intelligence; Binding Sites; Computational Biology; Databases, Genetic; Genes, MHC Class I; Histocompatibility Antigens Class I/genetics; Histocompatibility Antigens Class I/metabolism; Humans; Neural Networks, Computer; Protein Binding; Proteomics/statistics & numerical data; Severe acute respiratory syndrome-related coronavirus/genetics; Severe acute respiratory syndrome-related coronavirus/immunology; Severe acute respiratory syndrome-related coronavirus/metabolism
58	Rubakhin SS, Romanova EV, Nemes P, Sweedler JV. Profiling metabolites and peptides in single cells. Nat Methods 2011;8:S20-9. [PMID: 21451513 PMCID: PMC3312877 DOI: 10.1038/nmeth.1549] [Citation(s) in RCA: 264] [Impact Index Per Article: 20.3] [Indexed: 12/28/2022]
Abstract: The intracellular levels and spatial localizations of metabolites and peptides reflect the state of a cell and its relationship to its surrounding environment. Moreover, the amounts and dynamics of metabolites and peptides are indicative of normal or pathological cellular conditions. Here we highlight established and evolving strategies for characterizing the metabolome and peptidome of single cells. Focused studies of the chemical composition of individual cells and functionally defined groups of cells promise to provide a greater understanding of cell fate, function and homeostatic balance. Single-cell bioanalytical microanalysis has also become increasingly valuable for examining cellular heterogeneity, particularly in the fields of neuroscience, stem cell biology and developmental biology.
MeSH Headings: Animals; Cell Separation; Chromatography; Data Interpretation, Statistical; Electrophoresis; Humans; Magnetic Resonance Spectroscopy; Mass Spectrometry; Metabolome; Metabolomics/methods; Metabolomics/statistics & numerical data; Microfluidic Analytical Techniques; Proteome; Proteomics/methods; Proteomics/statistics & numerical data; Single-Cell Analysis/methods; Single-Cell Analysis/statistics & numerical data; Single-Cell Analysis/trends
Grants: P30 DA018310 NIDA NIH HHS; R01 DE018866 NIDCR NIH HHS; R01 DE018866-04 NIDCR NIH HHS; R01 NS031609 NINDS NIH HHS; 5R01NS031609 NINDS NIH HHS; P30 DA018310-08 NIDA NIH HHS; R01 NS031609-15 NINDS NIH HHS; 5R01DE018866 NIDCR NIH HHS
59	ten Have S, Boulon S, Ahmad Y, Lamond AI. Mass spectrometry-based immuno-precipitation proteomics - the user's guide. Proteomics 2011;11:1153-9. [PMID: 21365760 PMCID: PMC3708439 DOI: 10.1002/pmic.201000548] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Received: 08/31/2010] [Revised: 12/07/2010] [Accepted: 12/10/2010] [Indexed: 11/07/2022]
Abstract: Immuno-precipitation (IP) experiments using MS provide a sensitive and accurate way of characterising protein complexes and their response to regulatory mechanisms. Differences in stoichiometry can be determined as well as the reliable identification of specific binding partners. The quality control of IP and protein interaction studies has its basis in the biology that is being observed. Is that unusual protein identification a genuine novelty, or an experimental irregularity? Antibodies and the solid matrices used in these techniques isolate not only the target protein and its specific interaction partners but also many non-specific 'contaminants' requiring a structured analysis strategy. These methodological developments and the speed and accuracy of MS machines, which has been increasing consistently in the last 5 years, have expanded the number of proteins identified and complexity of analysis. The European Science Foundation's Frontiers in Functional Genomics programme 'Quality Control in Proteomics' Workshop provided a forum for disseminating knowledge and experience on this subject. Our aim in this technical brief is to outline clearly, for the scientists wanting to carry out this kind of experiment, and recommend what, in our experience, are the best potential ways to design an IP experiment, to help identify possible pitfalls, discuss important controls and outline how to manage and analyse the large amount of data generated. Detailed experimental methodologies have been referenced but not described in the form of protocols.
Key Words: cell biology; cumulative analysis; immuno-precipitation; protein frequency; quality control; SILAC
MeSH Headings: Data Interpretation, Statistical; Humans; Immunoprecipitation/methods; Immunoprecipitation/standards; Immunoprecipitation/statistics & numerical data; Mass Spectrometry/methods; Mass Spectrometry/standards; Mass Spectrometry/statistics & numerical data; Protein Interaction Mapping/statistics & numerical data; Proteins/isolation & purification; Proteomics/methods; Proteomics/standards; Proteomics/statistics & numerical data; Quality Control
Grants: 097945 Wellcome Trust; G0301131 Medical Research Council; 073980 Wellcome Trust; 083524 Wellcome Trust; 081361 Wellcome Trust; G0801738 Medical Research Council; 037538 Wellcome Trust; 073980/Z/03/Z Wellcome Trust; Wellcome Trust; C12944 Biotechnology and Biological Sciences Research Council; C08577 Biotechnology and Biological Sciences Research Council
60	Halligan BD, Greene AS. Visualize: a free and open source multifunction tool for proteomics data analysis. Proteomics 2011;11:1058-63. [PMID: 21365761 PMCID: PMC3816356 DOI: 10.1002/pmic.201000556] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 09/08/2010] [Revised: 11/19/2010] [Accepted: 11/29/2010] [Indexed: 12/25/2022]
Abstract: A major challenge in the field of high-throughput proteomics is the conversion of the large volume of experimental data that is generated into biological knowledge. Typically, proteomics experiments involve the combination and comparison of multiple data sets and the analysis and annotation of these combined results. Although there are some commercial applications that provide some of these functions, there is a need for a free, open source, multifunction tool for advanced proteomics data analysis. We have developed the Visualize program that provides users with the abilities to visualize, analyze, and annotate proteomics data; combine data from multiple runs, and quantitate differences between individual runs and combined data sets. Visualize is licensed under GNU GPL and can be downloaded from http://proteomics.mcw.edu/visualize. It is available as compiled client-based executable files for both Windows and Mac OS X platforms as well as PERL source code.
Key Words: bioinformatics; protein-automated identification; quantitative analysis; software
MeSH Headings: Algorithms; Amino Acid Sequence; Computational Biology; Computer Simulation; Data Interpretation, Statistical; Databases, Protein/statistics & numerical data; Humans; Mass Spectrometry/statistics & numerical data; Protein Array Analysis/statistics & numerical data; Proteins/chemistry; Proteins/isolation & purification; Proteomics/statistics & numerical data; Software
Grants: N01 HV028182 NHLBI NIH HHS; N01HV28182 NHLBI NIH HHS; N01-HV-28182 NHLBI NIH HHS
61	Gough NR, Yaffe MB. Focus issue: conquering the data mountain. Sci Signal 2011;4:eg2. [PMID: 21325201 DOI: 10.1126/scisignal.2001871] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/02/2022]
Abstract: High-throughput technologies have enabled a rapid increase in the acquisition of data regarding cellular regulation, such as protein-protein interactions, gene expression profiling, proteomic analyses of changes in protein abundance, and global analyses of posttranslational modifications. The challenge now is for the community to devise adequate standards for assessing reliability and annotation, facilities for storage, mechanisms for sharing, and tools for visualization and analysis. In conjunction with Science (http://www.sciencemag.org/special/data), this issue of Science Signaling tackles some of the key issues related to the data deluge faced by cell signaling researchers.
MeSH Headings: Database Management Systems/standards; Gene Expression Profiling/statistics & numerical data; Proteomics/statistics & numerical data; Signal Transduction
62	Jung K. Statistics in experimental design, preprocessing, and analysis of proteomics data. Methods Mol Biol 2011;696:259-272. [PMID: 21063953 DOI: 10.1007/978-1-60761-987-1_16] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/30/2023]
Abstract: High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), yield usually high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to receive tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.
MeSH Headings: Analysis of Variance; Confidence Intervals; Data Mining/statistics & numerical data; Databases, Protein/standards; Databases, Protein/statistics & numerical data; Proteomics/methods; Proteomics/standards; Proteomics/statistics & numerical data; Reference Standards; Research Design/statistics & numerical data; Statistics as Topic
63	Cooper B, Feng J, Garrett WM. Relative, label-free protein quantitation: spectral counting error statistics from nine replicate MudPIT samples. J Am Soc Mass Spectrom 2010;21:1534-46. [PMID: 20541435 DOI: 10.1016/j.jasms.2010.05.001] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Received: 01/29/2010] [Revised: 04/30/2010] [Accepted: 05/03/2010] [Indexed: 05/03/2023]
Abstract: Nine replicate samples of peptides from soybean leaves, each spiked with a different concentration of bovine apotransferrin peptides, were analyzed on a mass spectrometer using multidimensional protein identification technology (MudPIT). Proteins were detected from the peptide tandem mass spectra, and the numbers of spectra were statistically evaluated for variation between samples. The results corroborate prior knowledge that combining spectra from replicate samples increases the number of identifiable proteins and that a summed spectral count for a protein increases linearly with increasing molar amounts of protein. Furthermore, statistical analysis of spectral counts for proteins in two- and three-way comparisons between replicates and combined replicates revealed little significant variation arising from run-to-run differences or data-dependent instrument ion sampling that might falsely suggest differential protein accumulation. In these experiments, spectral counting was enabled by PANORAMICS, probability-based software that predicts proteins detected by sets of observed peptides. Three alternative approaches to counting spectra were also evaluated by comparison. As the counting thresholds were changed from weaker to more stringent, the accuracy of ratio determination also changed. These results suggest that thresholds for counting can be empirically set to improve relative quantitation. All together, the data confirm the accuracy and reliability of label-free spectral counting in the relative, quantitative analysis of proteins between samples.
MeSH Headings: Amino Acid Sequence; Animals; Apoproteins/chemistry; Artifacts; Artificial Intelligence; Cattle; Databases, Protein; Molecular Sequence Data; Pattern Recognition, Automated; Peptide Mapping/methods; Peptide Mapping/statistics & numerical data; Plant Extracts/chemistry; Plant Leaves/chemistry; Plant Proteins/chemistry; Proteomics/methods; Proteomics/statistics & numerical data; Reproducibility of Results; Sequence Analysis, Protein; Software; Glycine max/chemistry; Transferrin/chemistry
64	Suwa M, Ono Y. Computational overview of GPCR gene universe to support reverse chemical genomics study. Methods Mol Biol 2010;577:41-54. [PMID: 19718507 DOI: 10.1007/978-1-60761-232-2_4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 02/20/2023]
Abstract: In order to support high-throughput screening for ligands of G-protein coupled receptors (GPCRs) by using bioinformatics technology, we introduce a database (SEVENS) with genome-scale annotation and software (GRIFFIN) that can simulate GPCR function. SEVENS (http://sevens.cbrc.jp/) is an integrated database that includes GPCR genes that are identified with high accuracy (99.4% sensitivity and 96.6% specificity) from various types of genomes, by a pipeline that integrates such software as a gene finder, a sequence alignment tool, a motif and domain assignment tool, and a transmembrane helix (TMH) predictor. SEVENS provides the user a genome-scale overview of the "GPCR universe" with detailed information of chromosomal mapping, phylogenetic tree, protein sequence and structure, and experimental evidence, all of which are accessible via a user-friendly interface. GRIFFIN (http://griffin.cbrc.jp/) can predict GPCR and G-protein coupling selectivity induced by ligand binding with high sensitivity and specificity (more than 87% on average), based on the support vector machine (SVM) and hidden Markov Model (HMM). SEVENS and GRIFFIN are expected to contribute to revealing the function of orphan and unknown GPCRs.
MeSH Headings: Animals; Artificial Intelligence; Computational Biology; Databases, Genetic; Drug Design; Drug Evaluation, Preclinical/methods; Drug Evaluation, Preclinical/statistics & numerical data; Genomics/methods; Genomics/statistics & numerical data; High-Throughput Screening Assays/statistics & numerical data; Humans; Ligands; Markov Chains; Proteomics/methods; Proteomics/statistics & numerical data; Receptors, G-Protein-Coupled/chemistry; Receptors, G-Protein-Coupled/genetics; Receptors, G-Protein-Coupled/metabolism; Software
65	Caffrey RE. A review of experimental design best practices for proteomics based biomarker discovery: focus on SELDI-TOF. Methods Mol Biol 2010;641:167-183. [PMID: 20407947 DOI: 10.1007/978-1-60761-711-2_10] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 05/29/2023]
Abstract: Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry is a technique uniquely suited to the study of the urine proteome due to its salt tolerance, high-throughput, and small sample requirements. However, due to the extreme sensitivity of the technique, sample collection and storage conditions, as well as instrument protocols and analysis conditions, must be rigorously controlled to ensure that data generated and collected is accurate and free from artifacts. Robust and reproducible data sets can be generated and compared between clinical sites when experimental protocols are carefully standardized. This chapter aims to review known factors that cause irreproducible results so that the experiments may be designed with appropriate sample and process controls for successful biomarker discovery. A suggested protocol follows the review. A number of issues for study design are discussed and these are generally applicable to biomarker discovery experiments.
MeSH Headings: Analytic Sample Preparation Methods; Biomarkers/metabolism; Biomarkers/urine; Female; Humans; Lasers; Male; Mass Spectrometry/methods; Protein Array Analysis; Proteomics/methods; Proteomics/statistics & numerical data; Research Design
66	Huttenhower C, Myers CL, Hibbs MA, Troyanskaya OG. Computational analysis of the yeast proteome: understanding and exploiting functional specificity in genomic data. Methods Mol Biol 2009;548:273-93. [PMID: 19521830 DOI: 10.1007/978-1-59745-540-4_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/25/2023] Abstract Modern experimental techniques have produced a wealth of high-throughput data that has enabled the ongoing genomic revolution. As the field continues to integrate experimental and computational analyzes of this data, it is essential that performance evaluations of high-throughput results be carried out in a consistent and biologically informative manner. Here, we present an overview of evaluation techniques for high-throughput experimental data and computational methods, and we discuss a number of potential pitfalls in this process. These primarily involve the biological diversity of genomic data, which can be masked or misrepresented in overly simplified global evaluations. We describe systems for preserving information about biological context during dataset evaluation, which can help to ensure that multiple different evaluations are more directly comparable. This biological variety in high-throughput data can also be taken advantage of computationally through data integration and process specificity to produce richer systems-level predictions of cellular function. An awareness of these considerations can greatly improve the evaluation and analysis of any high-throughput experimental dataset. 
MeSH Headings: Computational Biology; Data Interpretation, Statistical; Databases, Genetic/standards; Databases, Genetic/statistics & numerical data; Databases, Protein/standards; Databases, Protein/statistics & numerical data; Genome, Fungal; Genomics/standards; Genomics/statistics & numerical data; Proteome; Proteomics/standards; Proteomics/statistics & numerical data; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Systems Biology
67	Eckel-Passow JE, Oberg AL, Therneau TM, Bergen HR. An insight into high-resolution mass-spectrometry data. Biostatistics 2009;10:481-500. [PMID: 19325168 PMCID: PMC2697344 DOI: 10.1093/biostatistics/kxp006] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 03/16/2007] [Revised: 03/12/2008] [Accepted: 02/23/2009] [Indexed: 11/15/2022]
Abstract: Mass spectrometry is a powerful tool with much promise in global proteomic studies. The discipline of statistics offers robust methodologies to extract and interpret high-dimensional mass-spectrometry data and will be a valuable contributor to the field. Here, we describe the process by which data are produced, characteristics of the data, and the analytical preprocessing steps that are taken in order to interpret the data and use it in downstream statistical analyses. Because of the complexity of data acquisition, statistical methods developed for gene expression microarray data are not directly applicable to proteomic data. Areas in need of statistical research for proteomic data include alignment, experimental design, abundance normalization, and statistical analysis.
Key Words: experimental design; Fourier transform; mass calibration; mass spectrometry; normalization
MeSH Headings: Algorithms; Biometry; Cyclotrons; Data Interpretation, Statistical; Fourier Analysis; Humans; Mass Spectrometry/statistics & numerical data; Peptides/chemistry; Proteins/chemistry; Proteomics/statistics & numerical data; Sequence Alignment/statistics & numerical data; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/statistics & numerical data; Tandem Mass Spectrometry/statistics & numerical data
Grants: R25 CA092049 NCI NIH HHS; R25 CA92049 NCI NIH HHS
68	Zheng G, Li H, Wang C, Sheng Q, Fan H, Yang S, Liu B, Dai J, Zeng R, Xie L. A platform to standardize, store, and visualize proteomics experimental data. Acta Biochim Biophys Sin (Shanghai) 2009;41:273-9. [PMID: 19352541 DOI: 10.1093/abbs/gmp010] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
Abstract: With the development of functional genomics research, large-scale proteomics studies are now widespread, presenting significant challenges for data storage, exchange, and analysis. Here we present the Integrated Proteomics Exploring Database (IPED) as a platform for managing proteomics experimental data (both process and result data). IPED is based on the schema of the Proteome Experimental Data Repository (PEDRo), and complies with the General Proteomics Standard (GPS) drafted by the Proteomics Standards Committee of the Human Proteome Organization. In our work, we developed three components for the IPED platform: the IPED client editor, IPED server software, and IPED web interface. The client editor collects experimental data and generates an extensible markup language (XML) data file compliant with PEDRo and GPS; the server software parses the XML data file and loads information into a core database; and the web interface displays experimental results, to provide a convenient graphic representation of data. Given software convenience and data abundance, IPED is a powerful platform for data exchange and presents an important resource for the proteomics community. In its current release, IPED is available at http://www.biosino.org/iped2.
MeSH Headings: Algorithms; Cells, Cultured; Computational Biology/methods; Databases, Protein; Electrophoresis, Gel, Two-Dimensional; Endothelial Cells/cytology; Endothelial Cells/drug effects; Endothelial Cells/metabolism; Humans; Hydroxymethylglutaryl-CoA Reductase Inhibitors/pharmacology; Internet; Lovastatin/pharmacology; Proteome/analysis; Proteomics/methods; Proteomics/statistics & numerical data; Software; User-Computer Interface
69	Webb-Robertson BJM, McCue LA, Beagley N, McDermott JE, Wunschel DS, Varnum SM, Hu JZ, Isern NG, Buchko GW, Mcateer K, Pounds JG, Skerrett SJ, Liggitt D, Frevert CW. A Bayesian integration model of high-throughput proteomics and metabolomics data for improved early detection of microbial infections. Pac Symp Biocomput 2009:451-63. [PMID: 19209722 PMCID: PMC4137860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract: High-throughput (HTP) technologies offer the capability to evaluate the genome, proteome, and metabolome of an organism at a global scale. This opens up new opportunities to define complex signatures of disease that involve signals from multiple types of biomolecules. However, integrating these data types is difficult due to the heterogeneity of the data. We present a Bayesian approach to integration that uses posterior probabilities to assign class memberships to samples using individual and multiple data sources; these probabilities are based on lower-level likelihood functions derived from standard statistical learning algorithms. We demonstrate this approach on microbial infections of mice, where the bronchial alveolar lavage fluid was analyzed by three HTP technologies, two proteomic and one metabolomic. We demonstrate that integration of the three datasets improves classification accuracy to approximately 89% from the best individual dataset at approximately 83%. In addition, we present a new visualization tool called Visual Integration for Bayesian Evaluation (VIBE) that allows the user to observe classification accuracies at the class level and evaluate classification accuracies on any subset of available data types based on the posterior probability models defined for the individual and integrated data.
MeSH Headings: Algorithms; Animals; Bayes Theorem; Biomarkers/metabolism; Biometry/methods; Data Interpretation, Statistical; Francisella/genetics; Francisella/pathogenicity; Genes, Bacterial; Gram-Negative Bacterial Infections/diagnosis; Gram-Negative Bacterial Infections/metabolism; Infections/diagnosis; Infections/metabolism; Least-Squares Analysis; Magnetic Resonance Spectroscopy; Male; Metabolomics/statistics & numerical data; Mice; Mice, Inbred C57BL; Models, Biological; Mutation; Proteomics/statistics & numerical data; Pseudomonas Infections/diagnosis; Pseudomonas Infections/metabolism; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Tandem Mass Spectrometry; Virulence/genetics
Grants: U54 AI057141 NIAID NIH HHS
70	Dudley JT, Butte AJ. Identification of discriminating biomarkers for human disease using integrative network biology. Pac Symp Biocomput 2009:27-38. [PMID: 19209693 PMCID: PMC2749008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract: There is a strong clinical imperative to identify discerning molecular biomarkers of disease to inform diagnosis, prognosis, and treatment. Ideally, such biomarkers would be drawn from peripheral sources non-invasively to reduce costs and lower potential for complication. Advances in high-throughput genomics and proteomics have vastly increased the space of prospective molecular biomarkers. Consequently, the elucidation of molecular biomarkers of clinical importance often entails a genome- or proteome-wide search for candidates. Here we present a novel framework for the identification of disease-specific protein biomarkers through the integration of biofluid proteomes and inter-disease genomic relationships using a network paradigm. We created a blood plasma biomarker network by linking expression-based genomic profiles from 136 diseases to 1,028 detectable blood plasma proteins. We also created a urine biomarker network by linking genomic profiles from 127 diseases to 577 proteins detectable in urine. Through analysis of these molecular biomarker networks, we find that the majority (> 80%) of putative protein biomarkers are linked to multiple disease conditions. Thus, prospective disease-specific protein biomarkers are found in only a small subset of the biofluid proteomes. These findings illustrate the importance of considering shared molecular pathology across diseases when evaluating biomarker specificity. The proposed framework is amenable to integration with complementary network models of biology, which could further constrain the biomarker candidate space, and establish a role for the understanding of multi-scale, inter-disease genomic relationships in biomarker discovery.
MeSH Headings: Biomarkers/blood; Biometry; Blood Proteins/genetics; Disease/genetics; Gene Expression Profiling/statistics & numerical data; Genomics/statistics & numerical data; Humans; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Proteomics/statistics & numerical data; Systems Biology
Grants: R01 GM079719 NIGMS NIH HHS; R01 LM009719 NLM NIH HHS; R01 LM009719-01A1 NLM NIH HHS
71	Mottaz-Brewer HM, Norbeck AD, Adkins JN, Manes NP, Ansong C, Shi L, Rikihisa Y, Kikuchi T, Wong SW, Estep RD, Heffron F, Pasa-Tolic L, Smith RD. Optimization of proteomic sample preparation procedures for comprehensive protein characterization of pathogenic systems. J Biomol Tech 2008;19:285-295. [PMID: 19183792 PMCID: PMC2628077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract: Mass spectrometry-based proteomics is a powerful analytical tool for investigating pathogens and their interactions within a host. The sensitivity of such analyses provides broad proteome characterization, but the sample-handling procedures must first be optimized to ensure compatibility with the technique and to maximize the dynamic range of detection. The decision-making process for determining optimal growth conditions, preparation methods, sample analysis methods, and data analysis techniques in our laboratory is discussed herein with consideration of the balance in sensitivity, specificity, and biomass losses during analysis of host-pathogen systems.
Key Words: mass spectrometry methods pathogens proteomics sample preparation
MeSH Headings: Anaplasma phagocytophilum/chemistry; Anaplasma phagocytophilum/pathogenicity; Animals; Biotechnology; Chromatography, High Pressure Liquid; Ehrlichia chaffeensis/chemistry; Ehrlichia chaffeensis/pathogenicity; HeLa Cells; Host-Pathogen Interactions; Humans; Mass Spectrometry/methods; Mass Spectrometry/statistics & numerical data; Monkeypox virus/chemistry; Monkeypox virus/physiology; Proteome/isolation & purification; Proteomics/methods; Proteomics/statistics & numerical data; Salmonella/chemistry; Salmonella/pathogenicity; Sensitivity and Specificity; Systems Biology/methods; Systems Biology/statistics & numerical data; Tandem Mass Spectrometry; Vaccinia virus/chemistry; Vaccinia virus/pathogenicity
Grants: P41 RR018522 NCRR NIH HHS; P41 RR018522-06 NCRR NIH HHS; Y1-AI-4894-01 NIAID NIH HHS; RR018522 NCRR NIH HHS
72	Schmidt A, Gehlenborg N, Bodenmiller B, Mueller LN, Campbell D, Mueller M, Aebersold R, Domon B. An integrated, directed mass spectrometric approach for in-depth characterization of complex peptide mixtures. Mol Cell Proteomics 2008;7:2138-50. [PMID: 18511481 PMCID: PMC2577211 DOI: 10.1074/mcp.m700498-mcp200] [Citation(s) in RCA: 122] [Impact Index Per Article: 7.6] [Received: 10/15/2007] [Revised: 04/25/2008] [Indexed: 11/06/2022]
Abstract: LC-MS/MS has emerged as the method of choice for the identification and quantification of protein sample mixtures. For very complex samples such as complete proteomes, the most commonly used LC-MS/MS method, data-dependent acquisition (DDA) precursor selection, is of limited utility. The limited scan speed of current mass spectrometers along with the highly redundant selection of the most intense precursor ions generates a bias in the pool of identified proteins toward those of higher abundance. A directed LC-MS/MS approach that alleviates the limitations of DDA precursor ion selection by decoupling peak detection and sequencing of selected precursor ions is presented. In the first stage of the strategy, all detectable peptide ion signals are extracted from high resolution LC-MS feature maps or aligned sets of feature maps. The selected features or a subset thereof are subsequently sequenced in sequential, non-redundant directed LC-MS/MS experiments, and the MS/MS data are mapped back to the original LC-MS feature map in a fully automated manner. The strategy, implemented on an LTQ-FT MS platform, allowed the specific sequencing of 2,000 features per analysis and enabled the identification of more than 1,600 phosphorylation sites using a single reversed phase separation dimension without the need for time-consuming prefractionation steps. Compared with conventional DDA LC-MS/MS experiments, a substantially higher number of peptides could be identified from a sample, and this increase was more pronounced for low intensity precursor ions.
MeSH Headings: Animals; Cell Line; Chromatography, High Pressure Liquid/methods; Databases, Protein; Drosophila Proteins/isolation & purification; Drosophila melanogaster; Peptides/isolation & purification; Phosphopeptides/isolation & purification; Proteomics/methods; Proteomics/statistics & numerical data; Reproducibility of Results; Tandem Mass Spectrometry/methods
Grants: N01HV28179 NHLBI NIH HHS; N01-HV-28179 NHLBI NIH HHS
73	Guan Y, Myers CL, Lu R, Lemischka IR, Bult CJ, Troyanskaya OG. A genomewide functional network for the laboratory mouse. PLoS Comput Biol 2008;4:e1000165. [PMID: 18818725 PMCID: PMC2527685 DOI: 10.1371/journal.pcbi.1000165] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.1] [Received: 04/14/2008] [Accepted: 07/21/2008] [Indexed: 11/19/2022]
Abstract: Establishing a functional network is invaluable to our understanding of gene function, pathways, and systems-level properties of an organism and can be a powerful resource in directing targeted experiments. In this study, we present a functional network for the laboratory mouse based on a Bayesian integration of diverse genetic and functional genomic data. The resulting network includes probabilistic functional linkages among 20,581 protein-coding genes. We show that this network can accurately predict novel functional assignments and network components and present experimental evidence for predictions related to Nanog homeobox (Nanog), a critical gene in mouse embryonic stem cell pluripotency. An analysis of the global topology of the mouse functional network reveals multiple biologically relevant systems-level features of the mouse proteome. Specifically, we identify the clustering coefficient as a critical characteristic of central modulators that affect diverse pathways as well as genes associated with different phenotype traits and diseases. In addition, a cross-species comparison of functional interactomes on a genomic scale revealed distinct functional characteristics of conserved neighborhoods as compared to subnetworks specific to higher organisms. Thus, our global functional network for the laboratory mouse provides the community with a key resource for discovering protein functions and novel pathway components as well as a tool for exploring systems-level topological and evolutionary features of cellular interactomes. To facilitate exploration of this network by the biomedical research community, we illustrate its application in function and disease gene discovery through an interactive, Web-based, publicly available interface at http://mouseNET.princeton.edu.
Functionally related proteins interact in diverse ways to carry out biological processes, and each protein often participates in multiple pathways. Proteins are therefore organized into a complex network through which different functions of the cell are carried out. An accurate description of such a network is invaluable to our understanding of both the system-level features of a cell and those of an individual biological process. In this study, we used a probabilistic model to combine information from diverse genome-scale studies as well as individual investigations to generate a global functional network for mouse. Our analysis of the global topology of this network reveals biologically relevant systems-level characteristics of the mouse proteome, including conservation of functional neighborhoods and network features characteristic of known disease genes and key transcriptional regulators. We have made this network publicly available for search and dynamic exploration by researchers in the community. Our Web interface enables users to easily generate hypotheses regarding potential functional roles of uncharacterized proteins, investigate possible links between their proteins of interest and disease, and identify new players in specific biological processes.
MeSH Headings: Animals; Bayes Theorem; Cell Differentiation/genetics; Cluster Analysis; Computational Biology/methods; Database Management Systems; Databases, Genetic; Down-Regulation; Gene Regulatory Networks; Genomics/statistics & numerical data; Homeodomain Proteins/genetics; Internet; MAP Kinase Signaling System/genetics; Mice; Mice, Knockout/genetics; Models, Statistical; Nanog Homeobox Protein; Phenotype; Proteomics/statistics & numerical data; Saccharomyces cerevisiae/genetics; Species Specificity; User-Computer Interface
Grants: P50 GM071508 NIGMS NIH HHS; R01 GM071966 NIGMS NIH HHS
74	Mazumder R, Vasudevan S. Structure-guided comparative analysis of proteins: principles, tools, and applications for predicting function. PLoS Comput Biol 2008;4:e1000151. [PMID: 18818720 PMCID: PMC2515338 DOI: 10.1371/journal.pcbi.1000151] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
MeSH Headings: Computational Biology/methods; Databases, Protein/statistics & numerical data; Molecular Structure; Phylogeny; Protein Conformation; Proteins/chemistry; Proteins/genetics; Proteins/physiology; Proteomics/statistics & numerical data; Sequence Alignment/statistics & numerical data
75	Kim S, Gupta N, Pevzner PA. Spectral probabilities and generating functions of tandem mass spectra: a strike against decoy databases. J Proteome Res 2008;7:3354-63. [PMID: 18597511 PMCID: PMC2689316 DOI: 10.1021/pr8001244] [Citation(s) in RCA: 332] [Impact Index Per Article: 20.8] [Indexed: 12/19/2022]
Abstract: A key problem in computational proteomics is distinguishing between correct and false peptide identifications. We argue that evaluating the error rates of peptide identifications is not unlike computing generating functions in combinatorics. We show that the generating functions and their derivatives (spectral energy and spectral probability) represent new features of tandem mass spectra that, similarly to Delta-scores, significantly improve peptide identifications. Furthermore, the spectral probability provides a rigorous solution to the problem of computing statistical significance of spectral identifications. The spectral energy/probability approach improves the sensitivity-specificity tradeoff of existing MS/MS search tools, addresses the notoriously difficult problem of "one-hit-wonders" in mass spectrometry, and often eliminates the need for decoy database searches. We therefore argue that the generating function approach has the potential to increase the number of peptide identifications in MS/MS searches.
MeSH Headings: Bacterial Proteins/analysis; Data Interpretation, Statistical; Databases, Factual/statistics & numerical data; Peptides/analysis; Probability; Proteomics/statistics & numerical data; Shewanella/chemistry; Tandem Mass Spectrometry/statistics & numerical data
Grants: R01 RR016522 NCRR NIH HHS; R01 RR016522-01A1 NCRR NIH HHS; Howard Hughes Medical Institute; 1 R01 RR 16522 NCRR NIH HHS