1
Orchard SE. What have Data Standards ever done for us? Mol Cell Proteomics 2025:100933. [PMID: 40024375 DOI: 10.1016/j.mcpro.2025.100933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2024] [Revised: 02/21/2025] [Accepted: 02/24/2025] [Indexed: 03/04/2025] Open
Abstract
The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has been successfully developing guidelines, data formats, and controlled vocabularies for both the field of molecular interaction and that of mass spectrometry for more than 20 years. This review explores some of the ways that the proteomics community has benefitted from the development of community standards and takes a look at some of the tools and resources that have been improved or developed as a result of the work of the HUPO-PSI.
Affiliation(s)
- S E Orchard
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
2
Combe CW, Kolbowski L, Fischer L, Koskinen V, Klein J, Leitner A, Jones AR, Vizcaíno JA, Rappsilber J. mzIdentML 1.3.0 - Essential progress on the support of crosslinking and other identifications based on multiple spectra. Proteomics 2024; 24:e2300385. [PMID: 39001627 DOI: 10.1002/pmic.202300385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 02/07/2024] [Accepted: 02/09/2024] [Indexed: 10/10/2024]
Abstract
The mzIdentML data format, originally developed by the Proteomics Standards Initiative in 2011, is the open XML data standard for peptide and protein identification results coming from mass spectrometry. We present mzIdentML version 1.3.0, which introduces new functionality and support for additional use cases. First of all, a new mechanism for encoding identifications based on multiple spectra has been introduced. Furthermore, the main mzIdentML specification document can now be supplemented by extension documents which provide further guidance for encoding specific use cases for different proteomics subfields. One extension document has been added, covering additional use cases for the encoding of crosslinked peptide identifications. The ability to add extension documents facilitates keeping the mzIdentML standard up to date with advances in the proteomics field, without having to change the main specification document. The crosslinking extension document provides further explanation of the crosslinking use cases already supported in mzIdentML version 1.2.0, and provides support for encoding additional scenarios that are critical to reflect developments in the crosslinking field and facilitate its integration in structural biology. These are: (i) support for cleavable crosslinkers, (ii) support for internally linked peptides, (iii) support for noncovalently associated peptides, and (iv) improved support for encoding scores and the corresponding thresholds.
Affiliation(s)
- Colin W Combe
  - Wellcome Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
  - Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
- Lars Kolbowski
  - Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
- Lutz Fischer
  - Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
- Joshua Klein
  - Program for Bioinformatics, Boston University, Boston, Massachusetts, USA
- Alexander Leitner
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zurich, Switzerland
- Andrew R Jones
  - Department of Biochemistry & Systems Biology, University of Liverpool, Liverpool, UK
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Juri Rappsilber
  - Wellcome Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
  - Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
3
Panni S, Panneerselvam K, Porras P, Duesbury M, Perfetto L, Licata L, Hermjakob H, Orchard S. The landscape of microRNA interaction annotation: analysis of three rare disorders as a case study. Database (Oxford) 2023; 2023:baad066. [PMID: 37819683 PMCID: PMC10566539 DOI: 10.1093/database/baad066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 08/29/2023] [Accepted: 09/15/2023] [Indexed: 10/13/2023]
Abstract
In recent years, a huge amount of data on ncRNA interactions has been described in scientific papers and databases. Although considerable effort has been made to annotate the available knowledge in public repositories, there are still significant discrepancies in how different resources capture and interpret data on ncRNA functional and physical associations. In the present paper, we present a collection of microRNA-mRNA interactions annotated from the scientific literature following recognized standard criteria and focused on microRNAs, which regulate genes associated with rare diseases as a case study. The list of protein-coding genes with a known role in specific rare diseases was retrieved from the Genomics England PanelApp, and associated microRNA-mRNA interactions were annotated in the IntAct database and compared with other datasets. RNAcentral identifiers were used for unambiguous, stable identification of ncRNAs. The information about the interaction was enhanced by a detailed description of the cell types and experimental conditions, providing a computer-interpretable summary of the published data, integrated with the huge amount of protein interactions already gathered in the database. Furthermore, for each interaction, the binding sites of the microRNA are precisely mapped on a well-defined mRNA transcript of the target gene. This information is crucial to conceive and design optimal microRNA mimics or inhibitors to interfere in vivo with a deregulated process. As these approaches become more feasible, high-quality, reliable networks of microRNA interactions are needed to help, for instance, in the selection of the best target to be inhibited and to predict potential secondary off-target effects. Database URL: https://www.ebi.ac.uk/intact.
Affiliation(s)
- Simona Panni
  - Dipartimento di Biologia Ecologia e Scienze della Terra, Università della Calabria, Rende 87036, Italy
- Kalpana Panneerselvam
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus Hinxton, Cambridge CB10 1SD, UK
- Pablo Porras
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus Hinxton, Cambridge CB10 1SD, UK
  - Astra Zeneca, Data Office, Data Science and AI, UK Academy House, 136 Hills Road, Cambridge CB2 8PA, UK
- Margaret Duesbury
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus Hinxton, Cambridge CB10 1SD, UK
- Livia Perfetto
  - Department of Biology and Biotechnologies “Charles Darwin”, La Sapienza University, Rome, Italy
- Luana Licata
  - Department of Biology, University of Tor Vergata, Rome, Italy
- Henning Hermjakob
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus Hinxton, Cambridge CB10 1SD, UK
- Sandra Orchard
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus Hinxton, Cambridge CB10 1SD, UK
4
Caliskan A, Dangwal S, Dandekar T. Metadata integrity in bioinformatics: Bridging the gap between data and knowledge. Comput Struct Biotechnol J 2023; 21:4895-4913. [PMID: 37860229 PMCID: PMC10582761 DOI: 10.1016/j.csbj.2023.10.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/02/2023] [Revised: 10/04/2023] [Accepted: 10/04/2023] [Indexed: 10/21/2023] Open
Abstract
In the fast-evolving landscape of biomedical research, the emergence of big data has presented researchers with extraordinary opportunities to explore biological complexities. In biomedical research, big data imply also a big responsibility. This is not only due to genomics data being sensitive information but also due to genomics data being shared and re-analysed among the scientific community. This saves valuable resources and can even help to find new insights in silico. To fully use these opportunities, detailed and correct metadata are imperative. This includes not only the availability of metadata but also their correctness. Metadata integrity serves as a fundamental determinant of research credibility, supporting the reliability and reproducibility of data-driven findings. Ensuring metadata availability, curation, and accuracy are therefore essential for bioinformatic research. Not only must metadata be readily available, but they must also be meticulously curated and ideally error-free. Motivated by an accidental discovery of a critical metadata error in patient data published in two high-impact journals, we aim to raise awareness for the need of correct, complete, and curated metadata. We describe how the metadata error was found, addressed, and present examples for metadata-related challenges in omics research, along with supporting measures, including tools for checking metadata and software to facilitate various steps from data analysis to published research.
Affiliation(s)
- Aylin Caliskan
  - Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
- Seema Dangwal
  - Stanford Cardiovascular Institute, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305-5101, United States
- Thomas Dandekar
  - Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
5
Naake T, Rainer J, Huber W. MsQuality: an interoperable open-source package for the calculation of standardized quality metrics of mass spectrometry data. Bioinformatics 2023; 39:btad618. [PMID: 37812234 PMCID: PMC10580266 DOI: 10.1093/bioinformatics/btad618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 09/08/2023] [Accepted: 10/06/2023] [Indexed: 10/10/2023] Open
Abstract
MOTIVATION Multiple factors can impact accuracy and reproducibility of mass spectrometry data. There is a need to integrate quality assessment and control into data analytic workflows. RESULTS The MsQuality package calculates 43 low-level quality metrics based on the controlled mzQC vocabulary defined by the HUPO-PSI on a single mass spectrometry-based measurement of a sample. It helps to identify low-quality measurements and track data quality. Its use of community-standard quality metrics facilitates comparability of quality assessment and control (QA/QC) criteria across datasets. AVAILABILITY AND IMPLEMENTATION The R package MsQuality is available through Bioconductor at https://bioconductor.org/packages/MsQuality.
Affiliation(s)
- Thomas Naake
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Johannes Rainer
  - Institute for Biomedicine (Affiliated to the University of Lübeck), Eurac Research, Bolzano 39100, Italy
- Wolfgang Huber
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
6
Deutsch EW, Vizcaíno JA, Jones AR, Binz PA, Lam H, Klein J, Bittremieux W, Perez-Riverol Y, Tabb DL, Walzer M, Ricard-Blum S, Hermjakob H, Neumann S, Mak TD, Kawano S, Mendoza L, Van Den Bossche T, Gabriels R, Bandeira N, Carver J, Pullman B, Sun Z, Hoffmann N, Shofstahl J, Zhu Y, Licata L, Quaglia F, Tosatto SCE, Orchard SE. Proteomics Standards Initiative at Twenty Years: Current Activities and Future Work. J Proteome Res 2023; 22:287-301. [PMID: 36626722 PMCID: PMC9903322 DOI: 10.1021/acs.jproteome.2c00637] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 10/07/2022] [Indexed: 01/11/2023]
Abstract
The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has been successfully developing guidelines, data formats, and controlled vocabularies (CVs) for the proteomics community and other fields supported by mass spectrometry since its inception 20 years ago. Here we describe the general operation of the PSI, including its leadership, working groups, yearly workshops, and the document process by which proposals are thoroughly and publicly reviewed in order to be ratified as PSI standards. We briefly describe the current state of the many existing PSI standards, some of which remain the same as when originally developed, some of which have undergone subsequent revisions, and some of which have become obsolete. Then the set of proposals currently being developed are described, with an open call to the community for participation in the forging of the next generation of standards. Finally, we describe some synergies and collaborations with other organizations and look to the future in how the PSI will continue to promote the open sharing of data and thus accelerate the progress of the field of proteomics.
Affiliation(s)
- Eric W. Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Andrew R. Jones
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Pierre-Alain Binz
  - Clinical Chemistry Service, Lausanne University Hospital, 1011 976 Lausanne, Switzerland
- Henry Lam
  - Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, P. R. China
- Joshua Klein
  - Program for Bioinformatics, Boston University, Boston, Massachusetts 02215, United States
- Wout Bittremieux
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
  - Department of Computer Science, University of Antwerp, 2020 Antwerpen, Belgium
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- David L. Tabb
  - SA MRC Centre for TB Research, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7602, South Africa
- Mathias Walzer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Sylvie Ricard-Blum
  - Univ. Lyon, Université Lyon 1, ICBMS, UMR 5246, 69622 Villeurbanne, France
- Henning Hermjakob
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Steffen Neumann
  - Bioinformatics and Scientific Data, Leibniz Institute of Plant Biochemistry, 06120 Halle, Germany
  - German Centre for Integrative Biodiversity Research (iDiv), 04103 Halle-Jena-Leipzig, Germany
- Tytus D. Mak
  - Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Shin Kawano
  - Database Center for Life Science, Joint Support Center for Data Science Research, Research Organization of Information and Systems, Chiba 277-0871, Japan
  - Faculty of Contemporary Society, Toyama University of International Studies, Toyama 930-1292, Japan
  - School of Frontier Engineering, Kitasato University, Sagamihara 252-0373, Japan
- Luis Mendoza
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Tim Van Den Bossche
  - VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9052 Ghent, Belgium
- Ralf Gabriels
  - VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9052 Ghent, Belgium
- Nuno Bandeira
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
  - Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
- Jeremy Carver
  - Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
- Benjamin Pullman
  - Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
- Zhi Sun
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Nils Hoffmann
  - Institute for Bio- and Geosciences (IBG-5), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
- Jim Shofstahl
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Yunping Zhu
  - National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, #38, Life Science Park, Changping District, Beijing 102206, China
- Luana Licata
  - Fondazione Human Technopole, 20157 Milan, Italy
  - Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
- Federica Quaglia
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), 70126 Bari, Italy
  - Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
- Sandra E. Orchard
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
7
Jones AR, Deutsch EW, Vizcaíno JA. Is DIA proteomics data FAIR? Current data sharing practices, available bioinformatics infrastructure and recommendations for the future. Proteomics 2022; 23:e2200014. [PMID: 36074795 PMCID: PMC10155627 DOI: 10.1002/pmic.202200014] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 08/27/2022] [Accepted: 08/29/2022] [Indexed: 11/06/2022]
Abstract
Data independent acquisition (DIA) proteomics techniques have matured enormously in recent years, thanks to multiple technical developments in e.g. instrumentation and data analysis approaches. However, there are many improvements that are still possible for DIA data in the area of the FAIR (Findability, Accessibility, Interoperability and Reusability) data principles. These include more tailored data sharing practices and open data standards, since public databases and data standards for proteomics were mostly designed with DDA data in mind. Here we first describe the current state of the art in the context of FAIR data for proteomics in general, and for DIA approaches in particular. For improving the current situation for DIA data, we make the following recommendations for the future: (i) development of an open data standard for spectral libraries; (ii) make mandatory the availability of the spectral libraries used in DIA experiments in ProteomeXchange resources; (iii) improve the support for DIA data in the data standards developed by the Proteomics Standards Initiative; and (iv) improve the support for DIA datasets in ProteomeXchange resources, including more tailored metadata requirements.
Affiliation(s)
- Andrew R Jones
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 3BX, UK
- Eric W Deutsch
  - Institute for Systems Biology, Seattle, Washington, 98109, USA
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, UK
8
Hoffmann N, Mayer G, Has C, Kopczynski D, Al Machot F, Schwudke D, Ahrends R, Marcus K, Eisenacher M, Turewicz M. A Current Encyclopedia of Bioinformatics Tools, Data Formats and Resources for Mass Spectrometry Lipidomics. Metabolites 2022; 12:584. [PMID: 35888710 PMCID: PMC9319858 DOI: 10.3390/metabo12070584] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/25/2022] [Revised: 06/17/2022] [Accepted: 06/19/2022] [Indexed: 12/13/2022] Open
Abstract
Mass spectrometry is a widely used technology to identify and quantify biomolecules such as lipids, metabolites and proteins necessary for biomedical research. In this study, we catalogued freely available software tools, libraries, databases, repositories and resources that support lipidomics data analysis and determined the scope of currently used analytical technologies. Because of the tremendous importance of data interoperability, we assessed the support of standardized data formats in mass spectrometric (MS)-based lipidomics workflows. We included tools in our comparison that support targeted as well as untargeted analysis using direct infusion/shotgun (DI-MS), liquid chromatography-mass spectrometry, ion mobility or MS imaging approaches on MS1 and potentially higher MS levels. As a result, we determined that the Human Proteome Organization-Proteomics Standards Initiative standard data formats, mzML and mzTab-M, are already supported by a substantial number of recent software tools. We further discuss how mzTab-M can serve as a bridge between data acquisition and lipid bioinformatics tools for interpretation, capturing their output and transmitting rich annotated data for downstream processing. However, we identified several challenges of currently available tools and standards. Potential areas for improvement were: adaptation of common nomenclature and standardized reporting to enable high throughput lipidomics and improve its data handling. Finally, we suggest specific areas where tools and repositories need to improve to become FAIRer.
Affiliation(s)
- Nils Hoffmann
  - Forschungszentrum Jülich GmbH, Institute for Bio- and Geosciences (IBG-5), 52425 Jülich, Germany
- Gerhard Mayer
  - Institute of Medical Systems Biology, Ulm University, 89081 Ulm, Germany
- Canan Has
  - Biological Mass Spectrometry, Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
  - University Hospital Carl Gustav Carus, 01307 Dresden, Germany
  - CENTOGENE GmbH, 18055 Rostock, Germany
- Dominik Kopczynski
  - Department of Analytical Chemistry, University of Vienna, 1090 Vienna, Austria
- Fadi Al Machot
  - Faculty of Science and Technology, Norwegian University for Life Science (NMBU), 1433 Ås, Norway
- Dominik Schwudke
  - Bioanalytical Chemistry, Forschungszentrum Borstel, Leibniz Lung Center, 23845 Borstel, Germany
  - Airway Research Center North, German Center for Lung Research (DZL), 23845 Borstel, Germany
  - German Center for Infection Research (DZIF), TTU Tuberculosis, 23845 Borstel, Germany
- Robert Ahrends
  - Department of Analytical Chemistry, University of Vienna, 1090 Vienna, Austria
- Katrin Marcus
  - Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Ruhr University Bochum, 44801 Bochum, Germany
- Martin Eisenacher
  - Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Ruhr University Bochum, 44801 Bochum, Germany
  - Faculty of Medicine, Medizinisches Proteom-Center, Ruhr University Bochum, 44801 Bochum, Germany
- Michael Turewicz
  - Institute for Clinical Biochemistry and Pathobiochemistry, German Diabetes Center (DDZ), Leibniz Center for Diabetes Research at Heinrich-Heine-University Düsseldorf, 40225 Düsseldorf, Germany
  - German Center for Diabetes Research (DZD), Partner Düsseldorf, 85764 Neuherberg, Germany
9
LeDuc RD, Deutsch EW, Binz PA, Fellers RT, Cesnik AJ, Klein JA, Van Den Bossche T, Gabriels R, Yalavarthi A, Perez-Riverol Y, Carver J, Bittremieux W, Kawano S, Pullman B, Bandeira N, Kelleher NL, Thomas PM, Vizcaíno JA. Proteomics Standards Initiative's ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms. J Proteome Res 2022; 21:1189-1195. [PMID: 35290070 PMCID: PMC7612572 DOI: 10.1021/acs.jproteome.1c00771] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 11/30/2022]
Abstract
It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization Proteomics Standards Initiative in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP) has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical) and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.
Affiliation(s)
- Richard D LeDuc
  - National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
- Eric W Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Pierre-Alain Binz
  - Clinical Chemistry Service, Lausanne University Hospital, 1011 Lausanne, Switzerland
- Ryan T Fellers
  - National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
- Anthony J Cesnik
  - Department of Genetics, Stanford University, Stanford, California 94305, United States
  - Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, California 94158, United States
  - SciLifeLab, School of Engineering Sciences in Chemistry Biotechnology and Health, KTH-Royal Institute of Technology, SE-171 21 Solna, Stockholm, Sweden 113 51
- Joshua A Klein
  - Program for Bioinformatics, Boston University, Boston, Massachusetts 02215, United States
- Tim Van Den Bossche
  - VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark 75-FSVM II, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Ralf Gabriels
  - VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark 75-FSVM II, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Arshika Yalavarthi
  - National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
- Shin Kawano
  - Toyama University of International Studies, Toyama, 930-1292 Toyama, Higashikuromaki, 6 5-1, Japan
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Chiba 277-0871, Japan
- Neil L Kelleher
  - National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
- Paul M Thomas
  - National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
10
Strömert P, Hunold J, Castro A, Neumann S, Koepler O. Ontologies4Chem: the landscape of ontologies in chemistry. Pure Appl Chem 2022. [DOI: 10.1515/pac-2021-2007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
For a long time, databases such as CAS, Reaxys, PubChem or ChemSpider mostly rely on unique numerical identifiers or chemical structure identifiers like InChI, SMILES or others to link data across heterogeneous data sources. The retrospective processing of information and fragmented data from text publications to maintain these databases is a cumbersome process. Ontologies are a holistic approach to semantically describe data, information and knowledge of a domain. They provide terms, relations and logic to semantically annotate and link data building knowledge graphs. The application of standard taxonomies and vocabularies from the very beginning of data generation and along research workflows in electronic lab notebooks (ELNs), software tools, and their final publication in data repositories create FAIR data straightforwardly. Thus a proper semantic description of an investigation and the why, how, where, when, and by whom data was produced in conjunction with the description and representation of research data is a natural outcome in contrast to the retrospective processing of research publications as we know it. In this work we provide an overview of ontologies in chemistry suitable to represent concepts of research and research data. These ontologies are evaluated against several criteria derived from the FAIR data principles and their possible application in the digitisation of research data management workflows.
Affiliation(s)
- Philip Strömert
- TIB – Leibniz Information Centre for Science and Technology , Welfengarten 1 B, 30167 Hannover , Germany
| | - Johannes Hunold
- TIB – Leibniz Information Centre for Science and Technology , Welfengarten 1 B, 30167 Hannover , Germany
| | - André Castro
- TIB – Leibniz Information Centre for Science and Technology , Welfengarten 1 B, 30167 Hannover , Germany
| | - Steffen Neumann
- Leibniz Institute of Plant Biochemistry , Weinberg 3 , 06120 Halle , Germany
| | - Oliver Koepler
- TIB – Leibniz Information Centre for Science and Technology , Welfengarten 1 B, 30167 Hannover , Germany
| |
|
11
|
Touré V, Zobolas J, Kuiper M, Vercruysse S. CausalBuilder: bringing the MI2CAST causal interaction annotation standard to the curator. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021; 2021:6129748. [PMID: 33547799 PMCID: PMC7904049 DOI: 10.1093/database/baaa107] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/27/2020] [Revised: 11/16/2020] [Accepted: 12/07/2020] [Indexed: 12/23/2022]
Abstract
Molecular causal interactions are defined as regulatory connections between biological components. They are commonly retrieved from biological experiments and can be used for connecting biological molecules together to enable the building of regulatory computational models that represent biological systems. However, including a molecular causal interaction in a model requires assessing its relevance to that model, based on the detailed knowledge about the biomolecules, interaction type and biological context. In order to standardize the representation of this knowledge in 'causal statements', we recently developed the Minimum Information about a Molecular Interaction Causal Statement (MI2CAST) guidelines. Here, we introduce causalBuilder: an intuitive web-based curation interface for the annotation of molecular causal interactions that comply with the MI2CAST standard. The causalBuilder prototype essentially embeds the MI2CAST curation guidelines in its interface and makes its rules easy to follow by a curator. In addition, causalBuilder serves as an original application of the Visual Syntax Method general-purpose curation technology and provides both curators and tool developers with an interface that can be fully configured to allow focusing on selected MI2CAST concepts to annotate. After the information is entered, the causalBuilder prototype produces genuine causal statements that can be exported in different formats.
Affiliation(s)
- Vasundra Touré
- Department of Biology, Norwegian University of Science and Technology (NTNU), Høgskoleringen 5, 7491 Trondheim, Norway
| | - John Zobolas
- Department of Biology, Norwegian University of Science and Technology (NTNU), Høgskoleringen 5, 7491 Trondheim, Norway
| | - Martin Kuiper
- Department of Biology, Norwegian University of Science and Technology (NTNU), Høgskoleringen 5, 7491 Trondheim, Norway
| | - Steven Vercruysse
- Department of Biology, Norwegian University of Science and Technology (NTNU), Høgskoleringen 5, 7491 Trondheim, Norway
| |
|
12
|
Kanza S, Graham Frey J. Semantic Technologies in Drug Discovery. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11520-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022] Open
|
13
|
Ma J, Chen T, Wu S, Yang C, Bai M, Shu K, Li K, Zhang G, Jin Z, He F, Hermjakob H, Zhu Y. iProX: an integrated proteome resource. Nucleic Acids Res 2019; 47:D1211-D1217. [PMID: 30252093 PMCID: PMC6323926 DOI: 10.1093/nar/gky869] [Citation(s) in RCA: 1153] [Impact Index Per Article: 230.6] [Received: 08/14/2018] [Accepted: 09/14/2018] [Indexed: 11/13/2022] Open
Abstract
Sharing of research data in public repositories has become best practice in academia. With the accumulation of massive data, network bandwidth and storage requirements are rapidly increasing. The ProteomeXchange (PX) consortium implements a mode of centralized metadata and distributed raw data management, which promotes effective data sharing. To facilitate open access of proteome data worldwide, we have developed the integrated proteome resource iProX (http://www.iprox.org) as a public platform for collecting and sharing raw data, analysis results and metadata obtained from proteomics experiments. The iProX repository employs a web-based proteome data submission process and open sharing of mass spectrometry-based proteomics datasets. Also, it deploys extensive controlled vocabularies and ontologies to annotate proteomics datasets. Users can use a GUI to provide and access data through a fast Aspera-based transfer tool. iProX is a full member of the PX consortium; all released datasets are freely accessible to the public. iProX is based on a high availability architecture and has been deployed as part of the proteomics infrastructure of China, ensuring long-term and stable resource support. iProX will facilitate worldwide data analysis and sharing of proteomics experiments.
Affiliation(s)
- Jie Ma
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
| | - Tao Chen
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
| | - Songfeng Wu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
| | - Chunyuan Yang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
| | - Mingze Bai
- Chongqing University of Posts and Telecommunications, Chongqing 400065, China
| | - Kunxian Shu
- Chongqing University of Posts and Telecommunications, Chongqing 400065, China
| | - Kenli Li
- National Supercomputing Center in Changsha, Hunan University, Changsha 410082, China
| | - Guoqing Zhang
- Shanghai Center for Bioinformation Technology, Shanghai Institutes of Biomedicine, Shanghai Academy of Science and Technology, Shanghai 200235, China
| | - Zhong Jin
- Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
| | - Fuchu He
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
| | - Henning Hermjakob
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China.,European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Yunping Zhu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
| |
|
14
|
Chen L, Clark JZ, Nelson JW, Kaissling B, Ellison DH, Knepper MA. Renal-Tubule Epithelial Cell Nomenclature for Single-Cell RNA-Sequencing Studies. J Am Soc Nephrol 2019; 30:1358-1364. [PMID: 31253652 DOI: 10.1681/asn.2019040415] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Indexed: 01/04/2023] Open
Affiliation(s)
- Lihe Chen
- Epithelial Systems Biology Laboratory, Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
| | - Jevin Z Clark
- Epithelial Systems Biology Laboratory, Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
| | - Jonathan W Nelson
- Division of Nephrology and Hypertension, Oregon Health & Science University, Portland, Oregon
| | | | - David H Ellison
- Division of Nephrology and Hypertension, Oregon Health & Science University, Portland, Oregon
| | - Mark A Knepper
- Epithelial Systems Biology Laboratory, Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
| |
|
15
|
Binz PA, Shofstahl J, Vizcaíno JA, Barsnes H, Chalkley RJ, Menschaert G, Alpi E, Clauser K, Eng JK, Lane L, Seymour SL, Sánchez LFH, Mayer G, Eisenacher M, Perez-Riverol Y, Kapp EA, Mendoza L, Baker PR, Collins A, Van Den Bossche T, Deutsch EW. Proteomics Standards Initiative Extended FASTA Format. J Proteome Res 2019; 18:2686-2692. [PMID: 31081335 DOI: 10.1021/acs.jproteome.9b00064] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
Abstract
Mass-spectrometry-based proteomics enables the high-throughput identification and quantification of proteins, including sequence variants and post-translational modifications (PTMs) in biological samples. However, most workflows require that such variations be included in the search space used to analyze the data, and doing so remains challenging with most analysis tools. In order to facilitate the search for known sequence variants and PTMs, the Proteomics Standards Initiative (PSI) has designed and implemented the PSI extended FASTA format (PEFF). PEFF is based on the very popular FASTA format but adds a uniform mechanism for encoding substantially more metadata about the sequence collection as well as individual entries, including support for encoding known sequence variants, PTMs, and proteoforms. The format is very nearly backward compatible, and as such, existing FASTA parsers will require little or no changes to be able to read PEFF files as FASTA files, although without supporting any of the extra capabilities of PEFF. PEFF is defined by a full specification document, controlled vocabulary terms, a set of example files, software libraries, and a file validator. Popular software and resources are starting to support PEFF, including the sequence search engine Comet and the knowledge bases neXtProt and UniProtKB. Widespread implementation of PEFF is expected to further enable proteogenomics and top-down proteomics applications by providing a standardized mechanism for encoding protein sequences and their known variations. All the related documentation, including the detailed file format specification and example files, are available at http://www.psidev.info/peff .
Affiliation(s)
- Pierre-Alain Binz
- CHUV Centre Hospitalier Universitaire Vaudois , CH-1011 Lausanne 14 , Switzerland
| | - Jim Shofstahl
- Thermo Fisher Scientific , 355 River Oaks Parkway , San Jose , California 95134 , United States
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory , European Bioinformatics Institute (EMBL-EBI) , Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD , United Kingdom
| | - Harald Barsnes
- Proteomics Unit, Department of Biomedicine , University of Bergen , N-5009 Bergen , Norway.,Computational Biology Unit, Department of Informatics , University of Bergen , N-5008 Bergen , Norway
| | - Robert J Chalkley
- University California at San Francisco , San Francisco , California 94143 , United States
| | - Gerben Menschaert
- Biobix, Department of Data Analysis and Mathematical Modelling , Ghent University , 9000 Ghent , Belgium
| | - Emanuele Alpi
- European Molecular Biology Laboratory , European Bioinformatics Institute (EMBL-EBI) , Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD , United Kingdom
| | - Karl Clauser
- Broad Institute , Cambridge , Massachusetts 02142 , United States
| | - Jimmy K Eng
- University of Washington , Seattle , Washington 98195 , United States
| | - Lydie Lane
- SIB Swiss Institute of Bioinformatics , CH-1211 Geneva 4 , Switzerland.,Department of Microbiology and Molecular Medicine, Faculty of Medicine , University of Geneva , CH-1211 Geneva 4 , Switzerland
| | - Sean L Seymour
- Seymour Data Science, LLC , San Francisco , California 95000 , United States
| | - Luis Francisco Hernández Sánchez
- K.G. Jebsen Center for Diabetes Research, Department of Clinical Science , University of Bergen , 5021 Bergen , Norway.,Center for Medical Genetics and Molecular Medicine , Haukeland University Hospital , 5021 Bergen , Norway
| | - Gerhard Mayer
- Medical Faculty, Medizinisches Proteom-Center , Ruhr University Bochum , D-44801 Bochum , Germany
| | - Martin Eisenacher
- Medical Faculty, Medizinisches Proteom-Center , Ruhr University Bochum , D-44801 Bochum , Germany
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory , European Bioinformatics Institute (EMBL-EBI) , Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD , United Kingdom
| | - Eugene A Kapp
- Walter & Eliza Hall Institute of Medical Research and the University of Melbourne , Melbourne , VIC 3052 , Australia
| | - Luis Mendoza
- Institute for Systems Biology , Seattle , Washington 98109 , United States
| | - Peter R Baker
- University California at San Francisco , San Francisco , California 94143 , United States
| | - Andrew Collins
- Department of Functional and Comparative Genomics, Institute of Integrated Biology , University of Liverpool , Liverpool L69 7ZB , United Kingdom
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology , Ghent University , 9000 Ghent , Belgium
| | - Eric W Deutsch
- Institute for Systems Biology , Seattle , Washington 98109 , United States
| |
|
16
|
Klein J, Zaia J. psims - A Declarative Writer for mzML and mzIdentML for Python. Mol Cell Proteomics 2019; 18:571-575. [PMID: 30563850 PMCID: PMC6398200 DOI: 10.1074/mcp.rp118.001070] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/12/2018] [Revised: 12/12/2018] [Indexed: 01/04/2023] Open
Abstract
mzML and mzIdentML are commonly used, powerful tools for representing mass spectrometry data and derived identification information. These formats are complex, requiring non-trivial logic to translate data into the appropriate representation. Most published implementations are tightly coupled to data structures. The most complete implementations are written in compiled languages that cannot expose the complete flexibility of the implementation to external programs or bindings. To our knowledge, there are no complete implementations for mzML or mzIdentML available to scripting languages like Python or R. We present psims, a library written in Python for writing mzML and mzIdentML. The library allows writing either XML format using built-in Python data structures. It includes a controlled vocabulary resolution system to simplify the encoding process and an identity tracking system to manage entity relationships. The source code is available at https://github.com/mobiusklein/psims, and through the Python Package Index as psims, licensed under the Apache 2 common license.
Affiliation(s)
- Joshua Klein
- Program for Bioinformatics, Boston University, Boston, Massachusetts 02215
| | - Joseph Zaia
- Program for Bioinformatics, Boston University, Boston, Massachusetts 02215
- Department of Biochemistry, Boston University, Boston, Massachusetts 02118
| |
|
17
|
Sivade Dumousseau M, Koch M, Shrivastava A, Alonso-López D, De Las Rivas J, Del-Toro N, Combe CW, Meldal BHM, Heimbach J, Rappsilber J, Sullivan J, Yehudi Y, Orchard S. JAMI: a Java library for molecular interactions and data interoperability. BMC Bioinformatics 2018; 19:133. [PMID: 29642846 PMCID: PMC5896107 DOI: 10.1186/s12859-018-2119-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/10/2017] [Accepted: 03/20/2018] [Indexed: 11/22/2022] Open
Abstract
Background: A number of different molecular interactions data download formats now exist, designed to allow access to these valuable data by diverse user groups. These formats include the PSI-XML and MITAB standard interchange formats developed by the Molecular Interaction workgroup of the HUPO-PSI in addition to other, use-specific downloads produced by other resources. The onus is currently on the user to ensure that a piece of software is capable of reading/writing all necessary versions of each format. This problem may increase, as data providers strive to meet ever more sophisticated user demands and data types. Results: A collaboration between EMBL-EBI and the University of Cambridge has produced JAMI, a single library to unify standard molecular interaction data formats such as PSI-MI XML and PSI-MITAB. The JAMI free, open-source library enables the development of molecular interaction computational tools and pipelines without the need to produce different versions of software to read different versions of the data formats. Conclusion: Software and tools developed on top of the JAMI framework are able to integrate and support both PSI-MI XML and PSI-MITAB. The use of JAMI avoids the requirement to chain conversions between formats in order to reach a desired output format and prevents code and unit test duplication as the code becomes more modular. JAMI’s model interfaces are abstracted from the underlying format, hiding the complexity and requirements of each data format from developers using JAMI as a library.
Affiliation(s)
- M Sivade Dumousseau
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - M Koch
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - A Shrivastava
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - D Alonso-López
- Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Científicas (CSIC) and Universidad de Salamanca (USAL), 37007, Salamanca, Spain
| | - J De Las Rivas
- Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Científicas (CSIC) and Universidad de Salamanca (USAL), 37007, Salamanca, Spain
| | - N Del-Toro
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - C W Combe
- Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3BF, UK
| | - B H M Meldal
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - J Heimbach
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, UK.,Department of Genetics, University of Cambridge, Cambridge, UK
| | - J Rappsilber
- Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3BF, UK.,Bioanalytics, Institute for Biotechnology, Technische Universität Berlin, 13355, Berlin, Germany
| | - J Sullivan
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, UK.,Department of Genetics, University of Cambridge, Cambridge, UK
| | - Y Yehudi
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge, UK.,Department of Genetics, University of Cambridge, Cambridge, UK
| | - S Orchard
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK.
| |
|
18
|
Deutsch EW, Orchard S, Binz PA, Bittremieux W, Eisenacher M, Hermjakob H, Kawano S, Lam H, Mayer G, Menschaert G, Perez-Riverol Y, Salek RM, Tabb DL, Tenzer S, Vizcaíno JA, Walzer M, Jones AR. Proteomics Standards Initiative: Fifteen Years of Progress and Future Work. J Proteome Res 2017; 16:4288-4298. [PMID: 28849660 PMCID: PMC5715286 DOI: 10.1021/acs.jproteome.7b00370] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 06/02/2017] [Indexed: 12/21/2022]
Abstract
The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, cochairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Furthermore, new standards are currently either in the final stages of completion (proBed and proBAM for proteogenomics results as well as PEFF) or in early stages of design (a spectral library standard format, a universal spectrum identifier, the qcML quality control format, and the Protein Expression Interface (PROXI) web services Application Programming Interface). In this work we review the current status of all of these aspects of the PSI, describe synergies with other efforts such as the ProteomeXchange Consortium, the Human Proteome Project, and the metabolomics community, and provide a look at future directions of the PSI.
Affiliation(s)
- Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Pierre-Alain Binz
- CHUV Centre Hospitalier Universitaire Vaudois, 1011 Lausanne, Switzerland
| | - Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerp, Belgium
| | - Martin Eisenacher
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, National Center for Protein Sciences, Beijing, Beijing 102206, China
| | - Shin Kawano
- Database Center for Life Science, Joint Support Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Chiba 277-0871, Japan
| | - Henry Lam
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
- Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
| | - Gerhard Mayer
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
| | - Gerben Menschaert
- Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Ghent University, 9000 Ghent, Belgium
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Reza M. Salek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - David L. Tabb
- SA MRC Centre for TB Research, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
| | - Stefan Tenzer
- Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Andrew R. Jones
- Institute of Integrative Biology, University of Liverpool, South Wirral L64 4AY, United Kingdom
| |
|
19
|
Askenazi M, Ben Hamidane H, Graumann J. The arc of Mass Spectrometry Exchange Formats is long, but it bends toward HDF5. MASS SPECTROMETRY REVIEWS 2017; 36:668-673. [PMID: 27741559 PMCID: PMC6088231 DOI: 10.1002/mas.21522] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/02/2015] [Revised: 06/28/2016] [Accepted: 08/18/2016] [Indexed: 05/18/2023]
Abstract
The evolution of data exchange in Mass Spectrometry spans decades and has ranged from human-readable text files representing individual scans or collections thereof (McDonald et al., 2004) through the official standard XML-based (Harold, Means, & Udemadu, 2005) data interchange standard (Deutsch, 2012), to increasingly compressed (Teleman et al., 2014) variants of this standard sometimes requiring purely binary adjunct files (Römpp et al., 2011). While the desire to maintain even partial human readability is understandable, the inherent mismatch between XML's textual and irregular format relative to the numeric and highly regular nature of actual spectral data, along with the explosive growth in dataset scales and the resulting need for efficient (binary and indexed) access has led to a phenomenon referred to as "technical drift" (Davis, 2013). While the drift is being continuously corrected using adjunct formats, compression schemes, and programs (Röst et al., 2015), we propose that the future of Mass Spectrometry Exchange Formats lies in the continued reliance and development of the PSI-MS (Mayer et al., 2014) controlled vocabulary, along with an expedited shift to an alternative, thriving and well-supported ecosystem for scientific data-exchange, storage, and access in binary form, namely that of HDF5 (Koranne, 2011). Indeed, pioneering efforts to leverage this universal, binary, and hierarchical data-format have already been published (Wilhelm et al., 2012; Rübel et al., 2013) though they have under-utilized self-description, a key property shared by HDF5 and XML. We demonstrate that a straightforward usage of plain ("vanilla") HDF5 yields immediate returns including, but not limited to, highly efficient data access, platform independent data viewers, a variety of libraries (Collette, 2014) for data retrieval and manipulation in many programming languages and remote data access through comprehensive RESTful data-servers. © 2016 Wiley Periodicals, Inc. 
Mass Spec Rev 36:668-673, 2017.
|
20
|
Bittremieux W, Walzer M, Tenzer S, Zhu W, Salek RM, Eisenacher M, Tabb DL. The Human Proteome Organization-Proteomics Standards Initiative Quality Control Working Group: Making Quality Control More Accessible for Biological Mass Spectrometry. Anal Chem 2017; 89:4474-4479. [PMID: 28318237 DOI: 10.1021/acs.analchem.6b04310] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/23/2023]
Abstract
To have confidence in results acquired during biological mass spectrometry experiments, a systematic approach to quality control is of vital importance. Nonetheless, until now, only scattered initiatives have been undertaken to this end, and these individual efforts have often not been complementary. To address this issue, the Human Proteome Organization-Proteomics Standards Initiative has established a new working group on quality control at its meeting in the spring of 2016. The goal of this working group is to provide a unifying framework for quality control data. The initial focus will be on providing a community-driven standardized file format for quality control. For this purpose, the previously proposed qcML format will be adapted to support a variety of use cases for both proteomics and metabolomics applications, and it will be established as an official PSI format. An important consideration is to avoid enforcing restrictive requirements on quality control but instead provide the basic technical necessities required to support extensive quality control for any type of mass spectrometry-based workflow. We want to emphasize that this is an open community effort, and we seek participation from all scientists with an interest in this field.
Affiliation(s)
- Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp , Middelheimlaan 1, 2020 Antwerp, Belgium.,Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital , Wilrijkstraat 10, 2650 Edegem, Belgium
| | - Mathias Walzer
- Department of Computer Science, University of Tübingen , Tübingen 72076, Germany.,Center for Bioinformatics, University of Tübingen , Tübingen 72074, Germany
| | - Stefan Tenzer
- Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz D 55131, Germany
| | - Weimin Zhu
- National Center for Protein Science , No. 38, Science Park Road, Changping District, Beijing 102206, China
| | - Reza M Salek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI) , Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Martin Eisenacher
- Medical Bioinformatics, Medizinisches Proteom-Center, Ruhr-University Bochum , Bochum 44801, Germany
| | - David L Tabb
- Division of Molecular Biology and Human Genetics, Stellenbosch University Faculty of Medicine and Health Sciences , Tygerberg Hospital, Francie Van Zijl Drive, Cape Town 7505, South Africa
| |
|
21
|
Perez-Riverol Y, Alpi E, Wang R, Hermjakob H, Vizcaíno JA. Making proteomics data accessible and reusable: current state of proteomics databases and repositories. Proteomics 2015; 15:930-49. [PMID: 25158685 PMCID: PMC4409848 DOI: 10.1002/pmic.201400302] [Citation(s) in RCA: 141] [Impact Index Per Article: 14.1] [Received: 07/01/2014] [Revised: 08/06/2014] [Accepted: 08/22/2014] [Indexed: 01/10/2023]
Abstract
Compared to other data-intensive disciplines such as genomics, public deposition and storage of MS-based proteomics data are still less developed due to, among other reasons, the inherent complexity of the data and the variety of data types and experimental workflows. In order to address this need, several public repositories for MS proteomics experiments have been developed, each with different purposes in mind. The most established resources are the Global Proteome Machine Database (GPMDB), PeptideAtlas, and the PRIDE database. Additionally, there are other useful (in many cases recently developed) resources such as ProteomicsDB, Mass Spectrometry Interactive Virtual Environment (MassIVE), Chorus, MaxQB, PeptideAtlas SRM Experiment Library (PASSEL), Model Organism Protein Expression Database (MOPED), and the Human Proteinpedia. In addition, the ProteomeXchange consortium has been recently developed to enable better integration of public repositories and the coordinated sharing of proteomics information, maximizing its benefit to the scientific community. Here, we will review each of the major proteomics resources independently and some tools that enable the integration, mining and reuse of the data. We will also discuss some of the major challenges and current pitfalls in the integration and sharing of the data.
Affiliation(s)
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|
22
|
Lapatas V, Stefanidakis M, Jimenez RC, Via A, Schneider MV. Data integration in biological research: an overview. Journal of Biological Research (Thessalonike, Greece) 2015; 22:9. [PMID: 26336651 PMCID: PMC4557916 DOI: 10.1186/s40709-015-0032-5] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 04/20/2015] [Accepted: 08/10/2015] [Indexed: 11/16/2022]
Abstract
Data sharing, integration and annotation are essential to ensure the reproducibility of the analysis and interpretation of the experimental findings. Often these activities are perceived as a role that bioinformaticians and computer scientists have to take with no or little input from the experimental biologist. On the contrary, biological researchers, being the producers and often the end users of such data, have a big role in enabling biological data integration. The quality and usefulness of data integration depend on the existence and adoption of standards, shared formats, and mechanisms that are suitable for biological researchers to submit and annotate the data, so it can be easily searchable, conveniently linked and consequently used for further biological analysis and discovery. Here, we provide background on what is data integration from a computational science point of view, how it has been applied to biological research, which key aspects contributed to its success and future directions.
Affiliation(s)
- Vasileios Lapatas
- Department of Informatics, Ionian University, 7 Tsirigoti Square, Corfu, 49100 Greece
| | - Michalis Stefanidakis
- Department of Informatics, Ionian University, 7 Tsirigoti Square, Corfu, 49100 Greece
| | | | - Allegra Via
- Biocomputing Group, Sapienza University, Piazzale Aldo Moro 5, Rome, 00185 Italy
|
23
|
Perez-Riverol Y, Xu QW, Wang R, Uszkoreit J, Griss J, Sanchez A, Reisinger F, Csordas A, Ternent T, Del-Toro N, Dianes JA, Eisenacher M, Hermjakob H, Vizcaíno JA. PRIDE Inspector Toolsuite: Moving Toward a Universal Visualization Tool for Proteomics Data Standard Formats and Quality Assessment of ProteomeXchange Datasets. Mol Cell Proteomics 2015; 15:305-17. [PMID: 26545397 PMCID: PMC4762524 DOI: 10.1074/mcp.o115.050229] [Citation(s) in RCA: 144] [Impact Index Per Article: 14.4] [Received: 04/20/2015] [Indexed: 12/25/2022] Open
Abstract
The original PRIDE Inspector tool was developed as an open source standalone tool to enable the visualization and validation of mass-spectrometry (MS)-based proteomics data before data submission or already publicly available in the Proteomics Identifications (PRIDE) database. The initial implementation of the tool focused on visualizing PRIDE data by supporting the PRIDE XML format and a direct access to private (password protected) and public experiments in PRIDE. The ProteomeXchange (PX) Consortium has been set up to enable a better integration of existing public proteomics repositories, maximizing its benefit to the scientific community through the implementation of standard submission and dissemination pipelines. Within the Consortium, PRIDE is focused on supporting submissions of tandem MS data. The increasing use and popularity of the new Proteomics Standards Initiative (PSI) data standards such as mzIdentML and mzTab, and the diversity of workflows supported by the PX resources, prompted us to design and implement a new suite of algorithms and libraries that would build upon the success of the original PRIDE Inspector and would enable users to visualize and validate PX “complete” submissions. The PRIDE Inspector Toolsuite supports the handling and visualization of different experimental output files, ranging from spectra (mzML, mzXML, and the most popular peak lists formats) and peptide and protein identification results (mzIdentML, PRIDE XML, mzTab) to quantification data (mzTab, PRIDE XML), using a modular and extensible set of open-source, cross-platform libraries. We believe that the PRIDE Inspector Toolsuite represents a milestone in the visualization and quality assessment of proteomics data. It is freely available at http://github.com/PRIDE-Toolsuite/.
Affiliation(s)
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Qing-Wei Xu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Rui Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Julian Uszkoreit
- Ruhr-Universität Bochum, Medizinisches Proteom-Zenter, Medical Bioinformatics, ZKF, E.142, Universitätsstr. 150, D-44801 Bochum, Germany
| | - Johannes Griss
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Austria
| | - Aniel Sanchez
- Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ciudad de la Habana, Cuba
| | - Florian Reisinger
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Attila Csordas
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Tobias Ternent
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Noemi Del-Toro
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Jose A Dianes
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Martin Eisenacher
- Ruhr-Universität Bochum, Medizinisches Proteom-Zenter, Medical Bioinformatics, ZKF, E.142, Universitätsstr. 150, D-44801 Bochum, Germany
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
24
|
Zaman U, Richter FM, Hofele R, Kramer K, Sachsenberg T, Kohlbacher O, Lenz C, Urlaub H. Dithiothreitol (DTT) Acts as a Specific, UV-inducible Cross-linker in Elucidation of Protein-RNA Interactions. Mol Cell Proteomics 2015; 14:3196-210. [PMID: 26450613 DOI: 10.1074/mcp.m115.052795] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/12/2015] [Indexed: 11/06/2022] Open
Abstract
Protein-RNA cross-linking by UV irradiation at 254 nm wavelength has been established as an unbiased method to identify proteins in direct contact with RNA, and has been successfully applied to investigate the spatial arrangement of protein and RNA in large macromolecular assemblies, e.g. ribonucleoprotein-complex particles (RNPs). The mass spectrometric analysis of such peptide-RNA cross-links provides high resolution structural data to the point of mapping protein-RNA interactions to specific peptides or even amino acids. However, the approach suffers from the low yield of cross-linking products, which can be addressed by improving enrichment and analysis methods. In the present article, we introduce dithiothreitol (DTT) as a potent protein-RNA cross-linker. In order to evaluate the efficiency and specificity of DTT, we used two systems, a small synthetic peptide from smB protein incubated with U1 snRNA oligonucleotide and native ribonucleoprotein complexes from S. cerevisiae. Our results unambiguously show that DTT covalently participates in cysteine-uracil cross-links, which is observable as a mass increment of 151.9966 Da (C4H8S2O2) upon mass spectrometric analysis. DTT presents advantages for cross-linking of cysteine-containing regions of proteins. This is evidenced by comparison to experiments where tris(2-carboxyethyl)phosphine is used as reducing agent, and significantly fewer cross-links encompassing cysteine residues are found. We further propose insertion of DTT between the cysteine and uracil reactive sites as the most probable structure of the cross-linking products.
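The mass increment quoted in this abstract can be checked from the stated elemental composition. The short Python sketch below is an illustration only and is not taken from the cited article: the helper name `monoisotopic_mass` and the table of isotope masses are assumptions introduced here, using standard monoisotopic masses of the most abundant isotopes.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
# These are standard reference values, not values from the cited article.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "O": 15.99491461956,
    "S": 31.97207100,
}


def monoisotopic_mass(composition):
    """Sum isotope masses over an {element: atom count} composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


# Composition of the DTT-derived adduct as given in the abstract: C4H8S2O2.
dtt_adduct = {"C": 4, "H": 8, "S": 2, "O": 2}
mass = monoisotopic_mass(dtt_adduct)
print(f"{mass:.4f} Da")
```

Under these assumptions the computed value agrees with the reported 151.9966 Da to four decimal places, which corresponds to the composition of DTT (C4H10O2S2) less two hydrogen atoms.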
Affiliation(s)
- Uzma Zaman
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, D-37077 Göttingen, Germany; Bioanalytics, Institute for Clinical Chemistry, University Medical Center Göttingen, Robert-Koch-Strasse 40, D-37075 Göttingen, Germany
| | - Florian M Richter
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, D-37077 Göttingen, Germany
| | - Romina Hofele
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, D-37077 Göttingen, Germany; Bioanalytics, Institute for Clinical Chemistry, University Medical Center Göttingen, Robert-Koch-Strasse 40, D-37075 Göttingen, Germany
| | - Katharina Kramer
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, D-37077 Göttingen, Germany; Bioanalytics, Institute for Clinical Chemistry, University Medical Center Göttingen, Robert-Koch-Strasse 40, D-37075 Göttingen, Germany
| | - Timo Sachsenberg
- Center for Bioinformatics, Department of Computer Science, University of Tübingen, Sand 14, D-72076 Tübingen, Germany
| | - Oliver Kohlbacher
- Center for Bioinformatics, Department of Computer Science, University of Tübingen, Sand 14, D-72076 Tübingen, Germany; Biomolecular Interactions, Max Planck Institute for Developmental Biology, Spemannstraße 35, D-72076 Tübingen, Germany
| | - Christof Lenz
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, D-37077 Göttingen, Germany; Bioanalytics, Institute for Clinical Chemistry, University Medical Center Göttingen, Robert-Koch-Strasse 40, D-37075 Göttingen, Germany
| | - Henning Urlaub
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, D-37077 Göttingen, Germany; Bioanalytics, Institute for Clinical Chemistry, University Medical Center Göttingen, Robert-Koch-Strasse 40, D-37075 Göttingen, Germany
|
25
|
Horvatovich P, Lundberg EK, Chen YJ, Sung TY, He F, Nice EC, Goode RJ, Yu S, Ranganathan S, Baker MS, Domont GB, Velasquez E, Li D, Liu S, Wang Q, He QY, Menon R, Guan Y, Corrales FJ, Segura V, Casal JI, Pascual-Montano A, Albar JP, Fuentes M, Gonzalez-Gonzalez M, Diez P, Ibarrola N, Degano RM, Mohammed Y, Borchers CH, Urbani A, Soggiu A, Yamamoto T, Salekdeh GH, Archakov A, Ponomarenko E, Lisitsa A, Lichti CF, Mostovenko E, Kroes RA, Rezeli M, Végvári Á, Fehniger TE, Bischoff R, Vizcaíno JA, Deutsch EW, Lane L, Nilsson CL, Marko-Varga G, Omenn GS, Jeong SK, Lim JS, Paik YK, Hancock WS. Quest for Missing Proteins: Update 2015 on Chromosome-Centric Human Proteome Project. J Proteome Res 2015; 14:3415-3431. [PMID: 26076068 DOI: 10.1021/pr5013009] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Indexed: 12/22/2022]
Abstract
This paper summarizes the recent activities of the Chromosome-Centric Human Proteome Project (C-HPP) consortium, which develops new technologies to identify yet-to-be annotated proteins (termed "missing proteins") in biological samples that lack sufficient experimental evidence at the protein level for confident protein identification. The C-HPP also aims to identify new protein forms that may be caused by genetic variability, post-translational modifications, and alternative splicing. Proteogenomic data integration forms the basis of the C-HPP's activities; therefore, we have summarized some of the key approaches and their roles in the project. We present new analytical technologies that improve the chemical space and lower detection limits coupled to bioinformatics tools and some publicly available resources that can be used to improve data analysis or support the development of analytical assays. Most of this paper's content has been compiled from posters, slides, and discussions presented in the series of C-HPP workshops held during 2014. All data (posters, presentations) used are available at the C-HPP Wiki (http://c-hpp.webhosting.rug.nl/) and in the Supporting Information.
Affiliation(s)
- Péter Horvatovich
- Analytical Biochemistry, Department of Pharmacy, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands
| | - Emma K Lundberg
- Science for Life Laboratory, KTH - Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, 128 Academia Road Sec. 2, Taipei 115, Taiwan
| | - Ting-Yi Sung
- Institute of Information Science, Academia Sinica, 128 Academia Road Sec. 2, Taipei 115, Taiwan
| | - Fuchu He
- The State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, No. 27 Taiping Road, Haidian District, Beijing 100850, China
| | - Edouard C Nice
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia
| | - Robert J Goode
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia
| | - Simon Yu
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia
| | - Shoba Ranganathan
- Department of Chemistry and Biomolecular Sciences and ARC Centre of Excellence in Bioinformatics, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Mark S Baker
- Australian School of Advanced Medicine, Macquarie University, Sydney, NSW 2109, Australia
| | - Gilberto B Domont
- Proteomics Unit, Institute of Chemistry, Federal University of Rio de Janeiro, Cidade Universitária, Av Athos da Silveira Ramos 149, CT-A542, 21941-909 Rio de Janeiro, RJ, Brazil
| | - Erika Velasquez
- Proteomics Unit, Institute of Chemistry, Federal University of Rio de Janeiro, Cidade Universitária, Av Athos da Silveira Ramos 149, CT-A542, 21941-909 Rio de Janeiro, RJ, Brazil
| | - Dong Li
- The State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, No. 27 Taiping Road, Haidian District, Beijing 100850, China
| | - Siqi Liu
- Beijing Institute of Genomics and BGI Shenzhen, No. 1 Beichen West Road, Chaoyang District, Beijing 100101, China
- BGI Shenzhen, Beishan Road, Yantian District, Shenzhen, 518083, China
| | - Quanhui Wang
- Beijing Institute of Genomics and BGI Shenzhen, No. 1 Beichen West Road, Chaoyang District, Beijing 100101, China
| | - Qing-Yu He
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, College of Life Science and Technology, Jinan University, Guangzhou 510632, China
| | - Rajasree Menon
- Department of Computational Medicine & Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
| | - Yuanfang Guan
- Departments of Computational Medicine & Bioinformatics and Computer Sciences, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
| | - Fernando J Corrales
- ProteoRed-ISCIII, Biomolecular and Bioinformatics Resources Platform (PRB2), Spanish Consortium of C-HPP (Chr-16), CIMA, University of Navarra, 31008 Pamplona, Spain
- Chr16 SpHPP Consortium, CIMA, University of Navarra, 31008 Pamplona, Spain
| | - Victor Segura
- ProteoRed-ISCIII, Biomolecular and Bioinformatics Resources Platform (PRB2), Spanish Consortium of C-HPP (Chr-16), CIMA, University of Navarra, 31008 Pamplona, Spain
- Chr16 SpHPP Consortium, CIMA, University of Navarra, 31008 Pamplona, Spain
| | - J Ignacio Casal
- Department of Cellular and Molecular Medicine, Centro de Investigaciones Biológicas (CIB-CSIC), 28040 Madrid, Spain
| | | | - Juan P Albar
- Centro Nacional de Biotecnologia (CNB-CSIC), Cantoblanco, 28049 Madrid, Spain
| | - Manuel Fuentes
- Cancer Research Center. Proteomics Unit and General Service of Cytometry, Department of Medicine, University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007 Salamanca, Spain
| | - Maria Gonzalez-Gonzalez
- Cancer Research Center. Proteomics Unit and General Service of Cytometry, Department of Medicine, University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007 Salamanca, Spain
| | - Paula Diez
- Cancer Research Center. Proteomics Unit and General Service of Cytometry, Department of Medicine, University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007 Salamanca, Spain
| | - Nieves Ibarrola
- Cancer Research Center. Proteomics Unit and General Service of Cytometry, Department of Medicine, University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007 Salamanca, Spain
| | - Rosa M Degano
- Cancer Research Center. Proteomics Unit and General Service of Cytometry, Department of Medicine, University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007 Salamanca, Spain
| | - Yassene Mohammed
- University of Victoria-Genome British Columbia Proteomics Centre, Vancouver Island Technology Park, #3101-4464 Markham Street, Victoria, British Columbia V8Z 7X8, Canada
- Center for Proteomics and Metabolomics, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
| | - Christoph H Borchers
- University of Victoria-Genome British Columbia Proteomics Centre, Vancouver Island Technology Park, #3101-4464 Markham Street, Victoria, British Columbia V8Z 7X8, Canada
| | - Andrea Urbani
- Proteomics and Metabonomic Laboratory, Fondazione Santa Lucia, Rome, Italy
- Department of Experimental Medicine and Surgery, University of Rome "Tor Vergata", Rome, Italy
| | - Alessio Soggiu
- Department of Veterinary Science and Public Health (DIVET), University of Milano, via Celoria 10, 20133 Milano, Italy
| | - Tadashi Yamamoto
- Institute of Nephrology, Graduate School of Medical and Dental Sciences, Niigata University, Niigata, Japan
| | - Ghasem Hosseini Salekdeh
- Department of Molecular Systems Biology at Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
- Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran, Karaj, Iran
| | | | | | - Andrey Lisitsa
- Orechovich Institute of Biomedical Chemistry, Moscow, Russia
| | - Cheryl F Lichti
- Department of Pharmacology and Toxicology, The University of Texas Medical Branch, Galveston, Texas 77555-0617, United States
| | - Ekaterina Mostovenko
- Department of Pharmacology and Toxicology, The University of Texas Medical Branch, Galveston, Texas 77555-0617, United States
| | - Roger A Kroes
- Falk Center for Molecular Therapeutics, Department of Biomedical Engineering, Northwestern University, 1801 Maple Ave., Suite 4300, Evanston, Illinois 60201, United States
| | - Melinda Rezeli
- Clinical Protein Science & Imaging, Department of Biomedical Engineering, Lund University, BMC D13, 221 84 Lund, Sweden
| | - Ákos Végvári
- Clinical Protein Science & Imaging, Department of Biomedical Engineering, Lund University, BMC D13, 221 84 Lund, Sweden
| | - Thomas E Fehniger
- Clinical Protein Science & Imaging, Department of Biomedical Engineering, Lund University, BMC D13, 221 84 Lund, Sweden
| | - Rainer Bischoff
- Analytical Biochemistry, Department of Pharmacy, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, United Kingdom
| | - Eric W Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States
| | - Lydie Lane
- SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Department of Human Protein Science, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Carol L Nilsson
- Department of Pharmacology and Toxicology, The University of Texas Medical Branch, Galveston, Texas 77555-0617, United States
| | - György Marko-Varga
- Clinical Protein Science & Imaging, Department of Biomedical Engineering, Lund University, BMC D13, 221 84 Lund, Sweden
| | - Gilbert S Omenn
- Departments of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics and School of Public Health, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
| | - Seul-Ki Jeong
- Departments of Integrated Omics for Biomedical Science & Biochemistry, College of Life Science and Technology, Yonsei Proteome Research Center, Yonsei University, Seoul, 120-749, Korea
| | - Jong-Sun Lim
- Departments of Integrated Omics for Biomedical Science & Biochemistry, College of Life Science and Technology, Yonsei Proteome Research Center, Yonsei University, Seoul, 120-749, Korea
| | - Young-Ki Paik
- Departments of Integrated Omics for Biomedical Science & Biochemistry, College of Life Science and Technology, Yonsei Proteome Research Center, Yonsei University, Seoul, 120-749, Korea
| | - William S Hancock
- The Barnett Institute of Chemical and Biological Analysis, Northeastern University, 140 The Fenway, Boston, Massachusetts 02115, United States
|
26
|
Deutsch EW, Albar JP, Binz PA, Eisenacher M, Jones AR, Mayer G, Omenn GS, Orchard S, Vizcaíno JA, Hermjakob H. Development of data representation standards by the human proteome organization proteomics standards initiative. J Am Med Inform Assoc 2015; 22:495-506. [PMID: 25726569 PMCID: PMC4457114 DOI: 10.1093/jamia/ocv001] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 07/14/2014] [Revised: 09/29/2014] [Accepted: 01/05/2015] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVE To describe the goals of the Proteomics Standards Initiative (PSI) of the Human Proteome Organization, the methods that the PSI has employed to create data standards, the resulting output of the PSI, lessons learned from the PSI's evolution, and future directions and synergies for the group. MATERIALS AND METHODS The PSI has 5 categories of deliverables that have guided the group. These are minimum information guidelines, data formats, controlled vocabularies, resources and software tools, and dissemination activities. These deliverables are produced via the leadership and working group organization of the initiative, driven by frequent workshops and ongoing communication within the working groups. Official standards are subjected to a rigorous document process that includes several levels of peer review prior to release. RESULTS We have produced and published minimum information guidelines describing what information should be provided when making data public, either via public repositories or other means. The PSI has produced a series of standard formats covering mass spectrometer input, mass spectrometer output, results of informatics analysis (both qualitative and quantitative analyses), reports of molecular interaction data, and gel electrophoresis analyses. We have produced controlled vocabularies that ensure that concepts are uniformly annotated in the formats and engaged in extensive software development and dissemination efforts so that the standards can efficiently be used by the community. CONCLUSION In its first dozen years of operation, the PSI has produced many standards that have accelerated the field of proteomics by facilitating data exchange and deposition to data repositories. We look to the future to continue developing standards for new proteomics technologies and workflows and mechanisms for integration with other omics data types. Our products facilitate the translation of genomics and proteomics findings to clinical and biological phenotypes. The PSI website can be accessed at http://www.psidev.info.
Affiliation(s)
| | - Juan Pablo Albar
- Proteomics Facility, Centro Nacional de Biotecnología - CSIC, Madrid, Spain; ProteoRed Consortium, Spanish National Institute of Proteomics, Madrid, Spain (died July 18, 2014)
| | - Pierre-Alain Binz
- CHUV Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
| | - Martin Eisenacher
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
| | - Andrew R Jones
- Institute of Integrative Biology, University of Liverpool, Liverpool, UK
| | - Gerhard Mayer
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, Bochum, Germany
| | - Gilbert S Omenn
- Institute for Systems Biology, Seattle, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, USA
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|
27
|
Walzer M, Pernas LE, Nasso S, Bittremieux W, Nahnsen S, Kelchtermans P, Pichler P, van den Toorn HWP, Staes A, Vandenbussche J, Mazanek M, Taus T, Scheltema RA, Kelstrup CD, Gatto L, van Breukelen B, Aiche S, Valkenborg D, Laukens K, Lilley KS, Olsen JV, Heck AJR, Mechtler K, Aebersold R, Gevaert K, Vizcaíno JA, Hermjakob H, Kohlbacher O, Martens L. qcML: an exchange format for quality control metrics from mass spectrometry experiments. Mol Cell Proteomics 2014; 13:1905-13. [PMID: 24760958 PMCID: PMC4125725 DOI: 10.1074/mcp.m113.035907] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 10/31/2013] [Revised: 03/13/2014] [Indexed: 12/22/2022] Open
Abstract
Quality control is increasingly recognized as a crucial aspect of mass spectrometry based proteomics. Several recent papers discuss relevant parameters for quality control and present applications to extract these from the instrumental raw data. What has been missing, however, is a standard data exchange format for reporting these performance metrics. We therefore developed the qcML format, an XML-based standard that follows the design principles of the related mzML, mzIdentML, mzQuantML, and TraML standards from the HUPO-PSI (Proteomics Standards Initiative). In addition to the XML format, we also provide tools for the calculation of a wide range of quality metrics as well as a database format and interconversion tools, so that existing LIMS systems can easily add relational storage of the quality control data to their existing schema. We here describe the qcML specification, along with possible use cases and an illustrative example of the subsequent analysis possibilities. All information about qcML is available at http://code.google.com/p/qcml.
Affiliation(s)
- Mathias Walzer
- From the ‡Applied Bioinformatics, Center for Bioinformatics, Quantitative Biology Center, and Dept. of Computer Science, University of Tuebingen, Germany
| | - Lucia Espona Pernas
- §Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich, 8092 Zurich, Switzerland
| | - Sara Nasso
- ¶Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland; §Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich, 8092 Zurich, Switzerland
| | - Wout Bittremieux
- ‖Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium; **Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
| | - Sven Nahnsen
- From the ‡Applied Bioinformatics, Center for Bioinformatics, Quantitative Biology Center, and Dept. of Computer Science, University of Tuebingen, Germany
| | - Pieter Kelchtermans
- ‡‡Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; §§Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium; ¶¶Flemish Institute for Technological Research (VITO), Boeretang 200, B-2400 Mol Belgium
| | - Peter Pichler
- ‖‖Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
| | - Henk W P van den Toorn
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, Netherlands; Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, Netherlands
| | - An Staes
- ‡‡Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; §§Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Jonathan Vandenbussche
- ‡‡Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; §§Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Michael Mazanek
- ‖‖Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
| | - Thomas Taus
- ‖‖Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
| | - Richard A Scheltema
- Department of Proteomics and Signal Transduction, Max-Planck Institute of Biochemistry, Am Klopferspitz 18, D-82152 Martinsried, Germany
| | - Christian D Kelstrup
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3b, DK-2200 Copenhagen, Denmark
| | - Laurent Gatto
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, CB2 1GA, United Kingdom; Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, UK
| | - Bas van Breukelen
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, Netherlands; Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, Netherlands
| | - Stephan Aiche
- Department of Mathematics and Computer Science, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany
| | - Dirk Valkenborg
- Flemish Institute for Technological Research (VITO), Boeretang 200, B-2400 Mol, Belgium; I-BioStat, Hasselt University, Belgium; CFP-CeProMa, University of Antwerp, Belgium
| | - Kris Laukens
- Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, CB2 1GA, United Kingdom
| | - Jesper V Olsen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3b, DK-2200 Copenhagen, Denmark
| | - Albert J R Heck
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, Netherlands; Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, Netherlands
| | - Karl Mechtler
- Research Institute of Molecular Pathology (IMP), Dr. Bohr-Gasse 7, A-1030 Vienna, Austria; Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Dr. Bohr-Gasse 3, A-1030 Vienna, Austria
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich, 8092 Zurich, Switzerland; Faculty of Science, University of Zurich, Zurich, Switzerland
| | - Kris Gevaert
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Oliver Kohlbacher
- Applied Bioinformatics, Center for Bioinformatics, Quantitative Biology Center, and Dept. of Computer Science, University of Tuebingen, Germany
| | - Lennart Martens
- Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium; Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| |
|