1
Omenn GS. Reflections on the HUPO Human Proteome Project, the Flagship Project of the Human Proteome Organization, at 10 Years. Mol Cell Proteomics 2021; 20:100062. [PMID: 33640492 PMCID: PMC8058560 DOI: 10.1016/j.mcpro.2021.100062] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/12/2020] [Revised: 02/04/2021] [Accepted: 02/05/2021] [Indexed: 02/08/2023] Open
Abstract
We celebrate the 10th anniversary of the launch of the HUPO Human Proteome Project (HPP) and its major milestone of confident detection of at least one protein from each of 90% of the predicted protein-coding genes, based on the output of the entire proteomics community. The Human Genome Project reached a similar decadal milestone 20 years ago. The HPP has engaged proteomics teams around the world, strongly influenced data-sharing, enhanced quality assurance, and issued stringent guidelines for claims of detecting previously "missing proteins." This invited perspective complements papers on "A High-Stringency Blueprint of the Human Proteome" and "The Human Proteome Reaches a Major Milestone" in special issues of Nature Communications and Journal of Proteome Research, respectively, released in conjunction with the October 2020 virtual HUPO Congress and its celebration of the 10th anniversary of the HUPO HPP.
Affiliation(s)
- Gilbert S Omenn
- University of Michigan Medical School, Departments of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, Ann Arbor, Michigan, USA.
2
Omenn GS, Lane L, Overall CM, Cristea IM, Corrales FJ, Lindskog C, Paik YK, Van Eyk JE, Liu S, Pennington SR, Snyder MP, Baker MS, Bandeira N, Aebersold R, Moritz RL, Deutsch EW. Research on the Human Proteome Reaches a Major Milestone: >90% of Predicted Human Proteins Now Credibly Detected, According to the HUPO Human Proteome Project. J Proteome Res 2020; 19:4735-4746. [PMID: 32931287 PMCID: PMC7718309 DOI: 10.1021/acs.jproteome.0c00485] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 12/11/2022]
Abstract
According to the 2020 Metrics of the HUPO Human Proteome Project (HPP), expression has now been detected at the protein level for >90% of the 19 773 predicted proteins coded in the human genome. The HPP annually reports on progress made throughout the world toward credibly identifying and characterizing the complete human protein parts list and promoting proteomics as an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2020-01 classified 17 874 proteins as PE1, having strong protein-level evidence, up 180 from 17 694 one year earlier. These represent 90.4% of the 19 773 predicted coding genes (all PE1,2,3,4 proteins in neXtProt). Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), was reduced by 230 from 2129 to 1899 since the neXtProt 2019-01 release. PeptideAtlas is the primary source of uniform reanalysis of raw mass spectrometry data for neXtProt, supplemented this year with extensive data from MassIVE. PeptideAtlas 2020-01 added 362 canonical proteins between 2019 and 2020 and MassIVE contributed 84 more, many of which converted PE1 entries based on non-MS evidence to the MS-based subgroup. The 19 Biology and Disease-driven B/D-HPP teams continue to pursue the identification of driver proteins that underlie disease states, the characterization of regulatory mechanisms controlling the functions of these proteins, their proteoforms, and their interactions, and the progression of transitions from correlation to coexpression to causal networks after system perturbations. And the Human Protein Atlas published Blood, Brain, and Metabolic Atlases.
Affiliation(s)
- Gilbert S Omenn
- University of Michigan, Ann Arbor, Michigan 48109, United States
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Ileana M Cristea
- Princeton University, Princeton, New Jersey 08544, United States
- Siqi Liu
- BGI Group, Shenzhen 518083, China
- Mark S Baker
- Macquarie University, Macquarie Park, NSW 2109, Australia
- Nuno Bandeira
- University of California, San Diego, La Jolla, California 92093, United States
- Ruedi Aebersold
- ETH-Zurich and University of Zurich, 8092 Zurich, Switzerland
- Robert L Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
3
Is It Possible to Find Needles in a Haystack? Meta-Analysis of 1000+ MS/MS Files Provided by the Russian Proteomic Consortium for Mining Missing Proteins. Proteomes 2020; 8:proteomes8020012. [PMID: 32456206 PMCID: PMC7356824 DOI: 10.3390/proteomes8020012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/24/2020] [Revised: 05/18/2020] [Accepted: 05/19/2020] [Indexed: 12/04/2022] Open
Abstract
Despite direct or indirect efforts of the proteomic community, the fraction of blind spots on the protein map is still significant. Almost 11% of human genes encode missing proteins, whose existence is still in doubt. Proteomics appears to have reached a stage at which more attention and curiosity must be devoted to the identification of every novel protein, expanding the analysis to unusual types of biomaterials and/or conditions. It seems that we have exhausted the current conventional approaches to the discovery of missing proteins and may need to investigate alternatives. Here, we present an approach to deciphering missing proteins based on the use of non-standard methodological solutions and encompassing diverse MS/MS data, obtained for rare types of biological samples by members of the Russian Proteomic community in the last five years. These data were re-analyzed in a uniform manner by three search engines, which are part of the SearchGUI package. The study resulted in the identification of two missing and five uncertain proteins detected with two peptides. Moreover, 149 proteins were detected with a single proteotypic peptide. Finally, we analyzed the gene expression levels to suggest feasible targets for further validation of missing and uncertain protein observations, which will fully meet the requirements of the international consortium. The MS data are available on the ProteomeXchange platform (PXD014300).
4
Omenn GS, Lane L, Overall CM, Corrales FJ, Schwenk JM, Paik YK, Van Eyk JE, Liu S, Pennington S, Snyder MP, Baker MS, Deutsch EW. Progress on Identifying and Characterizing the Human Proteome: 2019 Metrics from the HUPO Human Proteome Project. J Proteome Res 2019; 18:4098-4107. [PMID: 31430157 PMCID: PMC6898754 DOI: 10.1021/acs.jproteome.9b00434] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 01/17/2023]
Abstract
The Human Proteome Project (HPP) annually reports on progress made throughout the field in credibly identifying and characterizing the complete human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2019-01-11 contains 17 694 proteins with strong protein-level evidence (PE1), compliant with HPP Guidelines for Interpretation of MS Data v2.1; these represent 89% of all 19 823 neXtProt predicted coding genes (all PE1,2,3,4 proteins), up from 17 470 one year earlier. Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), has been reduced from 2949 to 2129 since 2016 through efforts throughout the community, including the chromosome-centric HPP. PeptideAtlas is the source of uniformly reanalyzed raw mass spectrometry data for neXtProt; PeptideAtlas added 495 canonical proteins between 2018 and 2019, especially from studies designed to detect hard-to-identify proteins. Meanwhile, the Human Protein Atlas has released version 18.1 with immunohistochemical evidence of expression of 17 000 proteins and survival plots as part of the Pathology Atlas. Many investigators apply multiplexed SRM-targeted proteomics for quantitation of organ-specific popular proteins in studies of various human diseases. The 19 teams of the Biology and Disease-driven B/D-HPP published a total of 160 publications in 2018, bringing proteomics to a broad array of biomedical research.
Affiliation(s)
- Gilbert S. Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland
- Christopher M. Overall
- Life Sciences Institute, Faculty of Dentistry, University of British Columbia, 2350 Health Sciences Mall, Room 4.401, Vancouver, British Columbia V6T 1Z3, Canada
- Jochen M. Schwenk
- Science for Life Laboratory, KTH Royal Institute of Technology, Tomtebodavägen 23A, 17165 Solna, Sweden
- Young-Ki Paik
- Yonsei Proteome Research Center, Yonsei University, Room 425, Building #114, 50 Yonsei-ro, Seodaemoon-ku, Seoul 120-749, South Korea
- Jennifer E. Van Eyk
- Advanced Clinical BioSystems Research Institute, Cedars Sinai Precision Biomarker Laboratories, Barbra Streisand Women’s Heart Center, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Siqi Liu
- BGI Group-Shenzhen, Yantian District, Shenzhen 518083, China
- Stephen Pennington
- School of Medicine, University College Dublin, Conway Institute Belfield, Dublin 4, Ireland
- Michael P. Snyder
- Department of Genetics, Stanford University, Alway Building, 300 Pasteur Drive and 3165 Porter Drive, Palo Alto, California 94304, United States
- Mark S. Baker
- Department of Biomedical Sciences, Faculty of Medicine & Health Sciences, Macquarie University, 75 Talavera Road, North Ryde, NSW 2109, Australia
- Eric W. Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
5
Omenn GS, Lane L, Overall CM, Corrales FJ, Schwenk JM, Paik YK, Van Eyk JE, Liu S, Snyder M, Baker MS, Deutsch EW. Progress on Identifying and Characterizing the Human Proteome: 2018 Metrics from the HUPO Human Proteome Project. J Proteome Res 2018; 17:4031-4041. [PMID: 30099871 PMCID: PMC6387656 DOI: 10.1021/acs.jproteome.8b00441] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Indexed: 02/07/2023]
Abstract
The Human Proteome Project (HPP) annually reports on progress throughout the field in credibly identifying and characterizing the human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2018-01-17, the baseline for this sixth annual HPP special issue of the Journal of Proteome Research, contains 17 470 PE1 proteins, 89% of all neXtProt predicted PE1-4 proteins, up from 17 008 in release 2017-01-23 and 13 975 in release 2012-02-24. Conversely, the number of neXtProt PE2,3,4 missing proteins has been reduced from 2949 to 2579 to 2186 over the past two years. Of the PE1 proteins, 16 092 are based on mass spectrometry results, and 1378 on other kinds of protein studies, notably protein-protein interaction findings. PeptideAtlas has 15 798 canonical proteins, up 625 over the past year, including 269 from SUMOylation studies. The largest reason for missing proteins is low abundance. Meanwhile, the Human Protein Atlas has released its Cell Atlas, Pathology Atlas, and updated Tissue Atlas, and is applying recommendations from the International Working Group on Antibody Validation. Finally, there is progress using the quantitative multiplex organ-specific popular proteins targeted proteomics approach in various disease categories.
Affiliation(s)
- Gilbert S. Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland
- Christopher M. Overall
- Life Sciences Institute, Faculty of Dentistry, University of British Columbia, 2350 Health Sciences Mall, Room 4.401, Vancouver, BC Canada V6T 1Z3
- Jochen M. Schwenk
- Science for Life Laboratory, KTH Royal Institute of Technology, Tomtebodavägen 23A, 17165 Solna, Sweden
- Young-Ki Paik
- Yonsei Proteome Research Center, Room 425, Building #114, Yonsei University, 50 Yonsei-ro, Seodaemoon-ku, Seoul 120-749, Korea
- Jennifer E. Van Eyk
- Advanced Clinical BioSystems Research Institute, Cedars Sinai Precision Biomarker Laboratories, Barbra Streisand Women’s Heart Center, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Siqi Liu
- Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390-9148, United States
- Michael Snyder
- Department of Genetics, Stanford University, Alway Building, 300 Pasteur Drive, 3165 Porter Drive, Palo Alto, 94304, United States
- Mark S. Baker
- Department of Biomedical Sciences, Macquarie University, NSW 2109, Australia
- Eric W. Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
6
Paik YK. Toward Completion of the Human Proteome Parts List: Progress Uncovering Proteins That Are Missing or Have Unknown Function and Developing Analytical Methods. J Proteome Res 2018; 17:4023-4030. [PMID: 30985145 PMCID: PMC6288998 DOI: 10.1021/acs.jproteome.8b00885] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 02/07/2023]
Affiliation(s)
- Young-Ki Paik
- Yonsei Proteome Research Center, College of Life Science and Technology, Yonsei University