1
Omenn GS, Orchard S, Lane L, Lindskog C, Pineau C, Overall CM, Budnik B, Mudge JM, Packer NH, Weintraub ST, Roehrl MHA, Nice E, Guo T, Van Eyk JE, Völker U, Zhang G, Bandeira N, Aebersold R, Moritz RL, Deutsch EW. The 2024 Report on the Human Proteome from the HUPO Human Proteome Project. J Proteome Res 2024; 23:5296-5311. [PMID: 39514846 PMCID: PMC11781352 DOI: 10.1021/acs.jproteome.4c00776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
The Human Proteome Project (HPP), the flagship initiative of the Human Proteome Organization (HUPO), has pursued two goals: (1) to credibly identify at least one isoform of every protein-coding gene and (2) to make proteomics an integral part of multiomics studies of human health and disease. The past year has seen major transitions for the HPP. neXtProt was retired as the official HPP knowledge base, UniProtKB became the reference proteome knowledge base, and Ensembl-GENCODE provides the reference protein target list. A function evidence FE1-5 scoring system has been developed for functional annotation of proteins, parallel to the PE1-5 UniProtKB/neXtProt scheme for evidence of protein expression. This report includes updates from neXtProt (version 2023-09) and UniProtKB release 2024_04, with protein expression detected (PE1) for 18138 of the 19411 GENCODE protein-coding genes (93%). The number of non-PE1 proteins ("missing proteins") is now 1273. The transition to GENCODE is a net reduction of 367 proteins (19,411 PE1-5 instead of 19,778 PE1-4 last year in neXtProt). We include reports from the Biology and Disease-driven HPP, the Human Protein Atlas, and the HPP Grand Challenge Project. We expect the new Functional Evidence FE1-5 scheme to energize the Grand Challenge Project for functional annotation of human proteins throughout the global proteomics community, including π-HuB in China.
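The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published figures; the variable names are illustrative and do not come from the report.

```python
# Figures as quoted in the abstract above (2024 HPP report).
pe1_proteins = 18138         # proteins with PE1 evidence
gencode_total = 19411        # GENCODE protein-coding genes (PE1-5)
nextprot_total_prev = 19778  # neXtProt PE1-4 total reported last year

# Non-PE1 ("missing") proteins are whatever remains after PE1.
missing_proteins = gencode_total - pe1_proteins

# Share of GENCODE protein-coding genes with PE1 evidence.
pe1_fraction = pe1_proteins / gencode_total

# Net change in the target list caused by the move to GENCODE.
net_reduction = nextprot_total_prev - gencode_total

print(missing_proteins)              # 1273
print(round(pe1_fraction * 100, 1))  # 93.4
print(net_reduction)                 # 367
```

The three results agree with the abstract: 1273 missing proteins, about 93% PE1 coverage, and a net reduction of 367 entries.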
Affiliation(s)
- Gilbert S. Omenn
  - University of Michigan, Ann Arbor, Michigan 48109, United States
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Sandra Orchard
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK, CB10 1SD
- Lydie Lane
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, 1015 Lausanne, Switzerland
- Cecilia Lindskog
  - Department of Immunology Genetics and Pathology, Cancer Precision Medicine, Uppsala University, 752 36 Uppsala, Sweden
- Charles Pineau
  - Univ Rennes, Inserm, EHESP, Irset, UMR_S 1085, 35000 Rennes, France
- Christopher M. Overall
  - University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Yonsei Frontier Lab, Yonsei University, 50 Yonsei-ro, Sudaemoon-ku, Seoul, 03722, Republic of Korea
- Bogdan Budnik
  - Hansjörg Wyss Institute for Biologically Inspired Engineering at Harvard University
- Jonathan M. Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK, CB10 1SD
- Susan T. Weintraub
  - University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229-3900, United States
- Michael H. A. Roehrl
  - Department of Pathology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, United States
- Tiannan Guo
  - Center for Intelligent Proteomics, Westlake Laboratory, Westlake University, Hangzhou 310024, Zhejiang Province, China
- Jennifer E. Van Eyk
  - Advanced Clinical Biosystems Research Institute, Smidt Heart Institute, Cedars-Sinai Medical Center, 127 South San Vicente Boulevard, Pavilion, 9th Floor, Los Angeles, CA, 90048, United States
- Uwe Völker
  - Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, 17475 Greifswald, Germany
- Gong Zhang
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Nuno Bandeira
  - University of California, San Diego, La Jolla, CA, 92093, United States
- Robert L. Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Eric W. Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
2
Song Y, Zhang C, Omenn GS, O’Meara MJ, Welch JD. Predicting the Structural Impact of Human Alternative Splicing. bioRxiv 2023:2023.12.21.572928. [PMID: 38187531 PMCID: PMC10769328 DOI: 10.1101/2023.12.21.572928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Protein structure prediction with neural networks is a powerful new method for linking protein sequence, structure, and function, but structures have generally been predicted for only a single isoform of each gene, neglecting splice variants. To investigate the structural implications of alternative splicing, we used AlphaFold2 to predict the structures of more than 11,000 human isoforms. We employed multiple metrics to identify splicing-induced structural alterations, including template matching score, secondary structure composition, surface charge distribution, radius of gyration, accessibility of post-translational modification sites, and structure-based function prediction. We identified examples of how alternative splicing induced clear changes in each of these properties. Structural similarity between isoforms largely correlated with degree of sequence identity, but we identified a subset of isoforms with low structural similarity despite high sequence similarity. Exon skipping and alternative last exons tended to increase the surface charge and radius of gyration. Splicing also buried or exposed numerous post-translational modification sites, most notably among the isoforms of BAX. Functional prediction nominated numerous functional differences among isoforms of the same gene, with loss of function compared to the reference predominating. Finally, we used single-cell RNA-seq data from the Tabula Sapiens to determine the cell types in which each structure is expressed. Our work represents an important resource for studying the structure and function of splice isoforms across the cell types of the human body.
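Of the metrics listed in this abstract, the radius of gyration is the simplest to state: the root-mean-square distance of a set of atoms from their centroid. The following is a minimal sketch, assuming an unweighted definition over a plain list of (x, y, z) coordinates (for example, one point per residue). The function name and the toy input are illustrative, and the preprint may use a mass-weighted or otherwise different variant.

```python
import math

def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    if not coords:
        raise ValueError("need at least one coordinate")
    n = len(coords)
    # Centroid: the mean position along each axis.
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    # Mean squared distance of the points from the centroid.
    mean_sq = sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)

# Toy input: two points 2 units apart, so each lies 1 unit from the centroid.
toy = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(radius_of_gyration(toy))  # 1.0
```

A more extended chain gives a larger value than a compact one of the same length, which is why the metric is sensitive to splicing events that add or remove structured regions.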
Affiliation(s)
- Yuxuan Song
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Gilbert S. Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Matthew J. O’Meara
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Joshua D. Welch
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA
3
Banerjee A, Biswas D, Barpanda A, Halder A, Sibal S, Kattimani R, Shah A, Mahadevan A, Goel A, Srivastava S. The First Pituitary Proteome Landscape From Matched Anterior and Posterior Lobes for a Better Understanding of the Pituitary Gland. Mol Cell Proteomics 2022; 22:100478. [PMID: 36470533 PMCID: PMC9877467 DOI: 10.1016/j.mcpro.2022.100478] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/27/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 12/12/2022]
Abstract
To date, very few mass spectrometry (MS)-based proteomics studies are available on the anterior and posterior lobes of the pituitary. Past MS-based investigations have focused exclusively on the whole pituitary gland or the anterior pituitary lobe. In this study, for the first time, we performed a deep MS-based analysis of five anterior and five posterior matched lobes to build the first lobe-specific pituitary proteome map, which documented 4090 proteins with isoforms, mostly mapped to chromosomes 1, 2, and 11. About 1446 significantly differentially expressed proteins were identified and studied for lobe specificity, biological pathway enrichment, protein-protein interaction, and region-specific comparison with human brain and other neuroendocrine glands from the Human Protein Atlas, to identify pituitary-enriched proteins. Hormones specific to each lobe were also identified and validated with parallel reaction monitoring-based target verification. The study identified and validated the hormones growth hormone and thyroid-stimulating hormone subunit beta as exclusive to the anterior lobe, and oxytocin-neurophysin 1 and arginine vasopressin as exclusive to the posterior lobe. The study also identified POU1F1 (pituitary-specific positive transcription factor 1), POMC (pro-opiomelanocortin), PCOLCE2 (procollagen C-endopeptidase enhancer 2), and NPTX2 (neuronal pentraxin-2) as pituitary-enriched proteins and validated their lobe specificity using parallel reaction monitoring. In addition, three uPE1 proteins, namely THEM6 (mesenchymal stem cell protein DSCD75), FSD1L (coiled-coil domain-containing protein 10), and METTL26 (methyltransferase-like 26), were identified using the neXtProt database, and the tumor-marker S100 proteins showed high expression in the posterior lobe.
In summary, the study documents the first matched anterior and posterior pituitary proteome map, which can act as a reference control for a better understanding of functional and nonfunctional pituitary adenomas and extends the aim of the Human Proteome Project towards the investigation of the proteome of life.
Affiliation(s)
- Arghya Banerjee
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, India
- Deepatarup Biswas
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, India
- Abhilash Barpanda
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, India
- Ankit Halder
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, India
- Shamira Sibal
  - Lokmanya Tilak Municipal Medical College, Mumbai, India
- Abhidha Shah
  - Department of Neurosurgery at King Edward Memorial Hospital and Seth G. S. Medical College, Mumbai, India
- Anita Mahadevan
  - Human Brain Bank, National Institute of Mental Health and Neuro Sciences (NIMHANS), Bangalore, India
- Atul Goel
  - Department of Neurosurgery at King Edward Memorial Hospital and Seth G. S. Medical College, Mumbai, India
- Sanjeeva Srivastava
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, India
4
I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat Protoc 2022; 17:2326-2353. [PMID: 35931779 DOI: 10.1038/s41596-022-00728-0] [Citation(s) in RCA: 223] [Impact Index Per Article: 74.3] [Received: 02/04/2022] [Accepted: 05/24/2022] [Indexed: 01/17/2023]
Abstract
Most proteins in cells are composed of multiple folding units (or domains) to perform complex functions in a cooperative manner. Relative to the rapid progress in single-domain structure prediction, there are few effective tools available for multi-domain protein structure assembly, mainly due to the complexity of modeling multi-domain proteins, which involves higher degrees of freedom in domain-orientation space and various levels of continuous and discontinuous domain assembly and linker refinement. To meet the challenge and the high demand of the community, we developed I-TASSER-MTD to model the structures and functions of multi-domain proteins through a progressive protocol that combines sequence-based domain parsing, single-domain structure folding, inter-domain structure assembly and structure-based function annotation in a fully automated pipeline. Advanced deep-learning models have been incorporated into each of the steps to enhance both the domain modeling and inter-domain assembly accuracy. The protocol allows for the incorporation of experimental cross-linking data and cryo-electron microscopy density maps to guide the multi-domain structure assembly simulations. I-TASSER-MTD is built on I-TASSER but substantially extends its ability and accuracy in modeling large multi-domain protein structures and provides meaningful functional insights for the targets at both the domain- and full-chain levels from the amino acid sequence alone.
5
Li H, Zhang S, Chen L, Pan X, Li Z, Huang T, Cai YD. Identifying Functions of Proteins in Mice With Functional Embedding Features. Front Genet 2022; 13:909040. [PMID: 35651937 PMCID: PMC9149260 DOI: 10.3389/fgene.2022.909040] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/31/2022] [Accepted: 04/28/2022] [Indexed: 12/02/2022]
Abstract
In current biology, exploring the biological functions of proteins is important. Given the large number of proteins in some organisms, exploring their functions one by one through traditional experiments is impossible. Therefore, developing quick and reliable methods for identifying protein functions is necessary. The considerable accumulation of protein knowledge and recent developments in computer science provide an alternative way to complete this task, namely, designing computational methods. Several efforts have been made in this field. Most previous methods have adopted protein sequence features or directly used the linkage from a protein–protein interaction (PPI) network. In this study, we proposed some novel multi-label classifiers, which adopted new embedding features to represent proteins. These features were derived from functional domains and a PPI network via word embedding and network embedding, respectively. The minimum redundancy maximum relevance method was used to assess the features, generating a feature list. Incremental feature selection, incorporating RAndom k-labELsets to construct multi-label classifiers, used this list to construct two optimum classifiers, corresponding to two key measurements: accuracy and exact match. These two classifiers had good performance, and they were superior to classifiers that used features extracted by traditional methods.
Affiliation(s)
- Hao Li
  - College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- ShiQi Zhang
  - Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
- ZhanDong Li
  - College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- Tao Huang
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
  - CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, China
6
Ilgisonis EV, Pogodin PV, Kiseleva OI, Tarbeeva SN, Ponomarenko EA. Evolution of Protein Functional Annotation: Text Mining Study. J Pers Med 2022; 12:jpm12030479. [PMID: 35330478 PMCID: PMC8952229 DOI: 10.3390/jpm12030479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 11/23/2022]
Abstract
Within the Human Proteome Project initiative framework for creating functional annotations of uPE1 proteins, the neXt-CP50 Challenge was launched in 2018. In analogy with the missing-protein challenge, each team deciphers the functional features of the proteins in the chromosome-centric mode. However, the neXt-CP50 Challenge is more complicated than the missing-protein challenge: the approaches and methods for solving the problem are clear, but neither the concept of protein function nor specific experimental and/or bioinformatics protocols have been standardized to address it. We proposed using a retrospective analysis of the key HPP repository, the neXtProt database, to identify the most frequently used experimental and bioinformatic methods for analyzing protein functions, and the dynamics of accumulation of functional annotations. It has been shown that the number of proteins with known functions is growing faster than the experimental confirmation of the existence of questionable proteins in the framework of the missing-protein challenge. At the same time, the functional annotation is based on the guilt-by-association postulate, according to which, based on large-scale AP-MS and Y2H experiments, proteins with unknown functions are most likely mapped through “handshakes” to biochemical processes.
7
Mohan HM, Trzeciakiewicz H, Pithadia A, Crowley EV, Pacitto R, Safren N, Trotter B, Zhang C, Zhou X, Zhang Y, Basrur V, Paulson HL, Sharkey LM. RTL8 promotes nuclear localization of UBQLN2 to subnuclear compartments associated with protein quality control. Cell Mol Life Sci 2022; 79:176. [PMID: 35247097 PMCID: PMC9376861 DOI: 10.1007/s00018-022-04170-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/19/2021] [Revised: 01/17/2022] [Accepted: 01/25/2022] [Indexed: 11/24/2022]
Abstract
The brain-expressed ubiquilins (UBQLNs) 1, 2 and 4 are a family of ubiquitin adaptor proteins that participate broadly in protein quality control (PQC) pathways, including the ubiquitin proteasome system (UPS). One family member, UBQLN2, has been implicated in numerous neurodegenerative diseases including ALS/FTD. UBQLN2 typically resides in the cytoplasm but in disease can translocate to the nucleus, as in Huntington's disease where it promotes the clearance of mutant Huntingtin. How UBQLN2 translocates to the nucleus and clears aberrant nuclear proteins, however, is not well understood. In a mass spectrometry screen to discover UBQLN2 interactors, we identified a family of small (13 kDa), highly homologous uncharacterized proteins, RTL8, and confirmed the interaction between UBQLN2 and RTL8 both in vitro using recombinant proteins and in vivo using mouse brain tissue. Under endogenous and overexpressed conditions, RTL8 localizes to nucleoli. When co-expressed with UBQLN2, RTL8 promotes nuclear translocation of UBQLN2. RTL8 also facilitates UBQLN2's nuclear translocation during heat shock. UBQLN2 and RTL8 colocalize within ubiquitin-enriched subnuclear structures containing PQC components. The robust effect of RTL8 on the nuclear translocation and subnuclear localization of UBQLN2 does not extend to the other brain-expressed ubiquilins, UBQLN1 and UBQLN4. Moreover, compared to UBQLN1 and UBQLN4, UBQLN2 preferentially stabilizes RTL8 levels in human cell lines and in mouse brain, supporting functional heterogeneity among UBQLNs. As a novel UBQLN2 interactor that recruits UBQLN2 to specific nuclear compartments, RTL8 may regulate UBQLN2 function in nuclear protein quality control.
Affiliation(s)
- Harihar Milaganur Mohan
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
  - Graduate Program in Cellular and Molecular Biology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Amit Pithadia
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Emily V Crowley
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Regina Pacitto
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Nathaniel Safren
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
  - Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Bryce Trotter
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Xiaogen Zhou
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Venkatesha Basrur
  - Department of Pathology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Henry L Paulson
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI, 48109-2200, USA
- Lisa M Sharkey
  - Department of Neurology, University of Michigan, Ann Arbor, MI, 48109-2200, USA
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI, 48109-2200, USA
8
Omenn GS, Lane L, Overall CM, Paik YK, Cristea IM, Corrales FJ, Lindskog C, Weintraub S, Roehrl MHA, Liu S, Bandeira N, Srivastava S, Chen YJ, Aebersold R, Moritz RL, Deutsch EW. Progress Identifying and Analyzing the Human Proteome: 2021 Metrics from the HUPO Human Proteome Project. J Proteome Res 2021; 20:5227-5240. [PMID: 34670092 PMCID: PMC9340669 DOI: 10.1021/acs.jproteome.1c00590] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 12/12/2022]
Abstract
The 2021 Metrics of the HUPO Human Proteome Project (HPP) show that protein expression has now been credibly detected (neXtProt PE1 level) for 18 357 (92.8%) of the 19 778 predicted proteins coded in the human genome, a gain of 483 since 2020 from reports throughout the world reanalyzed by the HPP. Conversely, the number of neXtProt PE2, PE3, and PE4 missing proteins has been reduced by 478 to 1421. This represents remarkable progress on the proteome parts list. The utilization of proteomics in a broad array of biological and clinical studies likewise continues to expand with many important findings and effective integration with other omics platforms. We present highlights from the Immunopeptidomics, Glycoproteomics, Infectious Disease, Cardiovascular, Musculo-Skeletal, Liver, and Cancers B/D-HPP teams and from the Knowledgebase, Mass Spectrometry, Antibody Profiling, and Pathology resource pillars, as well as ethical considerations important to the clinical utilization of proteomics and protein biomarkers.
Affiliation(s)
- Gilbert S Omenn
  - University of Michigan, Ann Arbor, Michigan 48109, United States
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Lydie Lane
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Young-Ki Paik
  - Yonsei Proteome Research Center and Yonsei University, Seoul 03722, Korea
- Ileana M Cristea
  - Princeton University, Princeton, New Jersey 08544, United States
- Susan Weintraub
  - University of Texas Health, San Antonio, San Antonio, Texas 78229-3900, United States
- Michael H A Roehrl
  - Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Siqi Liu
  - BGI Group, Shenzhen 518083, China
- Nuno Bandeira
  - University of California, San Diego, La Jolla, California 92093, United States
- Yu-Ju Chen
  - National Taiwan University, Academia Sinica, Nankang, Taipei 11529, Taiwan
- Ruedi Aebersold
  - ETH-Zurich and University of Zurich, 8092 Zurich, Switzerland
- Robert L Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Eric W Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
9
Yang Y, Hwang H, Im JE, Lee K, Bhoo SH, Yoo JS, Kim YH, Kim JY. Flashlight into the Function of Unannotated C11orf52 using Affinity Purification Mass Spectrometry. J Proteome Res 2021; 20:5340-5346. [PMID: 34739247 DOI: 10.1021/acs.jproteome.1c00540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
For an enhanced understanding of the biological mechanisms of human disease, it is essential to investigate protein functions. In a previous study, we developed a method for predicting gene ontology (GO) terms from I-TASSER/COFACTOR results and applied it to uPE1 proteins on chromosome 11. Here, to validate the bioinformatics prediction for C11orf52, we utilized affinity purification and mass spectrometry to identify interacting partners of C11orf52. Using immunoprecipitation methods with three different peptide tags (Myc, Flag, and 2B8) in HEK 293T cell lines, we identified 79 candidate proteins that are expected to interact with C11orf52. A pathway analysis of the candidate proteins using the GO and STRING databases showed that C11orf52 could be related to signaling receptor binding, cell-cell adhesion, and ribosome biogenesis. We then selected three candidate partners, DSG1, JUP, and PTPN11, for verification of the interaction with C11orf52 and confirmed their colocalization at the cell-cell junctions by coimmunofluorescence experiments. On the basis of this study, we expect that C11orf52 is related to the Wnt signaling pathway via DSG1 from the protein-protein interactions, given the results of a comprehensive analysis of the bioinformatic predictions. The data set is available at the ProteomeXchange consortium via the PRIDE repository (PXD026986).
Affiliation(s)
- Yeji Yang
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Heeyoun Hwang
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Ji Eun Im
  - Division of Convergence Technology, Research Institute of National Cancer Center, Goyang 10408, Republic of Korea
- Kyungha Lee
  - Graduate School of Biotechnology, Kyung Hee University, Yongin 17104, Republic of Korea
- Seong Hee Bhoo
  - Graduate School of Biotechnology, Kyung Hee University, Yongin 17104, Republic of Korea
- Jong Shin Yoo
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
  - Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
- Yun-Hee Kim
  - Division of Convergence Technology, Research Institute of National Cancer Center, Goyang 10408, Republic of Korea
  - Department of Cancer Biomedical Science, The National Cancer Center Graduate School of Cancer Science and Policy, Goyang 10408, Republic of Korea
- Jin Young Kim
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
10
Duek P, Mary C, Zahn-Zabal M, Bairoch A, Lane L. Functionathon: a manual data mining workflow to generate functional hypotheses for uncharacterized human proteins and its application by undergraduate students. Database (Oxford) 2021; 2021:baab046. [PMID: 34318869 PMCID: PMC8317215 DOI: 10.1093/database/baab046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/07/2020] [Revised: 07/06/2021] [Accepted: 07/12/2021] [Indexed: 12/11/2022]
Abstract
About 10% of human proteins have no annotated function in protein knowledge bases. A workflow to generate hypotheses for the function of these uncharacterized proteins has been developed, based on predicted and experimental information on protein properties, interactions, tissue expression, subcellular localization, conservation in other organisms, as well as phenotypic data in mutant model organisms. This workflow has been applied to seven uncharacterized human proteins (C6orf118, C7orf25, CXorf58, RSRP1, SMLR1, TMEM53 and TMEM232) as part of a course-based undergraduate research experience named Functionathon, organized at the University of Geneva to teach undergraduate students how to use biological databases and bioinformatics tools and interpret the results. C6orf118, CXorf58 and TMEM232 were proposed to be involved in cilia-related functions; TMEM53 and SMLR1 were proposed to be involved in lipid metabolism; and C7orf25 and RSRP1 were proposed to be involved in RNA metabolism and gene expression. Experimental strategies to test these hypotheses were also discussed. The results of this manual data mining study may contribute to the project recently launched by the Human Proteome Organization (HUPO) Human Proteome Project aiming to fill gaps in the functional annotation of human proteins. Database URL: http://www.nextprot.org.
Affiliation(s)
- Paula Duek
  - CALIPHO group, SIB Swiss Institute of Bioinformatics
  - Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
- Camille Mary
  - Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
- Amos Bairoch
  - CALIPHO group, SIB Swiss Institute of Bioinformatics
  - Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
- Lydie Lane
  - CALIPHO group, SIB Swiss Institute of Bioinformatics
  - Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
11
Omenn GS. Reflections on the HUPO Human Proteome Project, the Flagship Project of the Human Proteome Organization, at 10 Years. Mol Cell Proteomics 2021; 20:100062. [PMID: 33640492 PMCID: PMC8058560 DOI: 10.1016/j.mcpro.2021.100062] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/12/2020] [Revised: 02/04/2021] [Accepted: 02/05/2021] [Indexed: 02/08/2023]
Abstract
We celebrate the 10th anniversary of the launch of the HUPO Human Proteome Project (HPP) and its major milestone of confident detection of at least one protein from each of 90% of the predicted protein-coding genes, based on the output of the entire proteomics community. The Human Genome Project reached a similar decadal milestone 20 years ago. The HPP has engaged proteomics teams around the world, strongly influenced data-sharing, enhanced quality assurance, and issued stringent guidelines for claims of detecting previously "missing proteins." This invited perspective complements papers on "A High-Stringency Blueprint of the Human Proteome" and "The Human Proteome Reaches a Major Milestone" in special issues of Nature Communications and Journal of Proteome Research, respectively, released in conjunction with the October 2020 virtual HUPO Congress and its celebration of the 10th anniversary of the HUPO HPP.
Affiliation(s)
- Gilbert S Omenn
- University of Michigan Medical School, Departments of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, Ann Arbor, Michigan, USA.
12
Zhang C, Zheng W, Cheng M, Omenn GS, Freddolino PL, Zhang Y. Functions of Essential Genes and a Scale-Free Protein Interaction Network Revealed by Structure-Based Function and Interaction Prediction for a Minimal Genome. J Proteome Res 2021; 20:1178-1189. [PMID: 33393786 PMCID: PMC7867644 DOI: 10.1021/acs.jproteome.0c00359] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/11/2022]
Abstract
When the JCVI-syn3.0 genome was designed and implemented in 2016 as the minimal genome of a free-living organism, approximately one-third of the 438 protein-coding genes had no known function. Subsequent refinement into JCVI-syn3A led to inclusion of 16 additional protein-coding genes, including several unknown functions, resulting in an improved growth phenotype. Here, we seek to unveil the biological roles and protein-protein interaction (PPI) networks for these poorly characterized proteins using state-of-the-art deep learning contact-assisted structure prediction, followed by structure-based annotation of functions and PPI predictions. Our pipeline is able to confidently assign functions for many previously unannotated proteins such as putative vitamin transporters, which suggest the importance of nutrient uptake even in a minimized genome. Remarkably, despite the artificial selection of genes in the minimal syn3 genome, our reconstructed PPI network still shows a power law distribution of node degrees typical of naturally evolved bacterial PPI networks. Making use of our framework for combined structure/function/interaction modeling, we are able to identify both fundamental aspects of network biology that are retained in a minimal proteome and additional essential functions not yet recognized among the poorly annotated components of the syn3.0 and syn3A proteomes.
Affiliation(s)
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Micah Cheng
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
- Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Departments of Internal Medicine and Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, United States
- Peter L Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
13
Omenn GS, Lane L, Overall CM, Cristea IM, Corrales FJ, Lindskog C, Paik YK, Van Eyk JE, Liu S, Pennington SR, Snyder MP, Baker MS, Bandeira N, Aebersold R, Moritz RL, Deutsch EW. Research on the Human Proteome Reaches a Major Milestone: >90% of Predicted Human Proteins Now Credibly Detected, According to the HUPO Human Proteome Project. J Proteome Res 2020; 19:4735-4746. [PMID: 32931287 PMCID: PMC7718309 DOI: 10.1021/acs.jproteome.0c00485] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Indexed: 12/11/2022]
Abstract
According to the 2020 Metrics of the HUPO Human Proteome Project (HPP), expression has now been detected at the protein level for >90% of the 19 773 predicted proteins coded in the human genome. The HPP annually reports on progress made throughout the world toward credibly identifying and characterizing the complete human protein parts list and promoting proteomics as an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2020-01 classified 17 874 proteins as PE1, having strong protein-level evidence, up 180 from 17 694 one year earlier. These represent 90.4% of the 19 773 predicted coding genes (all PE1,2,3,4 proteins in neXtProt). Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), was reduced by 230 from 2129 to 1899 since the neXtProt 2019-01 release. PeptideAtlas is the primary source of uniform reanalysis of raw mass spectrometry data for neXtProt, supplemented this year with extensive data from MassIVE. PeptideAtlas 2020-01 added 362 canonical proteins between 2019 and 2020 and MassIVE contributed 84 more, many of which converted PE1 entries based on non-MS evidence to the MS-based subgroup. The 19 Biology and Disease-driven B/D-HPP teams continue to pursue the identification of driver proteins that underlie disease states, the characterization of regulatory mechanisms controlling the functions of these proteins, their proteoforms, and their interactions, and the progression of transitions from correlation to coexpression to causal networks after system perturbations. And the Human Protein Atlas published Blood, Brain, and Metabolic Atlases.
Affiliation(s)
- Gilbert S Omenn
- University of Michigan, Ann Arbor, Michigan 48109, United States
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Ileana M Cristea
- Princeton University, Princeton, New Jersey 08544, United States
- Siqi Liu
- BGI Group, Shenzhen 518083, China
- Mark S Baker
- Macquarie University, Macquarie Park, NSW 2109, Australia
- Nuno Bandeira
- University of California, San Diego, La Jolla, California 92093, United States
- Ruedi Aebersold
- ETH-Zurich and University of Zurich, 8092 Zurich, Switzerland
- Robert L Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
14
Affiliation(s)
- Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology, Faculty of Dentistry, The University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada
15
Hwang H, Im JE, Yang Y, Kim H, Kwon KH, Kim YH, Kim JY, Yoo JS. Bioinformatic Prediction of Gene Ontology Terms of Uncharacterized Proteins from Chromosome 11. J Proteome Res 2020; 19:4907-4912. [PMID: 33089979 DOI: 10.1021/acs.jproteome.0c00482] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/18/2023]
Abstract
On chromosome 11, 71 of the 1254 proteins remain functionally uncharacterized despite protein-level existence evidence (uPE1s) according to the latest version of neXtProt (release 2020-01-17). Because in vivo and in vitro experimental strategies are often time-consuming and labor-intensive, there is a need for a bioinformatics tool to predict functional annotation. Here, we used I-TASSER/COFACTOR provided on the neXtProt web site, which predicts gene ontology (GO) terms based on the 3D structure of the protein. I-TASSER/COFACTOR predicted 2413 GO terms with a benchmark dataset of the 22 proteins belonging to PE1 of chromosome 11. In this study, we developed a filtering algorithm in order to select specific GO terms using the GO map generated by I-TASSER/COFACTOR. As a result, 187 specific GO terms showed a higher average precision-recall score at the least cellular component term compared to 2413 predicted GO terms. Next, we applied the filter to 65 proteins belonging to uPE1s of chromosome 11, after which 409 out of 6684 GO terms survived, including 103 molecular function and 142 biological process GO terms. As representative cases, the cellular component GO terms of CCDC90B, C11orf52, and SMAP were predicted and validated using overexpression in 293T cells and immunofluorescence staining. We will further study their biological and molecular functions toward the goal of the neXt-CP50 project as a part of C-HPP. We shared all results and programs on GitHub (https://github.com/heeyounh/I-TASSER-COFACTOR-filtering.git).
Affiliation(s)
- Heeyoun Hwang
- Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Ji Eun Im
- Division of Convergence Technology, Research Institute of National Cancer Center, Goyang 10408, Republic of Korea
- Yeji Yang
- Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Hyejin Kim
- Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
- Kyung-Hoon Kwon
- Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Yun-Hee Kim
- Division of Convergence Technology, Research Institute of National Cancer Center, Goyang 10408, Republic of Korea
- Department of Cancer Biomedical Science, The National Cancer Center Graduate School of Cancer Science and Policy, Goyang 10408, Republic of Korea
- Jin Young Kim
- Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Jong Shin Yoo
- Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
16
González-Gomariz J, Serrano G, Tilve-Álvarez CM, Corrales FJ, Guruceaga E, Segura V. UPEFinder: A Bioinformatic Tool for the Study of Uncharacterized Proteins Based on Gene Expression Correlation and the PageRank Algorithm. J Proteome Res 2020; 19:4795-4807. [PMID: 33155801 DOI: 10.1021/acs.jproteome.0c00364] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]
Abstract
The Human Proteome Project (HPP) is leading the international effort to characterize the human proteome. Although the main goal of this project was first focused on the detection of missing proteins, a new challenge arose from the need to assign biological functions to the uncharacterized human proteins and describe their implications in human diseases. Not only the proteins with experimental evidence (uPE1 proteins) but also the uncharacterized missing proteins (uMPs) were the objects of study in this challenge, neXt-CP50. In this work, we developed a new bioinformatic approach to infer biological annotations for the uPE1 proteins and uMPs based on a "guilt-by-association" analysis using public RNA-Seq data sets. We used the correlation of these proteins with the well-characterized PE1 proteins to construct a network. In this way, we applied the PageRank algorithm to this network to identify the most relevant nodes, which were the biological annotations of the uncharacterized proteins. All of the generated information was stored in a database. In addition, we implemented the web application UPEFinder (https://upefinder.proteored.org) to facilitate the access to this new resource. This information is especially relevant for the researchers of the HPP who are interested in the generation and validation of new hypotheses about the functions of these proteins. Both the database and the web application are publicly available (https://github.com/ubioinformat/UPEfinder).
Affiliation(s)
- Guillermo Serrano
- Bioinformatics Platform, CIMA University of Navarra, Pamplona E-31008, Spain
- Carlos M Tilve-Álvarez
- Fundación Profesor Nóvoa-Santos, Instituto de Investigación Biomédica da Coruña, Coruña E-15006, Spain
- Fernando J Corrales
- Proteomics Unit, National Center for Biotechnology, CSIC, Madrid E-28049, Spain
- Elizabeth Guruceaga
- IdiSNA, Navarra Institute for Health Research, Pamplona E-31008, Spain
- Bioinformatics Platform, CIMA University of Navarra, Pamplona E-31008, Spain
- Victor Segura
- Tracasa Instrumental, Sarriguren E-31621, Spain
- Sección de Ingeniería del Dato, Dirección General de Telecomunicaciones y Digitalización, Gobierno de Navarra, Sarriguren E-31621, Spain
17
Adhikari S, Nice EC, Deutsch EW, Lane L, Omenn GS, Pennington SR, Paik YK, Overall CM, Corrales FJ, Cristea IM, Van Eyk JE, Uhlén M, Lindskog C, Chan DW, Bairoch A, Waddington JC, Justice JL, LaBaer J, Rodriguez H, He F, Kostrzewa M, Ping P, Gundry RL, Stewart P, Srivastava S, Srivastava S, Nogueira FCS, Domont GB, Vandenbrouck Y, Lam MPY, Wennersten S, Vizcaino JA, Wilkins M, Schwenk JM, Lundberg E, Bandeira N, Marko-Varga G, Weintraub ST, Pineau C, Kusebauch U, Moritz RL, Ahn SB, Palmblad M, Snyder MP, Aebersold R, Baker MS. A high-stringency blueprint of the human proteome. Nat Commun 2020; 11:5301. [PMID: 33067450 PMCID: PMC7568584 DOI: 10.1038/s41467-020-19045-9] [Citation(s) in RCA: 152] [Impact Index Per Article: 30.4] [Received: 12/19/2019] [Accepted: 09/25/2020] [Indexed: 02/07/2023]
Abstract
The Human Proteome Organization (HUPO) launched the Human Proteome Project (HPP) in 2010, creating an international framework for global collaboration, data sharing, quality assurance and enhancing accurate annotation of the genome-encoded proteome. During the subsequent decade, the HPP established collaborations, developed guidelines and metrics, and undertook reanalysis of previously deposited community data, continuously increasing the coverage of the human proteome. On the occasion of the HPP's tenth anniversary, we here report a 90.4% complete high-stringency human proteome blueprint. This knowledge is essential for discerning molecular processes in health and disease, as we demonstrate by highlighting potential roles the human proteome plays in our understanding, diagnosis and treatment of cancers, cardiovascular and infectious diseases.
Grants
- WT101477MA Wellcome Trust
- R24 GM127667 NIGMS NIH HHS
- U24 CA210985 NCI NIH HHS
- U19 AG023122 NIA NIH HHS
- U24 CA210967 NCI NIH HHS
- R01 GM087221 NIGMS NIH HHS
- R01 GM114141 NIGMS NIH HHS
- U24 CA115102 NCI NIH HHS
- P30 ES017885 NIEHS NIH HHS
- R01 HL111362 NHLBI NIH HHS
- Wellcome Trust
- 208391/Z/17/Z Wellcome Trust
- International Macquarie Research Excellence Scholarship
- NHMRC 1010303 (MSB, ECN); Cancer Council NSW RG19-04 (MSB, SBA, ECN); Cancer Institute NSW Fellowship 15/ECF/1-38 (SBA), Sydney Vital CINSW Translational Cancer Research Centre grant (MSB, SBA, SA), “Fight on the Beaches” (MSB, SBA, ECN, SA)
- Department of Health | National Health and Medical Research Council (NHMRC)
- Cancer Institute NSW (Cancer Institute New South Wales)
- “Fight on the Beaches” research grant
Affiliation(s)
- Subash Adhikari
- Faculty of Medicine, Health and Human Sciences, Department of Biomedical Sciences, Macquarie University, North Ryde, NSW, 2109, Australia
- Edouard C Nice
- Faculty of Medicine, Health and Human Sciences, Department of Biomedical Sciences, Macquarie University, North Ryde, NSW, 2109, Australia
- Faculty of Medicine, Nursing and Health Sciences, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC, 3800, Australia
- Eric W Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA, 98109, USA
- Lydie Lane
- Faculty of Medicine, SIB-Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, University of Geneva, CMU, Michel-Servet 1, 1211, Geneva, Switzerland
- Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
- Stephen R Pennington
- UCD Conway Institute of Biomolecular and Biomedical Research, School of Medicine, University College Dublin, Dublin, Ireland
- Young-Ki Paik
- Yonsei Proteome Research Center, 50 Yonsei-ro, Sudaemoon-ku, Seoul, 120-749, South Korea
- Fernando J Corrales
- Functional Proteomics Laboratory, Centro Nacional de Biotecnología-CSIC, Proteored-ISCIII, 28049, Madrid, Spain
- Ileana M Cristea
- Department of Molecular Biology, Princeton University, Princeton, NJ, 08544, USA
- Jennifer E Van Eyk
- Cedars Sinai Medical Center, Advanced Clinical Biosystems Research Institute, The Smidt Heart Institute, Los Angeles, CA, 90048, USA
- Mathias Uhlén
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, 17121, Solna, Sweden
- Cecilia Lindskog
- Rudbeck Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, 75185, Uppsala, Sweden
- Daniel W Chan
- Department of Pathology and Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, 21224, USA
- Amos Bairoch
- Faculty of Medicine, SIB-Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, University of Geneva, CMU, Michel-Servet 1, 1211, Geneva, Switzerland
- James C Waddington
- UCD Conway Institute of Biomolecular and Biomedical Research, School of Medicine, University College Dublin, Dublin, Ireland
- Joshua L Justice
- Department of Molecular Biology, Princeton University, Princeton, NJ, 08544, USA
- Joshua LaBaer
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Henry Rodriguez
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, NIH, Bethesda, MD, 20892, USA
- Fuchu He
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China
- Markus Kostrzewa
- Bruker Daltonik GmbH, Microbiology and Diagnostics, Fahrenheitstrasse 4, 28359, Bremen, Germany
- Peipei Ping
- Cardiac Proteomics and Signaling Laboratory, Department of Physiology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Rebekah L Gundry
- CardiOmics Program, Center for Heart and Vascular Research, Division of Cardiovascular Medicine and Department of Cellular and Integrative Physiology, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Peter Stewart
- Department of Chemical Pathology, Royal Prince Alfred Hospital, Camperdown, NSW, Australia
- Sudhir Srivastava
- Cancer Biomarkers Research Branch, National Cancer Institute, National Institutes of Health, 9609 Medical Center Drive, Suite 5E136, Rockville, MD, 20852, USA
- Fabio C S Nogueira
- Proteomics Unit and Laboratory of Proteomics, Institute of Chemistry, Federal University of Rio de Janeiro, Av Athos da Silveira Ramos, 149, 21941-909, Rio de Janeiro, RJ, Brazil
- Gilberto B Domont
- Proteomics Unit and Laboratory of Proteomics, Institute of Chemistry, Federal University of Rio de Janeiro, Av Athos da Silveira Ramos, 149, 21941-909, Rio de Janeiro, RJ, Brazil
- Yves Vandenbrouck
- University of Grenoble Alpes, Inserm, CEA, IRIG-BGE, U1038, 38000, Grenoble, France
- Maggie P Y Lam
- Departments of Medicine-Cardiology and Biochemistry and Molecular Genetics, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Consortium for Fibrosis Research and Translation, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Sara Wennersten
- Division of Cardiology, Department of Medicine, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Juan Antonio Vizcaino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Marc Wilkins
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Jochen M Schwenk
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, 17121, Solna, Sweden
- Emma Lundberg
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, 17121, Solna, Sweden
- Nuno Bandeira
- Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, Mail Code 0404, La Jolla, CA, 92093-0404, USA
- Susan T Weintraub
- Department of Biochemistry and Structural Biology, University of Texas Health Science Center San Antonio, UT Health, 7703 Floyd Curl Drive, San Antonio, TX, 78229-3900, USA
- Charles Pineau
- University of Rennes, Inserm, EHESP, IREST, UMR_S 1085, F-35042, Rennes, France
- Ulrike Kusebauch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA, 98109, USA
- Robert L Moritz
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA, 98109, USA
- Seong Beom Ahn
- Faculty of Medicine, Health and Human Sciences, Department of Biomedical Sciences, Macquarie University, North Ryde, NSW, 2109, Australia
- Magnus Palmblad
- Leiden University Medical Center, Leiden, 2333, The Netherlands
- Michael P Snyder
- Department of Genetics, Stanford School of Medicine, Stanford, CA, 94305, USA
- Ruedi Aebersold
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA, 98109, USA
- Faculty of Science, University of Zurich, Zurich, Switzerland
- Mark S Baker
- Faculty of Medicine, Health and Human Sciences, Department of Biomedical Sciences, Macquarie University, North Ryde, NSW, 2109, Australia
- Department of Genetics, Stanford School of Medicine, Stanford, CA, 94305, USA
18
Poverennaya E, Kiseleva O, Romanova A, Pyatnitskiy M. Predicting Functions of Uncharacterized Human Proteins: From Canonical to Proteoforms. Genes (Basel) 2020; 11:E677. [PMID: 32575886 PMCID: PMC7350264 DOI: 10.3390/genes11060677] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/30/2020] [Revised: 06/09/2020] [Accepted: 06/19/2020] [Indexed: 01/22/2023]
Abstract
Despite tremendous efforts in the genomics, transcriptomics, and proteomics communities, there are still no comprehensive data on the exact number of protein-coding genes, translated proteoforms, and their functions. In addition, we still lack functional annotation for 1193 genes whose expression has been confirmed at the proteomic level (uPE1 proteins). We re-analyzed results of AP-MS experiments from the BioPlex 2.0 database to predict functions of uPE1 proteins and their splice forms. By building a protein-protein interaction network for 12 thousand identified proteins encoded by 11 thousand genes, we were able to predict Gene Ontology categories for a total of 387 uPE1 genes. We predicted different functions for canonical and alternatively spliced forms for four uPE1 genes. In total, functional differences were revealed for 62 proteoforms encoded by 31 genes. Based on these results, it can be cautiously concluded that the dynamics and versatility of the interactome are ensured by changing the dominant splice form. Overall, we propose that analysis of large-scale AP-MS experiments performed for various cell lines and under various conditions is a key to understanding the full potential of genes' roles in cellular processes.
Affiliation(s)
- Ekaterina Poverennaya
- Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
- Institute of Environmental and Agricultural Biology (X-BIO), Tyumen State University, 625003 Tyumen, Russia
- Olga Kiseleva
- Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
- Anastasia Romanova
- Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
- Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, 141701 Moscow, Russia
- Mikhail Pyatnitskiy
- Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
- Department of Molecular Biology and Genetics, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
19
Affiliation(s)
- Monique Zahn-Zabal
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, Geneva, Switzerland
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, Geneva, Switzerland
20
Omenn GS, Lane L, Overall CM, Corrales FJ, Schwenk JM, Paik YK, Van Eyk JE, Liu S, Pennington S, Snyder MP, Baker MS, Deutsch EW. Progress on Identifying and Characterizing the Human Proteome: 2019 Metrics from the HUPO Human Proteome Project. J Proteome Res 2019; 18:4098-4107. [PMID: 31430157 PMCID: PMC6898754 DOI: 10.1021/acs.jproteome.9b00434] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 01/17/2023]
Abstract
The Human Proteome Project (HPP) annually reports on progress made throughout the field in credibly identifying and characterizing the complete human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2019-01-11 contains 17 694 proteins with strong protein-level evidence (PE1), compliant with HPP Guidelines for Interpretation of MS Data v2.1; these represent 89% of all 19 823 neXtProt predicted coding genes (all PE1,2,3,4 proteins), up from 17 470 one year earlier. Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), has been reduced from 2949 to 2129 since 2016 through efforts throughout the community, including the chromosome-centric HPP. PeptideAtlas is the source of uniformly reanalyzed raw mass spectrometry data for neXtProt; PeptideAtlas added 495 canonical proteins between 2018 and 2019, especially from studies designed to detect hard-to-identify proteins. Meanwhile, the Human Protein Atlas has released version 18.1 with immunohistochemical evidence of expression of 17 000 proteins and survival plots as part of the Pathology Atlas. Many investigators apply multiplexed SRM-targeted proteomics for quantitation of organ-specific popular proteins in studies of various human diseases. The 19 teams of the Biology and Disease-driven B/D-HPP published a total of 160 publications in 2018, bringing proteomics to a broad array of biomedical research.
Affiliation(s)
- Gilbert S. Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland
- Christopher M. Overall
- Life Sciences Institute, Faculty of Dentistry, University of British Columbia, 2350 Health Sciences Mall, Room 4.401, Vancouver, British Columbia V6T 1Z3, Canada
- Jochen M. Schwenk
- Science for Life Laboratory, KTH Royal Institute of Technology, Tomtebodavägen 23A, 17165 Solna, Sweden
- Young-Ki Paik
- Yonsei Proteome Research Center, Yonsei University, Room 425, Building #114, 50 Yonsei-ro, Seodaemoon-ku, Seoul 120-749, South Korea
- Jennifer E. Van Eyk
- Advanced Clinical BioSystems Research Institute, Cedars Sinai Precision Biomarker Laboratories, Barbra Streisand Women’s Heart Center, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Siqi Liu
- BGI Group-Shenzhen, Yantian District, Shenzhen 518083, China
- Stephen Pennington
- School of Medicine, University College Dublin, Conway Institute Belfield, Dublin 4, Ireland
- Michael P. Snyder
- Department of Genetics, Stanford University, Alway Building, 300 Pasteur Drive and 3165 Porter Drive, Palo Alto, California 94304, United States
- Mark S. Baker
- Department of Biomedical Sciences, Faculty of Medicine & Health Sciences, Macquarie University, 75 Talavera Road, North Ryde, NSW 2109, Australia
- Eric W. Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States