1
Vašíček J, Kuznetsova KG, Skiadopoulou D, Unger L, Chera S, Ghila LM, Bandeira N, Njølstad PR, Johansson S, Bruckner S, Käll L, Vaudel M. ProHap enables human proteomic database generation accounting for population diversity. Nat Methods 2025; 22:273-277. [PMID: 39653819 DOI: 10.1038/s41592-024-02506-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 10/10/2024] [Indexed: 02/12/2025]
Abstract
Amid the advances in genomics, the availability of large reference panels of human haplotypes is key to account for human diversity within and across populations. However, mass spectrometry-based proteomics does not benefit from this information. To address this gap, we introduce ProHap, a Python-based tool that constructs protein sequence databases from phased genotypes of reference panels. ProHap enables researchers to account for haplotype diversity in proteomic searches.
Affiliation(s)
- Jakub Vašíček
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Ksenia G Kuznetsova
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Dafni Skiadopoulou
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Lucas Unger
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
- Simona Chera
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
- Luiza M Ghila
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
- Nuno Bandeira
  - University of California, San Diego, La Jolla, CA, USA
- Pål R Njølstad
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Children and Youth Clinic, Haukeland University Hospital, Bergen, Norway
- Stefan Johansson
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Department of Medical Genetics, Haukeland University Hospital, Bergen, Norway
- Stefan Bruckner
  - Institute for Visual and Analytic Computing, University of Rostock, Rostock, Germany
- Lukas Käll
  - Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH - Royal Institute of Technology, Stockholm, Sweden
- Marc Vaudel
  - Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
  - Department of Genetics and Bioinformatics, Health Data and Digitalization, Norwegian Institute of Public Health, Oslo, Norway
2
Cao X, Sun S, Xing J. A Massive Proteogenomic Screen Identifies Thousands of Novel Peptides From the Human "Dark" Proteome. Mol Cell Proteomics 2024; 23:100719. [PMID: 38242438 PMCID: PMC10867589 DOI: 10.1016/j.mcpro.2024.100719] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/02/2023] [Revised: 01/01/2024] [Accepted: 01/16/2024] [Indexed: 01/21/2024]
Abstract
Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.
Affiliation(s)
- Xiaolong Cao
  - Department of Anesthesiology, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Siqi Sun
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Jinchuan Xing
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
3
ElAbd H, Franke A. Mass Spectrometry-Based Immunopeptidomics of Peptides Presented on Human Leukocyte Antigen Proteins. Methods Mol Biol 2024; 2758:425-443. [PMID: 38549028 DOI: 10.1007/978-1-0716-3646-6_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
Human leukocyte antigen (HLA) proteins are a group of glycoproteins that are expressed at the cell surface, where they present peptides to T cells through physical interactions with T-cell receptors (TCRs). Hence, characterizing the set of peptides presented by HLA proteins, referred to hereafter as the immunopeptidome, is fundamental for neoantigen identification, immunotherapy, and vaccine development. As a result, different methods have been used over the years to identify peptides presented by HLA proteins, including competition assays, peptide microarrays, and yeast display systems. Nonetheless, over the last decade, mass spectrometry-based immunopeptidomics (MS-immunopeptidomics) has emerged as the gold-standard method for identifying peptides presented by HLA proteins. MS-immunopeptidomics enables the direct identification of the immunopeptidome in different tissues and cell types in different physiological and pathological states, for example, solid tumors or virally infected cells. Despite its advantages, it is still an experimentally and computationally challenging technique with different aspects that need to be considered before planning an MS-immunopeptidomics experiment, while conducting the experiment and with analyzing and interpreting the results. Hence, we aim in this chapter to provide an overview of this method and discuss different practical considerations at different stages starting from sample collection until data analysis. These points should aid different groups aiming at utilizing MS-immunopeptidomics, as well as, identifying future research directions to improve the method.
Affiliation(s)
- Hesham ElAbd
  - Institute of Clinical Molecular Biology, University of Kiel, Kiel, Germany
- Andre Franke
  - Institute of Clinical Molecular Biology, University of Kiel, Kiel, Germany
4
Reis-de-Oliveira G, Smith BJ, Martins-de-Souza D. Postmortem Brains: What Can Proteomics Tell us About the Sources of Schizophrenia? Advances in Experimental Medicine and Biology 2022; 1400:1-13. [DOI: 10.1007/978-3-030-97182-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]