1. Aparicio B, Theunissen P, Hervas-Stubbs S, Fortes P, Sarobe P. Relevance of mutation-derived neoantigens and non-classical antigens for anticancer therapies. Hum Vaccin Immunother 2024; 20:2303799. [PMID: 38346926 PMCID: PMC10863374 DOI: 10.1080/21645515.2024.2303799] [Received: 09/29/2023] [Accepted: 01/06/2024] [Indexed: 02/15/2024] [Open access]
Abstract
The efficacy of cancer immunotherapies relies on correct recognition of tumor antigens by lymphocytes, thus eliciting functional responses capable of eliminating tumor cells. Therefore, important efforts have been made in antigen identification, with the aim of understanding mechanisms of response to immunotherapy and designing safer and more efficient strategies. In addition to the classical tumor-associated antigens identified during the last decades, implementation of next-generation sequencing methodologies is enabling the identification of neoantigens (neoAgs) arising from mutations, leading to the development of new neoAg-directed therapies. Moreover, there are numerous non-classical tumor antigens originating from other sources and identified by new methodologies. Here, we review the relevance of neoAgs in different immunotherapies and the results obtained by applying neoAg-based strategies. In addition, the different types of non-classical tumor antigens and the best approaches for their identification are described. This will help to increase the spectrum of targetable molecules useful in cancer immunotherapies.
Affiliation(s)
- Belen Aparicio
  - Program of Immunology and Immunotherapy, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
- Patrick Theunissen
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
  - DNA and RNA Medicine Division, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain
- Sandra Hervas-Stubbs
  - Program of Immunology and Immunotherapy, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
- Puri Fortes
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain
  - DNA and RNA Medicine Division, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain
  - Spanish Network for Advanced Therapies (TERAV ISCIII), Spain
- Pablo Sarobe
  - Program of Immunology and Immunotherapy, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain
  - Cancer Center Clinica Universidad de Navarra (CCUN), Pamplona, Spain
  - Navarra Institute for Health Research (IDISNA), Pamplona, Spain
  - CIBERehd, Pamplona, Spain

2. Peters-Clarke TM, Coon JJ, Riley NM. Instrumentation at the Leading Edge of Proteomics. Anal Chem 2024; 96:7976-8010. [PMID: 38738990 DOI: 10.1021/acs.analchem.3c04497] [Indexed: 05/14/2024]
Affiliation(s)
- Trenton M Peters-Clarke
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Joshua J Coon
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Morgridge Institute for Research, Madison, Wisconsin 53715, United States
- Nicholas M Riley
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States

3. Peters-Clarke TM, Liang Y, Mertz KL, Lee KW, Westphall MS, Hinkle JD, McAlister GC, Syka JEP, Kelly RT, Coon JJ. Boosting the Sensitivity of Quantitative Single-Cell Proteomics with Infrared-Tandem Mass Tags. J Proteome Res 2024. [PMID: 38713017 DOI: 10.1021/acs.jproteome.4c00076] [Indexed: 05/08/2024]
Abstract
Single-cell proteomics is a powerful approach to precisely profile protein landscapes within individual cells toward a comprehensive understanding of proteomic functions and tissue and cellular states. The inherent challenges associated with limited starting material demand heightened analytical sensitivity. Just as advances in sample preparation maximize the amount of material that makes it from the cell to the mass spectrometer, we strive to maximize the number of ions that make it from ion source to the detector. In isobaric tagging experiments, limited reporter ion generation limits quantitative accuracy and precision. The combination of infrared photoactivation and ion parking circumvents the m/z dependence inherent in HCD, maximizing reporter generation and avoiding unintended degradation of TMT reporter molecules in infrared-tandem mass tags (IR-TMT). The method was applied to single-cell human proteomes using 18-plex TMTpro, resulting in 4-5-fold increases in reporter signal compared to conventional SPS-MS3 approaches. IR-TMT enables faster duty cycles, higher throughput, and increased peptide identification and quantification. Comparative experiments showcase 4-5-fold lower injection times for IR-TMT, providing superior sensitivity without compromising accuracy. In all, IR-TMT enhances the dynamic range of proteomic experiments and is compatible with gas-phase fractionation and real-time searching, promising increased gains in the study of cellular heterogeneity.
Affiliation(s)
- Trenton M Peters-Clarke
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Yiran Liang
  - Department of Chemistry, Brigham Young University, Provo, Utah 84602, United States
- Keaton L Mertz
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Kenneth W Lee
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Michael S Westphall
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Joshua D Hinkle
  - Thermo Fisher Scientific, San Jose, California 95134, United States
- John E P Syka
  - Thermo Fisher Scientific, San Jose, California 95134, United States
- Ryan T Kelly
  - Department of Chemistry, Brigham Young University, Provo, Utah 84602, United States
- Joshua J Coon
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - National Center for Quantitative Biology of Complex Systems, Madison, Wisconsin 53706, United States
  - Morgridge Institute for Research, Madison, Wisconsin 53715, United States

4. Lundgren T, Clark PL, Champion MM. Fit for Purpose Approach To Evaluate Detection of Amino Acid Substitutions in Shotgun Proteomics. J Proteome Res 2024; 23:1263-1271. [PMID: 38478054 PMCID: PMC11003417 DOI: 10.1021/acs.jproteome.3c00730] [Received: 11/03/2023] [Revised: 02/04/2024] [Accepted: 02/27/2024] [Indexed: 04/06/2024]
Abstract
Amino acid substitutions (AASs) alter proteins from their genome-expected sequences. Accumulation of substitutions in proteins underlies numerous diseases and antibiotic mechanisms. Accurate global detection of AASs and their frequencies is crucial for understanding these mechanisms. Shotgun proteomics provides an untargeted method for measuring AASs but introduces biases when extrapolating from the genome to identify AASs. To characterize these biases, we created a "ground-truth" approach using the similarities between Escherichia coli and Salmonella typhimurium to model the complexity of AAS detection. Shotgun proteomics on mixed lysates generated libraries representing ∼100,000 peptide-spectra and 4161 peptide sequences with a single AAS and defined stoichiometry. Identifying S. typhimurium peptide-spectra with only the E. coli genome resulted in 64.1% correctly identified library peptides. Specific AASs exhibit variable identification efficiencies. There was no inherent bias from the stoichiometry of the substitutions. Short peptides and AASs localized near peptide termini had poor identification efficiency. We identify a new class of "scissor substitutions" that gain or lose protease cleavage sites. Scissor substitutions also had poor identification efficiency. This ground-truth AAS library reveals various sources of bias, which will guide the application of shotgun proteomics to validate AAS hypotheses.
Affiliation(s)
- Taylor J. Lundgren
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Patricia L. Clark
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana 46556, United States
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Matthew M. Champion
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana 46556, United States

5. Villacrés C, Mizero B, Spicer V, Viner R, Saba J, Patel B, Snovida S, Jensen P, Huhmer A, Krokhin OV. Toward an Ultimate Solution for Peptide Retention Time Prediction: The Effect of Column Temperature on Separation Selectivity. J Proteome Res 2024; 23:1488-1494. [PMID: 38530092 DOI: 10.1021/acs.jproteome.4c00018] [Indexed: 03/27/2024]
Abstract
We studied the effect of the column temperature on the selectivity of reversed-phase peptide separation in bottom-up proteomics. The number of peptide identifications from 2 h liquid chromatography with tandem mass spectrometry (LC-MS/MS) acquisitions reaches a plateau at 45-55 °C, driven simultaneously by improved separation efficiency, a gradual decrease in peptide retention, and possible on-column degradation of peptides at elevated temperatures. Performing 2D LC-MS/MS acquisitions at 25, 35, 45, and 55 °C resulted in the identification of ∼100,000 and ∼120,000 unique peptides for nonmodified and tandem mass tags (TMT)-labeled samples, respectively. These peptide collections were used to investigate the temperature-driven retention features. The latter is governed by the specific temperature response of individual residues, peptide hydrophobicity and length, and amphipathic helicity. On average, peptide retention decreased by 0.56 and 0.5% acetonitrile for each 10 °C increase for label-free and TMT-labeled peptides, respectively. This generally linear response of retention shifts allowed the extrapolation of predictive models beyond the studied temperature range. Thus, (trap) column cooling from room temperature to 0 °C will allow the retention of an additional 3% of detectable tryptic peptides. Meanwhile, the application of 90 °C would result in the loss of ∼20% of tryptic peptides that were amenable to MS/MS-based identification.
Affiliation(s)
- Carina Villacrés
  - Manitoba Centre for Proteomics and Systems Biology, Winnipeg R3E 3P4, Canada
- Benilde Mizero
  - Department of Chemistry, University of Manitoba, Winnipeg R3T 2N2, Canada
- Victor Spicer
  - Manitoba Centre for Proteomics and Systems Biology, Winnipeg R3E 3P4, Canada
- Rosa Viner
  - Thermo Fisher Scientific, San Jose, California 95134, United States
- Julian Saba
  - Thermo Fisher Scientific, San Jose, California 95134, United States
- Sergei Snovida
  - Thermo Fisher Scientific, Rockford, Illinois 61101, United States
- Penny Jensen
  - Thermo Fisher Scientific, Rockford, Illinois 61101, United States
- Andreas Huhmer
  - Thermo Fisher Scientific, San Jose, California 95134, United States
- Oleg V Krokhin
  - Manitoba Centre for Proteomics and Systems Biology, Winnipeg R3E 3P4, Canada
  - Department of Chemistry, University of Manitoba, Winnipeg R3T 2N2, Canada
  - Department of Internal Medicine, University of Manitoba, Winnipeg R3E 3P4, Canada

6. Choi S, Paek E. pXg: Comprehensive Identification of Noncanonical MHC-I-Associated Peptides From De Novo Peptide Sequencing Using RNA-Seq Reads. Mol Cell Proteomics 2024; 23:100743. [PMID: 38403075 PMCID: PMC10979277 DOI: 10.1016/j.mcpro.2024.100743] [Received: 07/12/2023] [Revised: 02/19/2024] [Accepted: 02/21/2024] [Indexed: 02/27/2024] [Open access]
Abstract
Discovering noncanonical peptides has been a common application of proteogenomics. Recent studies suggest that certain noncanonical peptides, known as noncanonical major histocompatibility complex-I (MHC-I)-associated peptides (ncMAPs), that bind to MHC-I may make good immunotherapeutic targets. De novo peptide sequencing is a great way to find ncMAPs since it can detect peptide sequences from their tandem mass spectra without using any sequence databases. However, this strategy has not been widely applied for ncMAP identification because there is not a good way to estimate its false-positive rates. In order to completely and accurately identify immunopeptides using de novo peptide sequencing, we describe a unique pipeline called proteomics X genomics. In contrast to current pipelines, it makes use of genomic data, RNA-Seq abundance and sequencing quality, in addition to proteomic features to increase the sensitivity and specificity of peptide identification. We show that the peptide-spectrum match quality and genetic traits have a clear relationship, showing that they can be utilized to evaluate peptide-spectrum matches. From 10 samples, we found 24,449 canonical MHC-I-associated peptides and 956 ncMAPs by using a target-decoy competition. Three hundred eighty-seven ncMAPs and 1611 canonical MHC-I-associated peptides were new identifications that had not yet been published. We discovered 11 ncMAPs produced from a squirrel monkey retrovirus in human cell lines in addition to the two ncMAPs originating from a complementarity determining region 3 in an antibody thanks to the unrestricted search space assumed by de novo sequencing. These entirely new identifications show that proteomics X genomics can make the most of de novo peptide sequencing's advantages and its potential use in the search for new immunotherapeutic targets.
Affiliation(s)
- Seunghyuk Choi
  - Department of Computer Science, Hanyang University, Seoul, Republic of Korea
- Eunok Paek
  - Department of Computer Science, Hanyang University, Seoul, Republic of Korea
  - Institute for Artificial Intelligence Research, Hanyang University, Seoul, Republic of Korea

7. Siraj A, Bouwmeester R, Declercq A, Welp L, Chernev A, Wulf A, Urlaub H, Martens L, Degroeve S, Kohlbacher O, Sachsenberg T. Intensity and retention time prediction improves the rescoring of protein-nucleic acid cross-links. Proteomics 2024; 24:e2300144. [PMID: 38629965 DOI: 10.1002/pmic.202300144] [Received: 05/27/2023] [Revised: 12/29/2023] [Accepted: 01/05/2024] [Indexed: 04/19/2024]
Abstract
In protein-RNA cross-linking mass spectrometry, UV or chemical cross-linking introduces stable bonds between amino acids and nucleic acids in protein-RNA complexes that are then analyzed and detected in mass spectra. This analytical tool delivers valuable information about RNA-protein interactions and RNA docking sites in proteins, both in vitro and in vivo. The identification of cross-linked peptides with oligonucleotides of different length leads to a combinatorial increase in search space. We demonstrate that the peptide retention time prediction tasks can be transferred to the task of cross-linked peptide retention time prediction using a simple amino acid composition encoding, yielding improved identification rates when the prediction error is included in rescoring. For the more challenging task of including fragment intensity prediction of cross-linked peptides in the rescoring, we obtain, on average, a similar improvement. Further improvement in the encoding and fine-tuning of retention time and intensity prediction models might lead to further gains, and merit further research.
Affiliation(s)
- Arslan Siraj
  - Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany
  - Institute for Biological and Medical Informatics, University of Tübingen, Tübingen, Germany
- Robbin Bouwmeester
  - Department of Biomolecular Medicine, Ghent University, Gent, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Gent, Belgium
- Arthur Declercq
  - Department of Biomolecular Medicine, Ghent University, Gent, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Gent, Belgium
- Luisa Welp
  - Bioanalytical Mass Spectrometry, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
  - Bioanalytics, Institute of Clinical Chemistry, University Medical Center Göttingen, Göttingen, Germany
- Aleksandar Chernev
  - Bioanalytical Mass Spectrometry, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Alexander Wulf
  - Bioanalytical Mass Spectrometry, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Henning Urlaub
  - Bioanalytical Mass Spectrometry, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
  - Bioanalytics, Institute of Clinical Chemistry, University Medical Center Göttingen, Göttingen, Germany
- Lennart Martens
  - Department of Biomolecular Medicine, Ghent University, Gent, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Gent, Belgium
- Sven Degroeve
  - Department of Biomolecular Medicine, Ghent University, Gent, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Gent, Belgium
- Oliver Kohlbacher
  - Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany
  - Institute for Biological and Medical Informatics, University of Tübingen, Tübingen, Germany
- Timo Sachsenberg
  - Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany
  - Institute for Biological and Medical Informatics, University of Tübingen, Tübingen, Germany

8. Adams C, Laukens K, Bittremieux W, Boonen K. Machine learning-based peptide-spectrum match rescoring opens up the immunopeptidome. Proteomics 2024; 24:e2300336. [PMID: 38009585 DOI: 10.1002/pmic.202300336] [Received: 09/06/2023] [Revised: 10/18/2023] [Accepted: 10/23/2023] [Indexed: 11/29/2023]
Abstract
Immunopeptidomics is a key technology in the discovery of targets for immunotherapy and vaccine development. However, identifying immunopeptides remains challenging due to their non-tryptic nature, which results in distinct spectral characteristics. Moreover, the absence of strict digestion rules leads to extensive search spaces, further amplified by the incorporation of somatic mutations, pathogen genomes, unannotated open reading frames, and post-translational modifications. This inflation in search space leads to an increase in random high-scoring matches, resulting in fewer identifications at a given false discovery rate. Peptide-spectrum match rescoring has emerged as a machine learning-based solution to address challenges in mass spectrometry-based immunopeptidomics data analysis. It involves post-processing unfiltered spectrum annotations to better distinguish between correct and incorrect peptide-spectrum matches. Recently, features based on predicted peptidoform properties, including fragment ion intensities, retention time, and collisional cross section, have been used to improve the accuracy and sensitivity of immunopeptide identification. In this review, we describe the diverse bioinformatics pipelines that are currently available for peptide-spectrum match rescoring and discuss how they can be used for the analysis of immunopeptidomics data. Finally, we provide insights into current and future machine learning solutions to boost immunopeptide identification.
Affiliation(s)
- Charlotte Adams
  - Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
  - Laboratory of Protein Science, Proteomics and Epigenetic Signaling (PPES), Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Kris Laukens
  - Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Wout Bittremieux
  - Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Kurt Boonen
  - Laboratory of Protein Science, Proteomics and Epigenetic Signaling (PPES), Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
  - ImmuneSpec BV, Niel, Belgium

9. Yang Y, Fang Q. Prediction of glycopeptide fragment mass spectra by deep learning. Nat Commun 2024; 15:2448. [PMID: 38503734 PMCID: PMC10951270 DOI: 10.1038/s41467-024-46771-1] [Received: 09/13/2023] [Accepted: 03/11/2024] [Indexed: 03/21/2024] [Open access]
Abstract
Deep learning has achieved notable success in mass spectrometry-based proteomics and is now emerging in glycoproteomics. While various deep learning models can predict fragment mass spectra of peptides with good accuracy, they cannot cope with the non-linear glycan structure in an intact glycopeptide. Herein, we present DeepGlyco, a deep learning-based approach for the prediction of fragment spectra of intact glycopeptides. Our model adopts tree-structured long short-term memory networks to process the glycan moiety and a graph neural network architecture to incorporate potential fragmentation pathways of a specific glycan structure. This feature benefits model explainability and the ability to differentiate glycan structural isomers. We further demonstrate that predicted spectral libraries can be used for data-independent acquisition glycoproteomics as a supplement for library completeness. We expect that this work will provide a valuable deep learning resource for glycoproteomics.
Affiliation(s)
- Yi Yang
  - ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, 311200, China
- Qun Fang
  - ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, 311200, China
  - Department of Chemistry, Zhejiang University, Hangzhou, 310058, China

10. Buur LM, Declercq A, Strobl M, Bouwmeester R, Degroeve S, Martens L, Dorfer V, Gabriels R. MS2Rescore 3.0 Is a Modular, Flexible, and User-Friendly Platform to Boost Peptide Identifications, as Showcased with MS Amanda 3.0. J Proteome Res 2024. [PMID: 38491990 DOI: 10.1021/acs.jproteome.3c00785] [Indexed: 03/18/2024]
Abstract
Rescoring of peptide-spectrum matches (PSMs) has emerged as a standard procedure for the analysis of tandem mass spectrometry data. This emphasizes the need for software maintenance and continuous improvement for such algorithms. We introduce MS2Rescore 3.0, a versatile, modular, and user-friendly platform designed to increase peptide identifications. Researchers can install MS2Rescore across various platforms with minimal effort and benefit from a graphical user interface, a modular Python API, and extensive documentation. To showcase this new version, we connected MS2Rescore 3.0 with MS Amanda 3.0, a new release of the well-established search engine, addressing previous limitations on automatic rescoring. Among new features, MS Amanda now contains additional output columns that can be used for rescoring. The full potential of rescoring is best revealed when applied on challenging data sets. We therefore evaluated the performance of these two tools on publicly available single-cell data sets, where the number of PSMs was substantially increased, thereby demonstrating that MS2Rescore offers a powerful solution to boost peptide identifications. MS2Rescore's modular design and user-friendly interface make data-driven rescoring easily accessible, even for inexperienced users. We therefore expect the MS2Rescore to be a valuable tool for the wider proteomics community. MS2Rescore is available at https://github.com/compomics/ms2rescore.
Affiliation(s)
- Louise M Buur
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Hagenberg 4232, Austria
- Arthur Declercq
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
- Marina Strobl
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Hagenberg 4232, Austria
- Robbin Bouwmeester
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
- Sven Degroeve
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
- Viktoria Dorfer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Hagenberg 4232, Austria
- Ralf Gabriels
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium

11. Battellino T, Yeung D, Neustaeter H, Spicer V, Ogata K, Ishihama Y, Krokhin OV. Retention time prediction for post-translationally modified peptides: Ser, Thr, Tyr-phosphorylation. J Chromatogr A 2024; 1718:464714. [PMID: 38359688 DOI: 10.1016/j.chroma.2024.464714] [Received: 10/30/2023] [Revised: 01/22/2024] [Accepted: 02/01/2024] [Indexed: 02/17/2024]
Abstract
The development of a peptide retention prediction model for reversed-phase chromatography applications in proteomics is reported for peptides carrying phosphorylated Ser, Thr and Tyr residues. The major retention features have been assessed using a collection of over 10,000 phosphorylated/non-phosphorylated peptide pairs identified in a series of 1D and 2D LC-MS/MS acquisitions using formic acid as the ion-pairing modifier. A single modification event on average results in increased peptide retention for phosphorylation of Ser (+1.46), Thr (+1.33), and Tyr (+0.93% acetonitrile, ACN) on the gradient elution scale for the Luna C18(2) stationary phase. We established several composition- and sequence-specific features, which drive deviations from these average values. Thus, single phosphorylation of serine results in retention shifts ranging from -2.4 to 5.5% ACN depending on the position of the residue, the nature of nearest-neighbour residues, peptide length, hydrophobicity and pI value, and its propensity to form amphipathic helical structures. We established that the altered ion-pairing environment upon phosphorylation is detrimental for this variability. Hydrophobicity of the ion-pairing modifier directly informs the magnitude of expected shifts: (most hydrophilic) 0.5% acetic acid (larger positive shift upon phosphorylation) > 0.1% formic acid (positive) > 0.1% trifluoroacetic acid (negative) > 0.1% heptafluorobutyric acid (larger negative shift). The effect of phosphorylation has also been evaluated for several separation conditions used in the first dimension of 2D LC applications: high pH reversed-phase (RP), hydrophilic interaction liquid chromatography (HILIC), and strong cation- and strong anion-exchange separations.
Affiliation(s)
- Taylor Battellino
  - Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada
- Darien Yeung
  - Department of Biochemistry and Medical Genetics, University of Manitoba, 336 BMSB, 745 Bannatyne Avenue, Winnipeg, R3E 0J9, Canada
- Haley Neustaeter
  - Department of Chemistry, University of Manitoba, 360 Parker Building, 144 Dysart Road, Winnipeg, R3T 2N2, Canada
- Vic Spicer
  - Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada
- Kosuke Ogata
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Yasushi Ishihama
  - Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Oleg V Krokhin
  - Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada
  - Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada

12. Gomez-Zepeda D, Arnold-Schild D, Beyrle J, Declercq A, Gabriels R, Kumm E, Preikschat A, Łącki MK, Hirschler A, Rijal JB, Carapito C, Martens L, Distler U, Schild H, Tenzer S. Thunder-DDA-PASEF enables high-coverage immunopeptidomics and is boosted by MS2Rescore with MS2PIP timsTOF fragmentation prediction model. Nat Commun 2024; 15:2288. [PMID: 38480730 PMCID: PMC10937930 DOI: 10.1038/s41467-024-46380-y] [Received: 02/24/2023] [Accepted: 02/26/2024] [Indexed: 03/17/2024] [Open access]
Abstract
Human leukocyte antigen (HLA) class I peptide ligands (HLAIps) are key targets for developing vaccines and immunotherapies against infectious pathogens or cancer cells. Identifying HLAIps is challenging due to their high diversity, low abundance, and patient individuality. Here, we develop a highly sensitive method for identifying HLAIps using liquid chromatography-ion mobility-tandem mass spectrometry (LC-IMS-MS/MS). In addition, we train a timsTOF-specific peak intensity MS2PIP model for tryptic and non-tryptic peptides and implement it in MS2Rescore (v3) together with the CCS predictor from ionmob. The optimized method, Thunder-DDA-PASEF, semi-selectively fragments singly and multiply charged HLAIps based on their IMS and m/z. Moreover, the method employs the high sensitivity mode and extended IMS resolution with fewer MS/MS frames (300 ms TIMS ramp, 3 MS/MS frames), doubling the coverage of immunopeptidomics analyses, compared to the proteomics-tailored DDA-PASEF (100 ms TIMS ramp, 10 MS/MS frames). Additionally, rescoring boosts the HLAIps identification by 41.7% to 33%, resulting in 5738 HLAIps from as little as one million JY cell equivalents, and 14,516 HLAIps from 20 million. This enables in-depth profiling of HLAIps from diverse human cell lines and human plasma. Finally, profiling JY and Raji cells transfected to express the SARS-CoV-2 spike protein results in 16 spike HLAIps, thirteen of which have been reported to elicit immune responses in human patients.
Affiliation(s)
- David Gomez-Zepeda
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany.
- Helmholtz Institute for Translational Oncology Mainz (HI-TRON Mainz) - A Helmholtz Institute of the DKFZ, Mainz, Germany.
- German Cancer Research Center (DKFZ) Heidelberg, Division 191, Heidelberg, Germany.
| | - Danielle Arnold-Schild
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Julian Beyrle
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
- Helmholtz Institute for Translational Oncology Mainz (HI-TRON Mainz) - A Helmholtz Institute of the DKFZ, Mainz, Germany
- German Cancer Research Center (DKFZ) Heidelberg, Division 191, Heidelberg, Germany
| | - Arthur Declercq
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Elena Kumm
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Annica Preikschat
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Mateusz Krzysztof Łącki
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Aurélie Hirschler
- BioOrganic Mass Spectrometry Laboratory (LSMBO), IPHC UMR 7178, University of Strasbourg, CNRS, ProFI - FR2048, Strasbourg, France
| | - Jeewan Babu Rijal
- BioOrganic Mass Spectrometry Laboratory (LSMBO), IPHC UMR 7178, University of Strasbourg, CNRS, ProFI - FR2048, Strasbourg, France
| | - Christine Carapito
- BioOrganic Mass Spectrometry Laboratory (LSMBO), IPHC UMR 7178, University of Strasbourg, CNRS, ProFI - FR2048, Strasbourg, France
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Ute Distler
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
- Research Center for Immunotherapy (FZI), University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Hansjörg Schild
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
- Research Center for Immunotherapy (FZI), University Medical Center of the Johannes-Gutenberg University, Mainz, Germany
| | - Stefan Tenzer
- Institute of Immunology, University Medical Center of the Johannes-Gutenberg University, Mainz, Germany.
- Helmholtz Institute for Translational Oncology Mainz (HI-TRON Mainz) - A Helmholtz Institute of the DKFZ, Mainz, Germany.
- German Cancer Research Center (DKFZ) Heidelberg, Division 191, Heidelberg, Germany.
- Research Center for Immunotherapy (FZI), University Medical Center of the Johannes-Gutenberg University, Mainz, Germany.
| |
|
13
|
Stastna M. Post-translational modifications of proteins in cardiovascular diseases examined by proteomic approaches. FEBS J 2024. [PMID: 38440918 DOI: 10.1111/febs.17108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 01/22/2024] [Accepted: 02/20/2024] [Indexed: 03/06/2024]
Abstract
Over 400 different types of post-translational modifications (PTMs) have been reported and over 200 various types of PTMs have been discovered using mass spectrometry (MS)-based proteomics. MS-based proteomics has proven to be a powerful method capable of global PTM mapping with the identification of modified proteins/peptides, the localization of PTM sites and PTM quantitation. PTMs play regulatory roles in protein functions, activities and interactions in various heart-related diseases, such as ischemia/reperfusion injury, cardiomyopathy and heart failure. The recognition of PTMs that are specific to cardiovascular pathology and the clarification of the mechanisms underlying these PTMs at molecular levels are crucial for discovery of novel biomarkers and application in a clinical setting. With sensitive MS instrumentation and novel biostatistical methods for precise processing of the data, low-abundance PTMs can be successfully detected and the beneficial or unfavorable effects of specific PTMs on cardiac function can be determined. Moreover, computational proteomic strategies that can predict PTM sites based on MS data have gained increasing interest and can contribute to characterization of PTM profiles in cardiovascular disorders. More recently, machine learning- and deep learning-based methods have been employed to predict the locations of PTMs and explore PTM crosstalk. In this review article, the types of PTMs are briefly overviewed, approaches for PTM identification/quantitation in MS-based proteomics are discussed and recently published proteomic studies on PTMs associated with cardiovascular diseases are included.
Affiliation(s)
- Miroslava Stastna
- Institute of Analytical Chemistry of the Czech Academy of Sciences, Brno, Czech Republic
| |
|
14
|
Wang F, Zhang Z, Mao M, Yang Y, Xu P, Lu S. COSMIC-based mutation database enhances identification efficiency of HLA-I immunopeptidome. J Transl Med 2024; 22:144. [PMID: 38336780 PMCID: PMC10858511 DOI: 10.1186/s12967-023-04821-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 12/20/2023] [Indexed: 02/12/2024] Open
Abstract
BACKGROUND Neoantigens have emerged as a promising area of focus in tumor immunotherapy, with several established strategies aiming to enhance their identification. Human leukocyte antigen class I molecules (HLA-I), which present intracellular immunopeptides to T cells, provide an ideal source for identifying neoantigens. However, relying solely on a mutation database generated through commonly used whole exome sequencing (WES) for the identification of HLA-I immunopeptides may result in potential neoantigens being missed due to limitations in sequencing depth and sample quality. METHOD In this study, we constructed and evaluated an extended database for neoantigen identification, based on the COSMIC mutation database. This study utilized mass spectrometry-based proteogenomic profiling to identify the HLA-I immunopeptidome enriched from HepG2 cells. HepG2 WES-based and COSMIC-based mutation databases were generated and utilized to identify HepG2-specific mutant immunopeptides. RESULT The results demonstrated that the COSMIC-based database identified 5 immunopeptides compared to only 1 mutant peptide identified by the HepG2 WES-based database, indicating its effectiveness in identifying mutant immunopeptides. Furthermore, HLA-I affinity of the mutant immunopeptides was evaluated through NetMHCpan and peptide-docking modeling to validate their binding to HLA-I molecules, demonstrating the potential of mutant peptides identified by the COSMIC-based database as neoantigens. CONCLUSION Utilizing the COSMIC-based mutation database is a more efficient strategy for identifying mutant peptides from the HLA-I immunopeptidome without significantly increasing the false positive rate. The HepG2-specific WES-based database may exclude certain mutant peptides due to WES sequencing depth or sample heterogeneity. The COSMIC-based database can effectively uncover potential neoantigens within HLA-I immunopeptidomes.
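Illustrative sketch (not from the cited paper): one common way to turn a list of missense mutations into searchable database entries is to enumerate every peptide of HLA-I-typical length that spans the substituted residue. The function name `mutant_peptides`, the 8 to 11 residue length range, and the toy protein sequence are assumptions made for this example; the authors' actual database construction is not described in the abstract.

```python
from typing import Iterable, List, Set


def mutant_peptides(protein: str, pos: int, alt: str,
                    lengths: Iterable[int] = range(8, 12)) -> List[str]:
    """Return all peptides of the given lengths that span a substituted residue.

    `pos` is 1-based, as in protein-level notation such as p.G12D (pos=12).
    Hypothetical helper for illustration only.
    """
    if not 1 <= pos <= len(protein):
        raise ValueError("mutation position lies outside the protein")
    if len(alt) != 1:
        raise ValueError("only single-residue substitutions are handled")

    i = pos - 1  # convert to 0-based index
    mutated = protein[:i] + alt + protein[i + 1:]

    peptides: Set[str] = set()
    for k in lengths:
        # Every window of length k that covers index i and fits in the protein.
        first = max(0, i - k + 1)
        last = min(i, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.add(mutated[start:start + k])
    return sorted(peptides)


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    for pep in mutant_peptides("ACDEFGHIKLMNPQRSTVWY", 10, "A")[:5]:
        print(pep)
```

In a real workflow the resulting peptides would typically be appended to the reference proteome before the database search, and any hit would still need binding prediction and manual validation.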
Affiliation(s)
- Fangzhou Wang
- Medical School of Chinese People's Liberation Army (PLA), Faculty of Hepato-Pancreato-Biliary Surgery, Chinese PLA General Hospital, Institute of Hepatobiliary Surgery of Chinese PLA, Key Laboratory of Digital Hepatobiliary Surgery PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China
| | - Zhenpeng Zhang
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Institute of Lifeomics, 38 Life Science Park Road, Changping District, Beijing, 102206, China
| | - Mingsong Mao
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Institute of Lifeomics, 38 Life Science Park Road, Changping District, Beijing, 102206, China
- School of Basic Medical Sciences, Anhui Medical University, Hefei, China
| | - Yudai Yang
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Institute of Lifeomics, 38 Life Science Park Road, Changping District, Beijing, 102206, China
- Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
| | - Ping Xu
- State Key Laboratory of Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Institute of Lifeomics, 38 Life Science Park Road, Changping District, Beijing, 102206, China.
- School of Basic Medical Sciences, Anhui Medical University, Hefei, China.
- Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China.
- School of Medicine, Guizhou University, Guiyang, China.
| | - Shichun Lu
- Medical School of Chinese People's Liberation Army (PLA), Faculty of Hepato-Pancreato-Biliary Surgery, Chinese PLA General Hospital, Institute of Hepatobiliary Surgery of Chinese PLA, Key Laboratory of Digital Hepatobiliary Surgery PLA, 28 Fuxing Road, Haidian District, Beijing, 100853, China.
| |
|
15
|
Ye J, He X, Wang S, Dong MQ, Wu F, Lu S, Feng F. Test-Time Training for Deep MS/MS Spectrum Prediction Improves Peptide Identification. J Proteome Res 2024; 23:550-559. [PMID: 38153036 DOI: 10.1021/acs.jproteome.3c00229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
In bottom-up proteomics, peptide-spectrum matching is critical for peptide and protein identification. Recently, deep learning models have been used to predict tandem mass spectra of peptides, enabling the calculation of similarity scores between the predicted and experimental spectra for peptide-spectrum matching. These models follow the supervised learning paradigm, which trains a general model using paired peptides and spectra from standard data sets and directly employs the model on experimental data. However, this approach can lead to inaccurate predictions due to differences between the training data and the experimental data, such as sample types, enzyme specificity, and instrument calibration. To tackle this problem, we developed a test-time training paradigm that adapts the pretrained model to generate experimental data-specific models, namely, PepT3. PepT3 yields a 10-40% increase in peptide identification depending on the variability in training and experimental data. Intriguingly, when applied to a patient-derived immunopeptidomic sample, PepT3 increases the identification of tumor-specific immunopeptide candidates by 60%. Two-thirds of the newly identified candidates are predicted to bind to the patient's human leukocyte antigen isoforms. To facilitate access to the model and all the results, we have archived all the intermediate files in Zenodo.org with identifier 8231084.
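Illustrative sketch (not from the cited paper): the "similarity scores between the predicted and experimental spectra" mentioned in the abstract are often computed as a cosine similarity or a normalized spectral angle over matched fragment-ion intensities. The ion labels (`b2`, `y3`, ...) and the dictionary representation below are assumptions for this example and do not reflect PepT3's internal scoring.

```python
import math
from typing import Dict


def cosine_similarity(pred: Dict[str, float], obs: Dict[str, float]) -> float:
    """Cosine similarity between two spectra keyed by fragment-ion label."""
    ions = set(pred) | set(obs)
    dot = sum(pred.get(ion, 0.0) * obs.get(ion, 0.0) for ion in ions)
    norm_pred = math.sqrt(sum(v * v for v in pred.values()))
    norm_obs = math.sqrt(sum(v * v for v in obs.values()))
    if norm_pred == 0.0 or norm_obs == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_pred * norm_obs)


def spectral_angle(pred: Dict[str, float], obs: Dict[str, float]) -> float:
    """Normalized spectral angle: 1 for identical direction, 0 for orthogonal."""
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, cosine_similarity(pred, obs)))
    return 1.0 - 2.0 * math.acos(cos) / math.pi


if __name__ == "__main__":
    predicted = {"b2": 0.30, "y3": 1.00, "y4": 0.55}
    observed = {"b2": 0.25, "y3": 0.90, "y5": 0.10}
    print(round(cosine_similarity(predicted, observed), 3))
    print(round(spectral_angle(predicted, observed), 3))
```

A score of this kind can serve as an additional feature when rescoring peptide-spectrum matches; how well it discriminates depends on how closely the prediction model matches the experimental conditions, which is the gap test-time training aims to narrow.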
Affiliation(s)
- Jianbai Ye
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Xiangnan He
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Shujuan Wang
- National Institute of Biological Sciences, Beijing 102206, China
| | - Meng-Qiu Dong
- National Institute of Biological Sciences, Beijing 102206, China
| | - Feng Wu
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Shan Lu
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California 92093, United States
| | - Fuli Feng
- MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
| |
|
16
|
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
| | - Wenqing Shui
- iHuman Institute, ShanghaiTech University, Shanghai, China; School of Life Science and Technology, ShanghaiTech University, Shanghai, China.
| |
|
17
|
Leblanc S, Yala F, Provencher N, Lucier JF, Levesque M, Lapointe X, Jacques JF, Fournier I, Salzet M, Ouangraoua A, Scott MS, Boisvert FM, Brunet MA, Roucou X. OpenProt 2.0 builds a path to the functional characterization of alternative proteins. Nucleic Acids Res 2024; 52:D522-D528. [PMID: 37956315 PMCID: PMC10767855 DOI: 10.1093/nar/gkad1050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023] Open
Abstract
The OpenProt proteogenomic resource (https://www.openprot.org/) provides users with a complete and freely accessible set of non-canonical or alternative open reading frames (AltORFs) within the transcriptome of various species, as well as functional annotations of the corresponding protein sequences not found in standard databases. Enhancements in this update are largely the result of user feedback and include the prediction of structure, subcellular localization, and intrinsic disorder, using cutting-edge algorithms based on machine learning techniques. The mass spectrometry pipeline now integrates a machine learning-based peptide rescoring method to improve peptide identification. We continue to help users explore this cryptic proteome by providing OpenCustomDB, a tool that enables users to build their own customized protein databases, and OpenVar, a genomic annotator including genetic variants within AltORFs and protein sequences. A new interface improves the visualization of all functional annotations, including a spectral viewer and the prediction of multicoding genes. All data on OpenProt are freely available and downloadable. Overall, OpenProt continues to establish itself as an important resource for the exploration and study of new proteins.
Affiliation(s)
- Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Feriel Yala
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Nicolas Provencher
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Jean-François Lucier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Department of Biology, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Maxime Levesque
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Xavier Lapointe
- Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
| | - Jean-Francois Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
| | - Isabelle Fournier
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Michel Salzet
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
| | - Aïda Ouangraoua
- Informatics Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Michelle S Scott
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
| | - François-Michel Boisvert
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cellular Biology, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
| | - Marie A Brunet
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
- Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC J1H 5N4, Canada
| |
|
18
|
Liu Y, Yang Y, Chen W, Shen F, Xie L, Zhang Y, Zhai Y, He F, Zhu Y, Chang C. DeepRTAlign: toward accurate retention time alignment for large cohort mass spectrometry data analysis. Nat Commun 2023; 14:8188. [PMID: 38081814 PMCID: PMC10713976 DOI: 10.1038/s41467-023-43909-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/16/2022] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
Retention time (RT) alignment is a crucial step in liquid chromatography-mass spectrometry (LC-MS)-based proteomic and metabolomic experiments, especially for large cohort studies. The most popular alignment tools are based on the warping function method and the direct matching method. However, existing tools can hardly handle monotonic and non-monotonic RT shifts simultaneously. Here, we develop a deep learning-based RT alignment tool, DeepRTAlign, for large cohort LC-MS data analysis. Benchmarking against current state-of-the-art approaches on multiple real-world and simulated proteomic and metabolomic datasets shows that DeepRTAlign achieves improved performance. The results also show that DeepRTAlign can improve identification sensitivity without compromising quantitative accuracy. Furthermore, using the MS features aligned by DeepRTAlign, we trained and validated a robust classifier to predict the early recurrence of hepatocellular carcinoma. DeepRTAlign provides an advanced solution to RT alignment in large cohort LC-MS studies, which is currently a major bottleneck in proteomics and metabolomics research.
Affiliation(s)
- Yi Liu
- Faculty of Environment and Life, Beijing University of Technology, Beijing, 100023, China
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China
| | - Yun Yang
- International Academy of Phronesis Medicine (Guang Dong), No. 96 Xindao Ring South Road, Guangzhou International Bio Island, Guangzhou, 510000, China
- South China Institute of Biomedicine, No. 83 Ruihe Road, Guangzhou, 510535, China
| | - Wendong Chen
- International Academy of Phronesis Medicine (Guang Dong), No. 96 Xindao Ring South Road, Guangzhou International Bio Island, Guangzhou, 510000, China
- South China Institute of Biomedicine, No. 83 Ruihe Road, Guangzhou, 510535, China
| | - Feng Shen
- Department of Hepatic Surgery IV, the Eastern Hepatobiliary Surgery Hospital, Naval Medical University, Shanghai, 200433, China
| | - Linhai Xie
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China
- International Academy of Phronesis Medicine (Guang Dong), No. 96 Xindao Ring South Road, Guangzhou International Bio Island, Guangzhou, 510000, China
- South China Institute of Biomedicine, No. 83 Ruihe Road, Guangzhou, 510535, China
| | - Yingying Zhang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China
- Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
| | - Yuanjun Zhai
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China
| | - Fuchu He
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China
- International Academy of Phronesis Medicine (Guang Dong), No. 96 Xindao Ring South Road, Guangzhou International Bio Island, Guangzhou, 510000, China
- Research Unit of Proteomics Driven Cancer Precision Medicine, Chinese Academy of Medical Sciences, Beijing, 102206, China
| | - Yunping Zhu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China.
| | - Cheng Chang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, 102206, China.
- Research Unit of Proteomics Driven Cancer Precision Medicine, Chinese Academy of Medical Sciences, Beijing, 102206, China.
| |
|
19
|
Maeng JH, Jang HJ, Du AY, Tzeng SC, Wang T. Using long-read CAGE sequencing to profile cryptic-promoter-derived transcripts and their contribution to the immunopeptidome. Genome Res 2023; 33:gr.277061.122. [PMID: 38065624 PMCID: PMC10760525 DOI: 10.1101/gr.277061.122] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/19/2023] [Accepted: 11/13/2023] [Indexed: 01/04/2024]
Abstract
Recent studies have shown that the noncoding genome can produce unannotated proteins as antigens that induce immune response. One major source of this activity is the aberrant epigenetic reactivation of transposable elements (TEs). In tumors, TEs often provide cryptic or alternate promoters, which can generate transcripts that encode tumor-specific unannotated proteins. Thus, TE-derived transcripts (TE transcripts) have the potential to produce tumor-specific, but recurrent, antigens shared among many tumors. Identification of TE-derived tumor antigens holds the promise to improve cancer immunotherapy approaches; however, current genomics and computational tools are not optimized for their detection. Here we combined CAGE technology with full-length long-read transcriptome sequencing (long-read CAGE, or LRCAGE) and developed a suite of computational tools to significantly improve immunopeptidome detection by incorporating TE and other tumor transcripts into the proteome database. By applying our methods to human lung cancer cell line H1299 data, we show that long-read technology significantly improves mapping of promoters with low mappability scores and that LRCAGE guarantees accurate construction of uncharacterized 5' transcript structure. Augmenting a reference proteome database with newly characterized transcripts enabled us to detect noncanonical antigens from HLA-pulldown LC-MS/MS data. Lastly, we show that epigenetic treatment increased the number of noncanonical antigens, particularly those encoded by TE transcripts, which might expand the pool of targetable antigens for cancers with low mutational burden.
Affiliation(s)
- Ju Heon Maeng
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - H Josh Jang
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Alan Y Du
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Shin-Cheng Tzeng
- Donald Danforth Plant Science Center, St. Louis, Missouri 63132, USA
| | - Ting Wang
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA
| |
|
20
|
Liu K, Ye Y, Li S, Tang H. Accurate de novo peptide sequencing using fully convolutional neural networks. Nat Commun 2023; 14:7974. [PMID: 38042873 PMCID: PMC10693636 DOI: 10.1038/s41467-023-43010-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 10/29/2023] [Indexed: 12/04/2023] Open
Abstract
De novo peptide sequencing, which does not rely on a comprehensive target sequence database, provides us with a way to identify novel peptides from tandem mass spectra. However, current de novo sequencing algorithms suffer from low accuracy and coverage, which hinders their application in proteomics. In this paper, we present PepNet, a fully convolutional neural network for high accuracy de novo peptide sequencing. PepNet takes an MS/MS spectrum (represented as a high-dimensional vector) as input, and outputs the optimal peptide sequence along with its confidence score. The PepNet model is trained using a total of 3 million high-energy collisional dissociation MS/MS spectra from multiple human peptide spectral libraries. Evaluation results show that PepNet significantly outperforms current best-performing de novo sequencing algorithms (e.g. PointNovo and DeepNovo) in both peptide-level accuracy and positional-level accuracy. PepNet can sequence a large fraction of spectra that were not identified by database search engines, and thus could be used as a complementary tool to database search engines for peptide identification in proteomics. In addition, PepNet runs around 3x and 7x faster than PointNovo and DeepNovo on GPUs, respectively, thus being more suitable for the analysis of large-scale proteomics data.
Affiliation(s)
- Kaiyuan Liu
- Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
| | - Yuzhen Ye
- Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
| | - Sujun Li
- Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
- Dengding BioAI Co., Ltd., Bloomington, USA
| | - Haixu Tang
- Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA.
| |
|
21
|
Chan CMJ, Lam H. Merging Full-Spectrum and Fragment Ion Intensity Predictions from Deep Learning for High-Quality Spectral Libraries. J Proteome Res 2023; 22:3692-3702. [PMID: 37910637 DOI: 10.1021/acs.jproteome.3c00180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Abstract
Spectral libraries are useful resources in proteomic data analysis. Recent advances in deep learning allow tandem mass spectra of peptides to be predicted from their amino acid sequences. This enables predicted spectral libraries to be compiled, and searching against such libraries has been shown to improve the sensitivity in peptide identification over conventional sequence database searching. However, current prediction models lack support for longer peptides, and thus far, predicted library searching has only been demonstrated for backbone ion-only spectrum prediction methods. Here, we propose a deep learning-based full-spectrum prediction method to generate predicted spectral libraries for peptide identification. We demonstrated the superiority of using full-spectrum libraries over backbone ion-only prediction approaches in spectral library searching. Furthermore, merging spectra from different prediction models, as a form of ensemble learning, can produce improved spectral libraries, in terms of identification sensitivity. We also show that a hybrid library combining predicted and experimental spectra can lead to 20% more confident identifications over experimental library searching or sequence database searching.
Affiliation(s)
- Chak Ming Jerry Chan
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, China
| | - Henry Lam
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, China
|
22
|
Kang Q, Fang P, Zhang S, Qiu H, Lan Z. Deep graph convolutional network for small-molecule retention time prediction. J Chromatogr A 2023; 1711:464439. [PMID: 37865024 DOI: 10.1016/j.chroma.2023.464439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 10/04/2023] [Accepted: 10/06/2023] [Indexed: 10/23/2023]
Abstract
The retention time (RT) is a crucial source of data for liquid chromatography-mass spectrometry (LCMS). A model that can accurately predict the RT of each molecule would make it possible to filter candidates with similar spectra but differing RTs in LCMS-based molecule identification. Recent research shows that graph neural networks (GNNs) outperform traditional machine learning algorithms in RT prediction. However, all of these models use relatively shallow GNNs. This study investigates for the first time how depth affects GNN performance on RT prediction. The results demonstrate that a notable improvement can be achieved by pushing the depth of GNNs to 16 layers through the adoption of residual connections. We also find that the graph convolutional network (GCN) model benefits from edge information. The developed deep graph convolutional network, DeepGCN-RT, significantly outperforms the previous state-of-the-art method and achieves the lowest mean absolute percentage error (MAPE) of 3.3% and the lowest mean absolute error (MAE) of 26.55 s on the SMRT test set. We also fine-tune DeepGCN-RT on seven datasets with various chromatographic conditions. The mean MAE across the seven datasets decreases by 30% compared with the previous state-of-the-art method. On the RIKEN-PlaSMA dataset, we also test the effectiveness of DeepGCN-RT in assisting molecular structure identification. By reducing the number of candidate structures by 30%, DeepGCN-RT improves top-1 accuracy by about 11%.
Affiliation(s)
- Qiyue Kang
- School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China.
| | - Pengfei Fang
- School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, 210096, China
| | - Shuai Zhang
- School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
| | - Huachuan Qiu
- School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China
| | - Zhenzhong Lan
- School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China.
|
23
|
Lazear MR. Sage: An Open-Source Tool for Fast Proteomics Searching and Quantification at Scale. J Proteome Res 2023; 22:3652-3659. [PMID: 37819886 DOI: 10.1021/acs.jproteome.3c00486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
The growing complexity and volume of proteomics data necessitate the development of efficient software tools for peptide identification and quantification from mass spectra. Given their central role in proteomics, it is imperative that these tools are auditable and extensible, requirements that are best fulfilled by open-source and permissively licensed software. This work presents Sage, a high-performance, open-source, and freely available proteomics pipeline. Scalable and cloud-ready, Sage matches the performance of state-of-the-art software tools while running an order of magnitude faster.
Affiliation(s)
- Michael R Lazear
- Belharra Therapeutics, 3985 Sorrento Valley Boulevard Suite C, San Diego, California 92121, United States
|
24
|
Claeys T, Van Den Bossche T, Perez-Riverol Y, Gevaert K, Vizcaíno JA, Martens L. lesSDRF is more: maximizing the value of proteomics data through streamlined metadata annotation. Nat Commun 2023; 14:6743. [PMID: 37875519 PMCID: PMC10598006 DOI: 10.1038/s41467-023-42543-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 10/13/2023] [Indexed: 10/26/2023] Open
Abstract
Public proteomics data often lack essential metadata, limiting their potential. To address this, we present lesSDRF, a tool to simplify the process of metadata annotation, thereby ensuring that data leave a lasting, impactful legacy well beyond their initial publication.
Affiliation(s)
- Tine Claeys
- VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000, Ghent, Belgium
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000, Ghent, Belgium
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
| | - Kris Gevaert
- VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000, Ghent, Belgium
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK.
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium.
- Department of Biomolecular Medicine, Ghent University, 9000, Ghent, Belgium.
|
25
|
Skiadopoulou D, Vašíček J, Kuznetsova K, Bouyssié D, Käll L, Vaudel M. Retention Time and Fragmentation Predictors Increase Confidence in Identification of Common Variant Peptides. J Proteome Res 2023; 22:3190-3199. [PMID: 37656829 PMCID: PMC10563157 DOI: 10.1021/acs.jproteome.3c00243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Indexed: 09/03/2023]
Abstract
Precision medicine focuses on adapting care to the individual profile of patients, for example, accounting for their unique genetic makeup. Being able to account for the effect of genetic variation on the proteome holds great promise toward this goal. However, identifying the protein products of genetic variation using mass spectrometry has proven very challenging. Here we show that the identification of variant peptides can be improved by the integration of retention time and fragmentation predictors into a unified proteogenomic pipeline. By combining these intrinsic peptide characteristics using the search-engine post-processor Percolator, we demonstrate improved discrimination power between correct and incorrect peptide-spectrum matches. Our results demonstrate that the drop in performance that is induced when expanding a protein sequence database can be compensated, hence enabling efficient identification of genetic variation products in proteomics data. We anticipate that this enhancement of proteogenomic pipelines can provide a more refined picture of the unique proteome of patients and thereby contribute to improving patient care.
Affiliation(s)
- Dafni Skiadopoulou
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020 Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, NO-5020 Bergen, Norway
| | - Jakub Vašíček
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020 Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, NO-5020 Bergen, Norway
| | - Ksenia Kuznetsova
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020 Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, NO-5020 Bergen, Norway
| | - David Bouyssié
- Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, Université Toulouse III - Paul Sabatier (UT3), 31000 Toulouse, France
| | - Lukas Käll
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
| | - Marc Vaudel
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020 Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, NO-5020 Bergen, Norway
- Department of Genetics and Bioinformatics, Health Data and Digitalization, Norwegian Institute of Public Health, N-0213 Oslo, Norway
|
26
|
Neale Q, Prefontaine A, Battellino T, Mizero B, Yeung D, Spicer V, Budisa N, Perreault H, Zahedi RP, Krokhin OV. Compendium of Chromatographic Behavior of Post-translationally and Chemically Modified Peptides in Bottom-Up Proteomic Experiments. Anal Chem 2023; 95:14634-14642. [PMID: 37739932 DOI: 10.1021/acs.analchem.3c02412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/24/2023]
Abstract
We have systematically evaluated the chromatographic behavior of post-translationally/chemically modified peptides using data spanning over 70 of the most relevant modifications. These retention properties were measured for standard bottom-up proteomic settings (fully porous C18 separation media, 0.1% formic acid as ion-pairing modifier) using collections of modified/nonmodified peptide pairs. These pairs were generated by spontaneous degradation, chemical or enzymatic treatment, analysis of synthetic peptides, or the cotranslational incorporation of noncanonical proline analogues. In addition, these measurements were validated using external data acquired for synthetic peptides and enzymatically induced citrullination. Working in units of hydrophobicity index (HI, % ACN) and evaluating the average retention shifts (ΔHI) represent the simplest approach to describe the effect of modifications from a didactic point of view. Plotting HI values for modified (y-axis) vs nonmodified (x-axis) counterparts generates unique slope and intercept values for each modification defined by the chemistry of the modifying moiety: its hydrophobicity, size, pKa of ionizable groups, and position of the altered residue. These composition-dependent correlations can be used for coarse incorporation of PTMs into models for prediction of peptide retention. More accurate predictions would require the development of specific sequence-dependent algorithms to predict ΔHI values.
Affiliation(s)
- Quinn Neale
- Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg R3T 2N2, Manitoba, Canada
| | - Alexandre Prefontaine
- Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg R3T 2N2, Manitoba, Canada
| | - Taylor Battellino
- Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg R3T 2N2, Manitoba, Canada
| | - Benilde Mizero
- Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg R3T 2N2, Manitoba, Canada
| | - Darien Yeung
- Department of Biochemistry and Medical Genetics, University of Manitoba, 336 Basic Medical Sciences Building, 745 Bannatyne Avenue, Winnipeg R3E 0J9, Manitoba, Canada
| | - Victor Spicer
- Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg R3E 3P4, Manitoba, Canada
| | - Nediljko Budisa
- Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg R3T 2N2, Manitoba, Canada
| | - Helene Perreault
- Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg R3T 2N2, Manitoba, Canada
| | - Rene P Zahedi
- Department of Biochemistry and Medical Genetics, University of Manitoba, 336 Basic Medical Sciences Building, 745 Bannatyne Avenue, Winnipeg R3E 0J9, Manitoba, Canada
- Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg R3E 3P4, Manitoba, Canada
- Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg R3E 3P4, Manitoba, Canada
- CancerCare Manitoba Research Institute, 675 McDermot Avenue, Winnipeg R3E 0V9, Manitoba, Canada
| | - Oleg V Krokhin
- Department of Biochemistry and Medical Genetics, University of Manitoba, 336 Basic Medical Sciences Building, 745 Bannatyne Avenue, Winnipeg R3E 0J9, Manitoba, Canada
- Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg R3E 3P4, Manitoba, Canada
- Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg R3E 3P4, Manitoba, Canada
|
27
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023] Open
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
- Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA.
| | | | - Leron W Kok
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
| | - Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland; Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland; Agora Cancer Research Centre, Lausanne, Switzerland
| | - Robert L Moritz
- Institute for Systems Biology (ISB), Seattle, Washington, USA
| | - Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington, USA
|
28
|
Penanes P, Gorshkov V, Ivanov MV, Gorshkov MV, Kjeldsen F. Potential of Negative-Ion-Mode Proteomics: An MS1-Only Approach. J Proteome Res 2023; 22:2734-2742. [PMID: 37395192 PMCID: PMC10407931 DOI: 10.1021/acs.jproteome.3c00307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Indexed: 07/04/2023]
Abstract
Current proteomics approaches rely almost exclusively on using the positive ionization mode, resulting in inefficient ionization of many acidic peptides. This study investigates protein identification efficiency in the negative ionization mode using the DirectMS1 method. DirectMS1 is an ultrafast data acquisition method based on accurate peptide mass measurements and predicted retention times. Our method achieves the highest rate of protein identification in the negative ion mode to date, identifying over 1000 proteins in a human cell line at a 1% false discovery rate. This is accomplished using a single-shot 10 min separation gradient, comparable to lengthy MS/MS-based analyses. Optimizing separation and experimental conditions was achieved by utilizing mobile buffers containing 2.5 mM imidazole and 3% isopropanol. The study emphasized the complementary nature of data obtained in positive and negative ion modes. Combining the results from all replicates in both polarities increased the number of identified proteins to 1774. Additionally, we analyzed the method's efficiency using different proteases for protein digestion. Among the four studied proteases (LysC, GluC, AspN, and trypsin), trypsin and LysC demonstrated the highest protein identification yield. This suggests that digestion procedures utilized in positive-mode proteomics can be effectively applied in the negative ion mode. Data are deposited to ProteomeXchange: PXD040583.
Affiliation(s)
- Pelayo A. Penanes
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
| | - Vladimir Gorshkov
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
| | - Mark V. Ivanov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
| | - Mikhail V. Gorshkov
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 38 Leninsky Pr., Bld. 2, Moscow 119334, Russia
| | - Frank Kjeldsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark
|
29
|
Bogaert A, Fijalkowska D, Staes A, Van de Steene T, Vuylsteke M, Stadler C, Eyckerman S, Spirohn K, Hao T, Calderwood MA, Gevaert K. N-terminal proteoforms may engage in different protein complexes. Life Sci Alliance 2023; 6:e202301972. [PMID: 37316325 PMCID: PMC10267514 DOI: 10.26508/lsa.202301972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 05/26/2023] [Accepted: 05/30/2023] [Indexed: 06/16/2023] Open
Abstract
Alternative translation initiation and alternative splicing may give rise to N-terminal proteoforms, proteins that differ at their N-terminus compared with their canonical counterparts. Such proteoforms can have altered localizations, stabilities, and functions. Although proteoforms generated from splice variants can be engaged in different protein complexes, it remained to be studied to what extent this applies to N-terminal proteoforms. To address this, we mapped the interactomes of several pairs of N-terminal proteoforms and their canonical counterparts. First, we generated a catalogue of N-terminal proteoforms found in the HEK293T cellular cytosol from which 22 pairs were selected for interactome profiling. In addition, we provide evidence for the expression of several N-terminal proteoforms, identified in our catalogue, across different human tissues, as well as tissue-specific expression, highlighting their biological relevance. Protein-protein interaction profiling revealed that the overlap of the interactomes for both proteoforms is generally high, showing their functional relation. We also showed that N-terminal proteoforms can be engaged in new interactions and/or lose several interactions compared with their canonical counterparts, thus further expanding the functional diversity of proteomes.
Affiliation(s)
- Annelies Bogaert
- VIB Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Daria Fijalkowska
- VIB Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - An Staes
- VIB Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Tessa Van de Steene
- VIB Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | | | - Charlotte Stadler
- Department of Protein Science, KTH Royal Institute of Technology and Science for Life Laboratories, Stockholm, Sweden
| | - Sven Eyckerman
- VIB Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Kerstin Spirohn
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Kris Gevaert
- VIB Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
|
30
|
Yang KL, Yu F, Teo GC, Li K, Demichev V, Ralser M, Nesvizhskii AI. MSBooster: improving peptide identification rates using deep learning-based features. Nat Commun 2023; 14:4539. [PMID: 37500632 PMCID: PMC10374903 DOI: 10.1038/s41467-023-40129-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/07/2022] [Accepted: 07/06/2023] [Indexed: 07/29/2023] Open
Abstract
Peptide identification in liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments relies on computational algorithms for matching acquired MS/MS spectra against sequences of candidate peptides using database search tools, such as MSFragger. Here, we present a new tool, MSBooster, for rescoring peptide-to-spectrum matches using additional features incorporating deep learning-based predictions of peptide properties, such as LC retention time, ion mobility, and MS/MS spectra. We demonstrate the utility of MSBooster, in tandem with MSFragger and Percolator, in several different workflows, including nonspecific searches (immunopeptidomics), direct identification of peptides from data independent acquisition data, single-cell proteomics, and data generated on an ion mobility separation-enabled timsTOF MS platform. MSBooster is fast, robust, and fully integrated into the widely used FragPipe computational platform.
Affiliation(s)
- Kevin L Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
| | - Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Kai Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Vadim Demichev
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Markus Ralser
- Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
- Nuffield Department of Medicine, The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Alexey I Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
|
31
|
Stražar M, Park J, Abelin JG, Taylor HB, Pedersen TK, Plichta DR, Brown EM, Eraslan B, Hung YM, Ortiz K, Clauser KR, Carr SA, Xavier RJ, Graham DB. HLA-II immunopeptidome profiling and deep learning reveal features of antigenicity to inform antigen discovery. Immunity 2023; 56:1681-1698.e13. [PMID: 37301199 PMCID: PMC10519123 DOI: 10.1016/j.immuni.2023.05.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2022] [Revised: 02/08/2023] [Accepted: 05/11/2023] [Indexed: 06/12/2023]
Abstract
CD4+ T cell responses are exquisitely antigen specific and directed toward peptide epitopes displayed by human leukocyte antigen class II (HLA-II) on antigen-presenting cells. Underrepresentation of diverse alleles in ligand databases and an incomplete understanding of factors affecting antigen presentation in vivo have limited progress in defining principles of peptide immunogenicity. Here, we employed monoallelic immunopeptidomics to identify 358,024 HLA-II binders, with a particular focus on HLA-DQ and HLA-DP. We uncovered peptide-binding patterns across a spectrum of binding affinities and enrichment of structural antigen features. These aspects underpinned the development of the context-aware predictor of T cell antigens (CAPTAn), a deep learning model that predicts peptide antigens based on their affinity to HLA-II and the full sequence of their source proteins. CAPTAn was instrumental in discovering prevalent T cell epitopes from bacteria in the human microbiome and a pan-variant epitope from SARS-CoV-2. Together, CAPTAn and the associated datasets present a resource for antigen discovery and for unraveling genetic associations of HLA alleles with immunopathologies.
Affiliation(s)
- Martin Stražar
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jihye Park
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | | | - Hannah B Taylor
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Thomas K Pedersen
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Technical University of Denmark, Kongens Lyngby, Denmark
| | | | - Eric M Brown
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Basak Eraslan
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Yuan-Mao Hung
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Kayla Ortiz
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Steven A Carr
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Ramnik J Xavier
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA; Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| | - Daniel B Graham
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Computational and Integrative Biology, Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA; Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
|
32
|
Affiliation(s)
- Bruna Gomes
- From the Departments of Medicine, Genetics, and Biomedical Data Science, Stanford University, Stanford, CA (B.G., E.A.A.); and the Department of Cardiology, Pneumology, and Angiology, Heidelberg University Hospital, Heidelberg, Germany (B.G.)
| | - Euan A Ashley
- From the Departments of Medicine, Genetics, and Biomedical Data Science, Stanford University, Stanford, CA (B.G., E.A.A.); and the Department of Cardiology, Pneumology, and Angiology, Heidelberg University Hospital, Heidelberg, Germany (B.G.)
|
33
|
Wilburn DB, Shannon AE, Spicer V, Richards AL, Yeung D, Swaney DL, Krokhin OV, Searle BC. Deep learning from harmonized peptide libraries enables retention time prediction of diverse post translational modifications. bioRxiv 2023:2023.05.30.542978. [PMID: 37398395 PMCID: PMC10312522 DOI: 10.1101/2023.05.30.542978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
In proteomics experiments, peptide retention time (RT) is an orthogonal property to fragmentation when assessing detection confidence. Advances in deep learning enable accurate RT prediction for any peptide from sequence alone, including those yet to be experimentally observed. Here we present Chronologer, an open-source software tool for rapid and accurate peptide RT prediction. Using new approaches to harmonize and false-discovery correct across independently collected datasets, Chronologer is built on a massive database with >2.2 million peptides including 10 common post-translational modification (PTM) types. By linking knowledge learned across diverse peptide chemistries, Chronologer predicts RTs with less than two-thirds the error of other deep learning tools. We show how RT for rare PTMs, such as O-GlcNAc, can be learned with high accuracy using as few as 10-100 example peptides in newly harmonized datasets. This iteratively updatable workflow enables Chronologer to comprehensively predict RTs for PTM-marked peptides across entire proteomes.
|
34
|
Nowatzky Y, Benner P, Reinert K, Muth T. Mistle: bringing spectral library predictions to metaproteomics with an efficient search index. Bioinformatics 2023; 39:btad376. [PMID: 37294786 PMCID: PMC10313348 DOI: 10.1093/bioinformatics/btad376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 05/11/2023] [Accepted: 06/08/2023] [Indexed: 06/11/2023] Open
Abstract
MOTIVATION: Deep learning has moved to the forefront of tandem mass spectrometry-driven proteomics and authentic prediction for peptide fragmentation is more feasible than ever. Still, at this point spectral prediction is mainly used to validate database search results or for confined search spaces. Fully predicted spectral libraries have not yet been efficiently adapted to large search space problems that often occur in metaproteomics or proteogenomics. RESULTS: In this study, we showcase a workflow that uses Prosit for spectral library predictions on two common metaproteomes and implement an indexing and search algorithm, Mistle, to efficiently identify experimental mass spectra within the library. Hence, the workflow emulates a classic protein sequence database search with protein digestion but builds a searchable index from spectral predictions as an in-between step. We compare Mistle to popular search engines, both on a spectral and database search level, and provide evidence that this approach is more accurate than a database search using MSFragger. Mistle outperforms other spectral library search engines in terms of run time and proves to be extremely memory efficient with a 4- to 22-fold decrease in RAM usage. This makes Mistle universally applicable to large search spaces, e.g. covering comprehensive sequence databases of diverse microbiomes. AVAILABILITY AND IMPLEMENTATION: Mistle is freely available on GitHub at https://github.com/BAMeScience/Mistle.
Affiliation(s)
- Yannek Nowatzky
- Section S.3 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin 12205, Germany
- Philipp Benner
- Section S.3 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin 12205, Germany
- Knut Reinert
- Department of Mathematics and Computer Science, FU Berlin, Berlin 14195, Germany
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
- Thilo Muth
- Section S.3 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin 12205, Germany
|
35
|
Hollebrands B, Hageman JA, van de Sande JW, Albada B, Janssen HG. Improved LC-MS identification of short homologous peptides using sequence-specific retention time predictors. Anal Bioanal Chem 2023; 415:2715-2726. [PMID: 37000211 PMCID: PMC10185643 DOI: 10.1007/s00216-023-04670-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 03/17/2023] [Accepted: 03/21/2023] [Indexed: 04/01/2023]
Abstract
Peptides are an important group of compounds contributing to the desired, as well as the undesired taste of a food product. Their taste impressions can include aspects of sweetness, bitterness, savoury, umami and many other impressions depending on the amino acids present as well as their sequence. Identification of short peptides in foods is challenging. We developed a method to assign identities to short peptides including homologous structures, i.e. peptides containing the same amino acids with a different sequence order, by accurate prediction of the retention times during reversed phase separation. To train the method, a large set of well-defined short peptides with systematic variations in the amino acid sequence was prepared by a novel synthesis strategy called 'swapped-sequence synthesis'. Additionally, several proteins were enzymatically digested to yield short peptides. Experimental retention times were determined after reversed phase separation and peptide MS2 data was acquired using a high-resolution mass spectrometer operated in data-dependent acquisition mode (DDA). A support vector regression model was trained using a combination of existing sequence-independent peptide descriptors and a newly derived set of selected amino acid index derived sequence-specific peptide (ASP) descriptors. The model was trained and validated using the experimental retention times of the 713 small food-relevant peptides prepared. Whilst selecting the most useful ASP descriptors for our model, special attention was given to predict the retention time differences between homologous peptide structures. Inclusion of ASP descriptors greatly improved the ability to accurately predict retention times, including retention time differences between 157 homologous peptide pairs. The final prediction model had a goodness-of-fit (Q2) of 0.94; moreover for 93% of the short peptides, the elution order was correctly predicted.
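The definition of homologous peptides used in this abstract (the same amino acids in a different sequence order) can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and are not taken from the paper; the sketch only checks composition and order and does not reproduce the retention time model.

```python
from collections import Counter


def are_homologous(seq_a: str, seq_b: str) -> bool:
    """Return True if two peptides have the same amino acid composition
    but a different sequence order (the abstract's definition)."""
    a, b = seq_a.upper(), seq_b.upper()
    return a != b and Counter(a) == Counter(b)


def homologous_pairs(peptides):
    """List all unordered homologous pairs among the given sequences.

    Sequences are de-duplicated and sorted first, so the output order
    is deterministic. Hypothetical helper for illustration only.
    """
    unique = sorted({p.upper() for p in peptides})
    return [
        (p, q)
        for i, p in enumerate(unique)
        for q in unique[i + 1:]
        if are_homologous(p, q)
    ]
```

Pairs found this way share the same mass, which is why, as the abstract argues, a sequence-specific retention time prediction is needed to tell them apart.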
Affiliation(s)
- Boudewijn Hollebrands
- Unilever Foods Innovation Centre - Hive, Bronland 14, 6708 WH, Wageningen, the Netherlands.
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands.
- Jos A Hageman
- Wageningen University & Research, Biometris, P.O. Box 16, 6700 AA, Wageningen, the Netherlands
- Jasper W van de Sande
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands
- Bauke Albada
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands
- Hans-Gerd Janssen
- Unilever Foods Innovation Centre - Hive, Bronland 14, 6708 WH, Wageningen, the Netherlands
- Laboratory of Organic Chemistry, Wageningen University & Research, Stippeneng 4, 6708 WE, Wageningen, the Netherlands
|
36
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief: The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights: Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products. Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results. Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations. A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Leron W. Kok
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
- Karl R. Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Jonathan M. Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
- Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
- Agora Cancer Research Centre, 1011 Lausanne, Switzerland
- Eric W. Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
|
37
|
Declercq A, Bouwmeester R, Chiva C, Sabidó E, Hirschler A, Carapito C, Martens L, Degroeve S, Gabriels R. Updated MS²PIP web server supports cutting-edge proteomics applications. Nucleic Acids Res 2023:7151340. [PMID: 37140039 DOI: 10.1093/nar/gkad335] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/26/2023] [Revised: 04/04/2023] [Accepted: 04/25/2023] [Indexed: 05/05/2023] Open
Abstract
Interest in the use of machine learning for peptide fragmentation spectrum prediction has been strongly on the rise over the past years, especially for applications in challenging proteomics identification workflows such as immunopeptidomics and the full-proteome identification of data independent acquisition spectra. Since its inception, the MS²PIP peptide spectrum predictor has been widely used for various downstream applications, mostly thanks to its accuracy, ease-of-use, and broad applicability. We here present a thoroughly updated version of the MS²PIP web server, which includes new and more performant prediction models for both tryptic- and non-tryptic peptides, for immunopeptides, and for CID-fragmented TMT-labeled peptides. Additionally, we have also added new functionality to greatly facilitate the generation of proteome-wide predicted spectral libraries, requiring only a FASTA protein file as input. These libraries also include retention time predictions from DeepLC. Moreover, we now provide pre-built and ready-to-download spectral libraries for various model organisms in multiple DIA-compatible spectral library formats. Besides upgrading the back-end models, the user experience on the MS²PIP web server is thus also greatly enhanced, extending its applicability to new domains, including immunopeptidomics and MS3-based TMT quantification experiments. MS²PIP is freely available at https://iomics.ugent.be/ms2pip/.
Affiliation(s)
- Arthur Declercq
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
- Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
- Cristina Chiva
- Proteomics Unit, Universitat Pompeu Fabra, 08003, Barcelona, Spain
- Proteomics Unit, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003, Barcelona, Spain
- Eduard Sabidó
- Proteomics Unit, Universitat Pompeu Fabra, 08003, Barcelona, Spain
- Proteomics Unit, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003, Barcelona, Spain
- Aurélie Hirschler
- Laboratoire de Spectrométrie de Masse BioOrganique (LSMBO), Université de Strasbourg, CNRS, France
- Christine Carapito
- Laboratoire de Spectrométrie de Masse BioOrganique (LSMBO), Université de Strasbourg, CNRS, France
- Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
- Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
- Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
|
38
|
Chen M, Zhu P, Wan Q, Ruan X, Wu P, Hao Y, Zhang Z, Sun J, Nie W, Chen S. High-Coverage Four-Dimensional Data-Independent Acquisition Proteomics and Phosphoproteomics Enabled by Deep Learning-Driven Multidimensional Predictions. Anal Chem 2023; 95:7495-7502. [PMID: 37126374 DOI: 10.1021/acs.analchem.2c05414] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/02/2023]
Abstract
Four-dimensional (4D) data-independent acquisition (DIA)-based proteomics is a promising technology. However, its full performance is restricted by the time-consuming building and limited coverage of a project-specific experimental library. Herein, we developed a versatile multifunctional deep learning model Deep4D based on self-attention that could predict the collisional cross section, retention time, fragment ion intensity, and charge state with high accuracies for both the unmodified and phosphorylated peptides and thus established the complete workflows for high-coverage 4D DIA proteomics and phosphoproteomics based on multidimensional predictions. A 4D predicted library containing ∼2 million peptides was established that could realize experimental library-free DIA analysis, and 33% more proteins were identified than using an experimental library of single-shot measurement in the example of HeLa cells. These results show the great values of the convenient high-coverage 4D DIA proteomics methods.
Affiliation(s)
- Moran Chen
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Pujia Zhu
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Qiongqiong Wan
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Xianqin Ruan
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Pengfei Wu
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Yanhong Hao
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Zhourui Zhang
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Jian Sun
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Wenjing Nie
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Suming Chen
- The Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
|
39
|
Oreper D, Klaeger S, Jhunjhunwala S, Delamarre L. The peptide woods are lovely, dark and deep: Hunting for novel cancer antigens. Semin Immunol 2023; 67:101758. [PMID: 37027981 DOI: 10.1016/j.smim.2023.101758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 03/22/2023] [Accepted: 03/22/2023] [Indexed: 04/08/2023]
Abstract
Harnessing the patient's immune system to control a tumor is a proven avenue for cancer therapy. T cell therapies as well as therapeutic vaccines, which target specific antigens of interest, are being explored as treatments in conjunction with immune checkpoint blockade. For these therapies, selecting the best suited antigens is crucial. Most of the focus has thus far been on neoantigens that arise from tumor-specific somatic mutations. Although there is clear evidence that T-cell responses against mutated neoantigens are protective, the large majority of these mutations are not immunogenic. In addition, most somatic mutations are unique to each individual patient and their targeting requires the development of individualized approaches. Therefore, novel antigen types are needed to broaden the scope of such treatments. We review high throughput approaches for discovering novel tumor antigens and some of the key challenges associated with their detection, and discuss considerations when selecting tumor antigens to target in the clinic.
Affiliation(s)
- Daniel Oreper
- Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
- Susan Klaeger
- Genentech, 1 DNA way, South San Francisco, 94080 CA, USA.
|
40
|
Xu AM, Tang LC, Jovanovic M, Regev O. A high-throughput approach reveals distinct peptide charging behaviors in electrospray ionization mass spectrometry. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.31.535171. [PMID: 37066236 PMCID: PMC10103939 DOI: 10.1101/2023.03.31.535171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Electrospray ionization is a powerful and prevalent technique used to ionize analytes in mass spectrometry. The distribution of charges that an analyte receives (charge state distribution, CSD) is an important consideration for interpreting mass spectra. However, due to an incomplete understanding of the ionization mechanism, the analyte properties that influence CSDs are not fully understood. Here, we employ a machine learning-based high-throughput approach and analyze CSDs of hundreds of thousands of peptides. Interestingly, half of the peptides exhibit charges that differ from what one would naively expect (number of basic sites). We find that these peptides can be classified into two regimes, undercharging and overcharging, and that these two regimes display markedly different charging characteristics. Strikingly, peptides in the overcharging regime show minimal dependence on basic site count, and more generally, the two regimes exhibit distinct sequence determinants. These findings highlight the rich ionization behavior of peptides and the potential of CSDs for enhancing peptide identification.
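The naive expectation mentioned in this abstract (charge equals the number of basic sites) can be sketched in a few lines of Python. The abstract does not say which sites are counted, so treating Lys, Arg, His and the N-terminal amine as the basic sites is an assumption here, and the function names and the simple comparison rule are hypothetical illustrations rather than the authors' method.

```python
# Assumption: Lys (K), Arg (R) and His (H) are counted as basic residues.
BASIC_RESIDUES = frozenset("KRH")


def naive_expected_charge(sequence: str, count_n_terminus: bool = True) -> int:
    """Count basic sites in a peptide: basic residues plus, optionally,
    the N-terminal amine (an assumption, not stated in the abstract)."""
    residues = sum(1 for aa in sequence.upper() if aa in BASIC_RESIDUES)
    return residues + (1 if count_n_terminus else 0)


def charging_regime(sequence: str, observed_charge: float) -> str:
    """Compare an observed (e.g. mean) charge with the naive expectation.

    Hypothetical rule for illustration: below the expectation is labeled
    'undercharging', above it 'overcharging'.
    """
    expected = naive_expected_charge(sequence)
    if observed_charge < expected:
        return "undercharging"
    if observed_charge > expected:
        return "overcharging"
    return "as expected"
```

For a tryptic-style peptide such as `PEPTIDEK`, this counts one lysine plus the N-terminus, giving a naive expectation of 2.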
Affiliation(s)
- Allyn M. Xu
- Department of Mathematics, Courant Institute of Mathematical Sciences, New York University, NY, USA
- Lauren C. Tang
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Marko Jovanovic
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Oded Regev
- Computer Science Department, Courant Institute of Mathematical Sciences, New York University, NY, USA
|
41
|
Abelin JG, Bergstrom EJ, Rivera KD, Taylor HB, Klaeger S, Xu C, Verzani EK, Jackson White C, Woldemichael HB, Virshup M, Olive ME, Maynard M, Vartany SA, Allen JD, Phulphagar K, Harry Kane M, Rachimi S, Mani DR, Gillette MA, Satpathy S, Clauser KR, Udeshi ND, Carr SA. Workflow enabling deepscale immunopeptidome, proteome, ubiquitylome, phosphoproteome, and acetylome analyses of sample-limited tissues. Nat Commun 2023; 14:1851. [PMID: 37012232 PMCID: PMC10070353 DOI: 10.1038/s41467-023-37547-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 02/25/2022] [Accepted: 03/20/2023] [Indexed: 04/05/2023] Open
Abstract
Serial multi-omic analysis of proteome, phosphoproteome, and acetylome provides insights into changes in protein expression, cell signaling, cross-talk and epigenetic pathways involved in disease pathology and treatment. However, ubiquitylome and HLA peptidome data collection used to understand protein degradation and antigen presentation have not together been serialized, and instead require separate samples for parallel processing using distinct protocols. Here we present MONTE, a highly sensitive multi-omic native tissue enrichment workflow, that enables serial, deep-scale analysis of HLA-I and HLA-II immunopeptidome, ubiquitylome, proteome, phosphoproteome, and acetylome from the same tissue sample. We demonstrate that the depth of coverage and quantitative precision of each 'ome is not compromised by serialization, and the addition of HLA immunopeptidomics enables the identification of peptides derived from cancer/testis antigens and patient specific neoantigens. We evaluate the technical feasibility of the MONTE workflow using a small cohort of patient lung adenocarcinoma tumors.
Affiliation(s)
- Jennifer G Abelin
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA.
- Erik J Bergstrom
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Keith D Rivera
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Hannah B Taylor
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Susan Klaeger
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Charles Xu
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Eva K Verzani
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- C Jackson White
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Hilina B Woldemichael
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Maya Virshup
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Meagan E Olive
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Myranda Maynard
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Stephanie A Vartany
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Joseph D Allen
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Kshiti Phulphagar
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- M Harry Kane
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Suzanna Rachimi
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- D R Mani
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Michael A Gillette
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Massachusetts General Hospital, Boston, MA, 02114, USA
- Shankha Satpathy
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Karl R Clauser
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA
- Namrata D Udeshi
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA.
- Steven A Carr
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, 02142, USA.
|
42
|
Franciosa G, Locard-Paulet M, Jensen LJ, Olsen JV. Recent advances in kinase signaling network profiling by mass spectrometry. Curr Opin Chem Biol 2023; 73:102260. [PMID: 36657259 DOI: 10.1016/j.cbpa.2022.102260] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/14/2022] [Revised: 12/13/2022] [Accepted: 12/14/2022] [Indexed: 01/19/2023]
Abstract
Mass spectrometry-based phosphoproteomics is currently the leading methodology for the study of global kinase signaling. The scientific community is continuously releasing technological improvements for sensitive and fast identification of phosphopeptides, and their accurate quantification. To interpret large-scale phosphoproteomics data, numerous bioinformatic resources are available that help understanding kinase network functional role in biological systems upon perturbation. Some of these resources are databases of phosphorylation sites, protein kinases and phosphatases; others are bioinformatic algorithms to infer kinase activity, predict phosphosite functional relevance and visualize kinase signaling networks. In this review, we present the latest experimental and bioinformatic tools to profile protein kinase signaling networks and provide examples of their application in biomedicine.
Affiliation(s)
- Giulia Franciosa
- Proteomics Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Marie Locard-Paulet
- Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Lars J Jensen
- Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jesper V Olsen
- Proteomics Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.
|
43
|
Peng J, Chan C, Meng F, Hu Y, Chen L, Lin G, Zhang S, Wheeler AR. Comparison of Database Searching Programs for the Analysis of Single-Cell Proteomics Data. J Proteome Res 2023; 22:1298-1308. [PMID: 36892105 DOI: 10.1021/acs.jproteome.2c00821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/10/2023]
Abstract
Single-cell proteomics is emerging as an important subfield in the proteomics and mass spectrometry communities, with potential to reshape our understanding of cell development, cell differentiation, disease diagnosis, and the development of new therapies. Compared with significant advancements in the "hardware" that is used in single-cell proteomics, there has been little work comparing the effects of using different "software" packages to analyze single-cell proteomics datasets. To this end, seven popular proteomics programs were compared here, applying them to search three single-cell proteomics datasets generated by three different platforms. The results suggest that MSGF+, MSFragger, and Proteome Discoverer are generally more efficient in maximizing protein identifications, that MaxQuant is better suited for the identification of low-abundance proteins, that MSFragger is superior in elucidating peptide modifications, and that Mascot and X!Tandem are better for analyzing long peptides. Furthermore, an experiment with different loading amounts was carried out to investigate changes in identification results and to explore areas in which single-cell proteomics data analysis may be improved in the future. We propose that this comparative study may provide insight for experts and beginners alike operating in the emerging subfield of single-cell proteomics.
Affiliation(s)
- Jiaxi Peng
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario M5S 3G9, Canada
- Calvin Chan
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
- Fei Meng
- Clinical Research Center for Reproduction and Genetics in Hunan Province, Reproductive and Genetic Hospital of CITIC-XIANGYA, Changsha, Hunan 410000, China
- Yechen Hu
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario M5S 3G9, Canada
- Lingfan Chen
- Fujian Province New Drug Safety Evaluation Centre, Fujian Medical University, Fuzhou Fujian 350108, China
- Ge Lin
- Clinical Research Center for Reproduction and Genetics in Hunan Province, Reproductive and Genetic Hospital of CITIC-XIANGYA, Changsha, Hunan 410000, China
- Laboratory of Reproductive and Stem Cell Engineering, NHC Key Laboratory of Human Stem Cell and Reproductive Engineering, Central South University, Changsha, Hunan 410075, China
- Shen Zhang
- Clinical Research Center for Reproduction and Genetics in Hunan Province, Reproductive and Genetic Hospital of CITIC-XIANGYA, Changsha, Hunan 410000, China
- Aaron R Wheeler
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario M5S 3G9, Canada
|
44
|
Neely BA, Dorfer V, Martens L, Bludau I, Bouwmeester R, Degroeve S, Deutsch EW, Gessulat S, Käll L, Palczynski P, Payne SH, Rehfeldt TG, Schmidt T, Schwämmle V, Uszkoreit J, Vizcaíno JA, Wilhelm M, Palmblad M. Toward an Integrated Machine Learning Model of a Proteomics Experiment. J Proteome Res 2023; 22:681-696. [PMID: 36744821 PMCID: PMC9990124 DOI: 10.1021/acs.jproteome.2c00711] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 02/07/2023]
Abstract
In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goal of evaluating and exploring machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and of future opportunities and challenges. In the following perspective we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and inspiring future research.
Affiliation(s)
- Benjamin A Neely
- National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
| | - Viktoria Dorfer
- Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
| | - Isabell Bludau
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
| | - Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
| | - Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | | | - Lukas Käll
- Science for Life Laboratory, KTH - Royal Institute of Technology, 171 21 Solna, Sweden
| | - Pawel Palczynski
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
| | - Samuel H Payne
- Department of Biology, Brigham Young University, Provo, Utah 84602, United States
| | - Tobias Greisager Rehfeldt
- Institute for Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark
| | | | - Veit Schwämmle
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark
| | - Julian Uszkoreit
- Medical Proteome Analysis, Center for Protein Diagnostics (ProDi), Ruhr University Bochum, 44801 Bochum, Germany.,Medizinisches Proteom-Center, Medical Faculty, Ruhr University Bochum, 44801 Bochum, Germany
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Mathias Wilhelm
- Computational Mass Spectrometry, Technical University of Munich (TUM), 85354 Freising, Germany
| | - Magnus Palmblad
- Leiden University Medical Center, Postbus 9600, 2300 RC Leiden, The Netherlands
| |
|
45
|
Ahn R, Cui Y, White FM. Antigen discovery for the development of cancer immunotherapy. Semin Immunol 2023; 66:101733. [PMID: 36841147 DOI: 10.1016/j.smim.2023.101733] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/17/2022] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 02/25/2023]
Abstract
Central to successful cancer immunotherapy is effective T cell antitumor immunity. Multiple targeted immunotherapies engineered to invigorate T cell-driven antitumor immunity rely on identifying the repertoire of T cell antigens expressed on the tumor cell surface. Mass spectrometry-based survey of such antigens ("immunopeptidomics") combined with other omics platforms and computational algorithms has been instrumental in identifying and quantifying tumor-derived T cell antigens. In this review, we discuss the types of tumor antigens that have emerged for targeted cancer immunotherapy and the immunopeptidomics methods that are central to MHC peptide identification and quantification. We provide an overview of the strengths and limitations of mass spectrometry-driven approaches and how they have been integrated with other technologies to discover targetable T cell antigens for cancer immunotherapy. We highlight some of the emerging cancer immunotherapies that have successfully capitalized on immunopeptidomics, their challenges, and mass spectrometry-based strategies that can support their development.
Affiliation(s)
- Ryuhjin Ahn
- David H. Koch Institute for Integrative Cancer Research, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Yufei Cui
- David H. Koch Institute for Integrative Cancer Research, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Forest M White
- David H. Koch Institute for Integrative Cancer Research, Cambridge, MA 02139, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| |
|
46
|
Merlotti A, Sadacca B, Arribas YA, Ngoma M, Burbage M, Goudot C, Houy A, Rocañín-Arjó A, Lalanne A, Seguin-Givelet A, Lefevre M, Heurtebise-Chrétien S, Baudon B, Oliveira G, Loew D, Carrascal M, Wu CJ, Lantz O, Stern MH, Girard N, Waterfall JJ, Amigorena S. Noncanonical splicing junctions between exons and transposable elements represent a source of immunogenic recurrent neo-antigens in patients with lung cancer. Sci Immunol 2023; 8:eabm6359. [PMID: 36735774 DOI: 10.1126/sciimmunol.abm6359] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 09/30/2021] [Accepted: 01/12/2023] [Indexed: 02/05/2023]
Abstract
Although most characterized tumor antigens are encoded by canonical transcripts (such as differentiation or tumor-testis antigens) or mutations (both driver and passenger mutations), recent results have shown that noncanonical transcripts including long noncoding RNAs and transposable elements (TEs) can also encode tumor-specific neo-antigens. Here, we investigate the presentation and immunogenicity of tumor antigens derived from noncanonical mRNA splicing events between coding exons and TEs. Comparing human non-small cell lung cancer (NSCLC) and diverse healthy tissues, we identified a subset of splicing junctions that is both tumor specific and shared across patients. We used HLA-I peptidomics to identify peptides encoded by tumor-specific junctions in primary NSCLC samples and lung tumor cell lines. Recurrent junction-encoded peptides were immunogenic in vitro, and CD8+ T cells specific for junction-encoded epitopes were present in tumors and tumor-draining lymph nodes from patients with NSCLC. We conclude that noncanonical splicing junctions between exons and TEs represent a source of recurrent, immunogenic tumor-specific antigens in patients with NSCLC.
Affiliation(s)
- Antonela Merlotti
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| | - Benjamin Sadacca
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
- Department of Translational Research, PSL Research University, Institut Curie Research Center, Paris, France
| | - Yago A Arribas
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| | - Mercia Ngoma
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| | - Marianne Burbage
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| | - Christel Goudot
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| | - Alexandre Houy
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
| | - Ares Rocañín-Arjó
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| | - Ana Lalanne
- Institut Curie, Laboratory of Clinical immunology, 75005 Paris, France
- Institut Curie, CIC-BT1428, 75005 Paris, France
| | - Agathe Seguin-Givelet
- Thoracic Surgery Department, Curie-Montsouris Thorax Institute - Institut Mutualiste Montsouris, Paris, France
- Paris 13 University, Sorbonne Paris Cité, Faculty of Medicine SMBH, Bobigny, France
| | - Marine Lefevre
- Department of Pathology, Institute Mutualiste Montsouris, Paris, France
| | | | - Blandine Baudon
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| | - Giacomo Oliveira
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Damarys Loew
- Institut Curie, Centre de Recherche, Laboratoire de Spectrométrie de Masse Protéomique, PSL Research University, Paris cedex 05, France
| | - Montserrat Carrascal
- Biological and Environmental Proteomics, Institut d'Investigacions Biomèdiques de Barcelona-CSIC, IDIBAPS, Roselló 161, 6a planta, 08036 Barcelona, Spain
| | - Catherine J Wu
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Olivier Lantz
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
- Institut Curie, Laboratory of Clinical immunology, 75005 Paris, France
- Institut Curie, CIC-BT1428, 75005 Paris, France
| | - Marc-Henri Stern
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
| | - Nicolas Girard
- Thoracic Surgery Department, Curie-Montsouris Thorax Institute - Institut Mutualiste Montsouris, Paris, France
| | - Joshua J Waterfall
- INSERM U830, PSL Research University, Institute Curie Research Center, Paris, France
- Department of Translational Research, PSL Research University, Institut Curie Research Center, Paris, France
| | - Sebastian Amigorena
- Institut Curie, Université Paris Sciences et Lettres, INSERM U932, 75005 Paris, France
| |
|
47
|
Yeung D, Spicer V, Zahedi RP, Krokhin O. Exploring the variable space of shallow machine learning models for reversed-phase retention time prediction. Comput Struct Biotechnol J 2023; 21:2446-2453. [PMID: 37090433 PMCID: PMC10113922 DOI: 10.1016/j.csbj.2023.02.047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 02/24/2023] [Accepted: 02/24/2023] [Indexed: 03/02/2023] Open
Abstract
Peptide retention time (RT) prediction algorithms are tools to study and identify the physicochemical properties that drive the peptide-sorbent interaction. Traditional RT algorithms use multiple linear regression with manually curated parameters to determine the degree of direct contribution of each parameter, and improvements in RT prediction accuracy relied on superior feature engineering. Deep learning led to a significant increase in RT prediction accuracy and automated feature engineering by chaining multiple learning modules. However, the significance and identity of these extracted variables are not well understood because of the inherent complexity of interpreting the "relationships-of-relationships" found in deep learning variables. To achieve both accuracy and interpretability, we isolated individual modules used in deep learning; these isolated modules are the shallow learners employed for RT prediction in this work. Using a shallow convolutional neural network (CNN) and gated recurrent unit (GRU), we find that the spatial features obtained via the CNN correlate with real-world physicochemical properties, namely collisional cross sections (CCS) and variations of accessible surface area (ASA). Furthermore, we determined that the discovered parameters are "micro-coefficients" that contribute to the "macro-coefficient", hydrophobicity. Manually embedding CCS and the variations of ASA in the GRU model yielded an R² = 0.981 using only 525 variables and can represent 88% of the ∼110,000 tryptic peptides used in our dataset. This work highlights that the feature discovery process of our shallow learners can exceed traditional RT models in performance and offers better interpretability than the deep learning RT algorithms found in the literature.
|
48
|
Identification of Alternative Splicing in Proteomes of Human Melanoma Cell Lines without RNA Sequencing Data. Int J Mol Sci 2023; 24:ijms24032466. [PMID: 36768787 PMCID: PMC9916885 DOI: 10.3390/ijms24032466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 01/06/2023] [Accepted: 01/13/2023] [Indexed: 01/31/2023] Open
Abstract
Alternative splicing is one of the main regulation pathways in living cells beyond simple changes in the level of protein expression. Most of the approaches proposed in proteomics for the identification of specific splicing isoforms require a preliminary deep transcriptomic analysis of the sample under study, which is not always available, especially in the case of the re-analysis of previously acquired data. Herein, we developed new algorithms for the identification and validation of protein splice isoforms in proteomic data in the absence of RNA sequencing of the samples under study. The bioinformatic approaches were tested on the results of proteome analysis of human melanoma cell lines, obtained earlier by high-resolution liquid chromatography and mass spectrometry (LC-MS). A search for alternative splicing events for each of the cell lines studied was performed against the database generated from all known transcripts (RefSeq) and the one composed of peptide sequences, which included all biologically possible combinations of exons. The identifications were filtered using the prediction of both retention times and relative intensities of fragment ions in the corresponding mass spectra. The fragmentation mass spectra corresponding to the discovered alternative splicing events were additionally examined for artifacts. Selected splicing events were further validated at the mRNA level by quantitative PCR.
|
49
|
Rehfeldt T, Gabriels R, Bouwmeester R, Gessulat S, Neely BA, Palmblad M, Perez-Riverol Y, Schmidt T, Vizcaíno JA, Deutsch EW. ProteomicsML: An Online Platform for Community-Curated Data sets and Tutorials for Machine Learning in Proteomics. J Proteome Res 2023; 22:632-636. [PMID: 36693629 PMCID: PMC9903315 DOI: 10.1021/acs.jproteome.2c00629] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/26/2023]
Abstract
Data set acquisition and curation are often the most difficult and time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based liquid chromatography (LC) coupled to mass spectrometry (MS) data sets, due to the high levels of data reduction that occur between raw data and machine learning-ready data. Since predictive proteomics is an emerging field, when predicting peptide behavior in LC-MS setups, each lab often uses unique and complex data processing pipelines in order to maximize performance, at the cost of accessibility and reproducibility. For this reason we introduce ProteomicsML, an online resource for proteomics-based data sets and tutorials across most of the currently explored physicochemical peptide properties. This community-driven resource makes it simple to access data in easy-to-process formats, and contains easy-to-follow tutorials that allow new users to interact with even the most advanced algorithms in the field. ProteomicsML provides data sets that are useful for comparing state-of-the-art machine learning algorithms, as well as providing introductory material for teachers and newcomers to the field alike. The platform is freely available at https://www.proteomicsml.org/, and we welcome the entire proteomics community to contribute to the project at https://github.com/ProteomicsML/ProteomicsML.
Affiliation(s)
- Tobias G. Rehfeldt
- Institute for Mathematics and Computer Science, University of Southern Denmark, 5000 Odense, Denmark
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
| | | | - Benjamin A. Neely
- National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
| | - Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | | | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom. Juan Antonio Vizcaíno: Phone: +44 (0) 1223 492686
| | - Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States. Eric Deutsch: Phone: 206-732-1200, Fax: 206-732-1299
| |
|
50
|
Cox J. Prediction of peptide mass spectral libraries with machine learning. Nat Biotechnol 2023; 41:33-43. [PMID: 36008611 DOI: 10.1038/s41587-022-01424-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 03/01/2022] [Accepted: 07/11/2022] [Indexed: 01/21/2023]
Abstract
The recent development of machine learning methods to identify peptides in complex mass spectrometric data constitutes a major breakthrough in proteomics. Longstanding methods for peptide identification, such as search engines and experimental spectral libraries, are being superseded by deep learning models that allow the fragmentation spectra of peptides to be predicted from their amino acid sequence. These new approaches, including recurrent neural networks and convolutional neural networks, use predicted in silico spectral libraries rather than experimental libraries to achieve higher sensitivity and/or specificity in the analysis of proteomics data. Machine learning is galvanizing applications that involve large search spaces, such as immunopeptidomics and proteogenomics. Current challenges in the field include the prediction of spectra for peptides with post-translational modifications and for cross-linked pairs of peptides. Permeation of machine-learning-based spectral prediction into search engines and spectrum-centric data-independent acquisition workflows for diverse peptide classes and measurement conditions will continue to push sensitivity and dynamic range in proteomics applications in the coming years.
Affiliation(s)
- Jürgen Cox
- Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany.
- Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway.
| |
|