1
Geiszler DJ, Polasky DA, Yu F, Nesvizhskii AI. Detecting diagnostic features in MS/MS spectra of post-translationally modified peptides. Nat Commun 2023; 14:4132. [PMID: 37438360 DOI: 10.1038/s41467-023-39828-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 09/04/2022] [Accepted: 06/23/2023] [Indexed: 07/14/2023]
Abstract
Post-translational modifications are an area of great interest in mass spectrometry-based proteomics, with a surge in methods to detect them in recent years. However, post-translational modifications can introduce complexity into proteomics searches by fragmenting in unexpected ways, ultimately hindering the detection of modified peptides. To address these deficiencies, we present a fully automated method to find diagnostic spectral features for any modification. The features can be incorporated into proteomics search engines to improve modified peptide recovery and localization. We show the utility of this approach by interrogating fragmentation patterns for a cysteine-reactive chemoproteomic probe, RNA-crosslinked peptides, sialic acid-containing glycopeptides, and ADP-ribosylated peptides. We also analyze the interactions between a diagnostic ion's intensity and its statistical properties. This method has been incorporated into the open-search annotation tool PTM-Shepherd and the FragPipe computational platform.
Affiliation(s)
- Daniel J Geiszler
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Daniel A Polasky
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Fengchao Yu
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Alexey I Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
2
Desaire H, Go EP, Hua D. Advances, obstacles, and opportunities for machine learning in proteomics. Cell Reports Physical Science 2022; 3:101069. [PMID: 36381226 PMCID: PMC9648337 DOI: 10.1016/j.xcrp.2022.101069] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/16/2023]
Abstract
The fields of proteomics and machine learning are both large disciplines, each producing well over 5,000 publications per year. However, studies combining both fields are still relatively rare, with only about 2% of recent proteomics papers including machine learning. This review, which focuses on the intersection of the fields, is intended to inspire proteomics researchers to develop skills and knowledge in the application of machine learning. A brief tutorial introduction to machine learning is provided, and research advances that rely on both fields, particularly as they relate to proteomics tools development and biomarker discovery, are highlighted. Key knowledge gaps and opportunities for scientific advancement are also enumerated.
Affiliation(s)
- Heather Desaire
  - Department of Chemistry, University of Kansas, Lawrence, KS 66045, USA
- Eden P. Go
  - Department of Chemistry, University of Kansas, Lawrence, KS 66045, USA
- David Hua
  - Department of Chemistry, University of Kansas, Lawrence, KS 66045, USA
3
Altenburg T, Giese SH, Wang S, Muth T, Renard BY. Ad hoc learning of peptide fragmentation from mass spectra enables an interpretable detection of phosphorylated and cross-linked peptides. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00467-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
Mass spectrometry-based proteomics provides a holistic snapshot of the entire protein set of living cells on a molecular level. Currently, only a few deep learning approaches exist that involve peptide fragmentation spectra, which represent partial sequence information of proteins. Commonly, these approaches lack the ability to characterize less studied or even unknown patterns in spectra because of their use of explicit domain knowledge. Here, to elevate unrestricted learning from spectra, we introduce ‘ad hoc learning of fragmentation’ (AHLF), a deep learning model that is end-to-end trained on 19.2 million spectra from several phosphoproteomic datasets. AHLF is interpretable, and we show that peak-level feature importance values and pairwise interactions between peaks are in line with corresponding peptide fragments. We demonstrate our approach by detecting post-translational modifications, specifically protein phosphorylation based on only the fragmentation spectrum without a database search. AHLF increases the area under the receiver operating characteristic curve (AUC) by an average of 9.4% on recent phosphoproteomic data compared with the current state of the art on this task. Furthermore, use of AHLF in rescoring search results increases the number of phosphopeptide identifications by a margin of up to 15.1% at a constant false discovery rate. To show the broad applicability of AHLF, we use transfer learning to also detect cross-linked peptides, as used in protein structure analysis, with an AUC of up to 94%.
4
Musiani D, Massignani E, Cuomo A, Yadav A, Bonaldi T. Biochemical and Computational Approaches for the Large-Scale Analysis of Protein Arginine Methylation by Mass Spectrometry. Curr Protein Pept Sci 2021; 21:725-739. [PMID: 32338214 DOI: 10.2174/1389203721666200426232531] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/14/2019] [Revised: 12/20/2019] [Accepted: 12/24/2019] [Indexed: 12/27/2022]
Abstract
The absence of efficient mass spectrometry-based approaches for the large-scale analysis of protein arginine methylation has hindered the understanding of its biological role, beyond the transcriptional regulation occurring through histone modification. In the last decade, however, several technological advances of both the biochemical methods for methylated polypeptide enrichment and the computational pipelines for MS data analysis have considerably boosted this research field, generating novel insights about the extent and role of this post-translational modification. Here, we offer an overview of state-of-the-art approaches for the high-confidence identification and accurate quantification of protein arginine methylation by high-resolution mass spectrometry methods, which comprise the development of both biochemical and bioinformatics methods. The further optimization and systematic application of these analytical solutions will lead to ground-breaking discoveries on the role of protein methylation in biological processes.
Affiliation(s)
- Daniele Musiani
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan 20139, Italy
- Enrico Massignani
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan 20139, Italy
- Alessandro Cuomo
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan 20139, Italy
- Avinash Yadav
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan 20139, Italy
- Tiziana Bonaldi
  - Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan 20139, Italy
5
Toropova AP, Toropov AA. Application of the Monte Carlo Method for the Prediction of Behavior of Peptides. Curr Protein Pept Sci 2019; 20:1151-1157. [DOI: 10.2174/1389203720666190123163907] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/23/2018] [Revised: 12/17/2018] [Accepted: 12/20/2018] [Indexed: 12/26/2022]
Abstract
Prediction of physicochemical and biochemical behavior of peptides is an important and attractive task of the modern natural sciences, since these substances have a key role in life processes. The Monte Carlo technique is a possible way to solve the above task. The Monte Carlo method is a tool with different applications relative to the study of peptides: (i) analysis of the 3D configurations (conformers); (ii) establishment of quantitative structure-property/activity relationships (QSPRs/QSARs); and (iii) development of databases on the biopolymers. Current ideas related to application of the Monte Carlo technique for studying peptides and biopolymers have been discussed in this review.
Affiliation(s)
- Alla P. Toropova
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
- Andrey A. Toropov
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
6
Hernandez-Valladares M, Wangen R, Berven FS, Guldbrandsen A. Protein Post-Translational Modification Crosstalk in Acute Myeloid Leukemia Calls for Action. Curr Med Chem 2019; 26:5317-5337. [PMID: 31241430 DOI: 10.2174/0929867326666190503164004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/07/2018] [Revised: 11/23/2018] [Accepted: 02/01/2019] [Indexed: 01/24/2023]
Abstract
BACKGROUND: Post-translational modification (PTM) crosstalk is a young research field. However, there is now evidence of the extraordinary characterization of the different proteoforms and their interactions in a biological environment that PTM crosstalk studies can describe. Besides gene expression and phosphorylation profiling of acute myeloid leukemia (AML) samples, the functional combination of several PTMs that might contribute to a better understanding of the complexity of the AML proteome remains to be discovered.
OBJECTIVE: By reviewing current workflows for the simultaneous enrichment of several PTMs and bioinformatics tools to analyze mass spectrometry (MS)-based data, our major objective is to introduce the PTM crosstalk field to the AML research community.
RESULTS: After an introduction to PTMs and PTM crosstalk, this review introduces several protocols for the simultaneous enrichment of PTMs. Two of them allow a simultaneous enrichment of at least three PTMs when using 0.5-2 mg of cell lysate. We have reviewed many of the bioinformatics tools used for PTM crosstalk discovery as its complex data analysis, mainly generated from MS, becomes challenging for most AML researchers. We have presented several non-AML PTM crosstalk studies throughout the review in order to show how important the characterization of PTM crosstalk becomes for the selection of disease biomarkers and therapeutic targets.
CONCLUSION: Herein, we have reviewed the advances and pitfalls of the emerging PTM crosstalk field and its potential contribution to unravel the heterogeneity of AML. The complexity of sample preparation and bioinformatics workflows demands a good interaction between experts of several areas.
Affiliation(s)
- Maria Hernandez-Valladares
  - Department of Clinical Science, Faculty of Medicine, University of Bergen, Jonas Lies vei 87, N-5021 Bergen, Norway
  - The Proteomics Unit at the University of Bergen, Department of Biomedicine, Building for Basic Biology, Faculty of Medicine, University of Bergen, Jonas Lies vei 91, N-5009 Bergen, Norway
- Rebecca Wangen
  - Department of Clinical Science, Faculty of Medicine, University of Bergen, Jonas Lies vei 87, N-5021 Bergen, Norway
  - The Proteomics Unit at the University of Bergen, Department of Biomedicine, Building for Basic Biology, Faculty of Medicine, University of Bergen, Jonas Lies vei 91, N-5009 Bergen, Norway
  - Department of Internal Medicine, Hematology Section, Haukeland University Hospital, Jonas Lies vei 65, N-5021 Bergen, Norway
- Frode S Berven
  - The Proteomics Unit at the University of Bergen, Department of Biomedicine, Building for Basic Biology, Faculty of Medicine, University of Bergen, Jonas Lies vei 91, N-5009 Bergen, Norway
- Astrid Guldbrandsen
  - The Proteomics Unit at the University of Bergen, Department of Biomedicine, Building for Basic Biology, Faculty of Medicine, University of Bergen, Jonas Lies vei 91, N-5009 Bergen, Norway
  - Computational Biology Unit, Department of Informatics, Faculty of Mathematics and Natural Sciences, University of Bergen, Thormøhlensgt 55, N-5008 Bergen, Norway
7
Gao J, Yang F, Che J, Han Y, Wang Y, Chen N, Bak DW, Lai S, Xie X, Weerapana E, Wang C. Selenium-Encoded Isotopic Signature Targeted Profiling. ACS Central Science 2018; 4:960-970. [PMID: 30159393 PMCID: PMC6107865 DOI: 10.1021/acscentsci.8b00112] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Received: 02/14/2018] [Indexed: 05/09/2023]
Abstract
Selenium (Se), as an essential trace element, plays crucial roles in many organisms including humans. The biological functions of selenium are mainly mediated by selenoproteins, a unique class of selenium-containing proteins in which selenium is inserted in the form of selenocysteine. Due to their low abundance and uneven tissue distribution, detection of selenoproteins within proteomes is very challenging, and therefore functional studies of these proteins are limited. In this study, we developed a computational method, named as selenium-encoded isotopic signature targeted profiling (SESTAR), which utilizes the distinct natural isotopic distribution of selenium to assist detection of trace selenium-containing signals from shotgun-proteomic data. SESTAR can detect femtomole quantities of synthetic selenopeptides in a benchmark test and dramatically improved detection of native selenoproteins from tissue proteomes in a targeted profiling mode. By applying SESTAR to screen publicly available datasets from Human Proteome Map, we provide a comprehensive picture of selenoprotein distributions in human primary hematopoietic cells and tissues. We further demonstrated that SESTAR can aid chemical-proteomic strategies to identify additional selenoprotein targets of RSL3, a canonical inducer of cell ferroptosis. We believe SESTAR not only serves as a powerful tool for global profiling of native selenoproteomes, but can also work seamlessly with chemical-proteomic profiling strategies to enhance identification of target proteins, post-translational modifications, or protein-protein interactions.
Affiliation(s)
- Jinjun Gao
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Fan Yang
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Jinteng Che
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yu Han
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yankun Wang
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Nan Chen
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Daniel W. Bak
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Shuchang Lai
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Xiao Xie
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Eranthie Weerapana
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Chu Wang
  - Synthetic and Functional Biomolecules Center; Beijing National Laboratory for Molecular Sciences; Key Laboratory of Bioorganic Chemistry and Molecular Engineering of the Ministry of Education; College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China