1
Frejno M, Berger MT, Tüshaus J, Hogrebe A, Seefried F, Graber M, Samaras P, Ben Fredj S, Sukumar V, Eljagh L, Bronshtein I, Mamisashvili L, Schneider M, Gessulat S, Schmidt T, Kuster B, Zolg DP, Wilhelm M. Unifying the analysis of bottom-up proteomics data with CHIMERYS. Nat Methods 2025; 22:1017-1027. [PMID: 40263583 PMCID: PMC12074992 DOI: 10.1038/s41592-025-02663-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Accepted: 03/06/2025] [Indexed: 04/24/2025]
Abstract
Proteomic workflows generate vastly complex peptide mixtures that are analyzed by liquid chromatography-tandem mass spectrometry, creating thousands of spectra, most of which are chimeric and contain fragment ions from more than one peptide. Because of differences in data acquisition strategies such as data-dependent, data-independent or parallel reaction monitoring, separate software packages employing different analysis concepts are used for peptide identification and quantification, even though the underlying information is principally the same. Here, we introduce CHIMERYS, a spectrum-centric search algorithm designed for the deconvolution of chimeric spectra that unifies proteomic data analysis. Using accurate predictions of peptide retention time, fragment ion intensities and applying regularized linear regression, it explains as much fragment ion intensity as possible with as few peptides as possible. Together with rigorous false discovery rate control, CHIMERYS accurately identifies and quantifies multiple peptides per tandem mass spectrum in data-dependent, data-independent or parallel reaction monitoring experiments.
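The "as much fragment ion intensity as possible with as few peptides as possible" idea can be illustrated with a small sparse regression. The sketch below is not the CHIMERYS implementation: the abstract only says "regularized linear regression", so the L1 penalty, the non-negativity constraint, the coordinate-descent solver, and all names and toy numbers are assumptions made for illustration.

```python
def nonneg_lasso(templates, observed, lam=0.1, n_iter=200):
    """Minimise 0.5*||observed - sum_j c_j*templates[j]||^2 + lam*sum(c), c >= 0.

    templates: one list of predicted fragment intensities per candidate peptide
               (hypothetical inputs; all lists share the same fragment bins).
    observed:  measured intensities of the chimeric spectrum in those bins.
    """
    coef = [0.0] * len(templates)
    residual = list(observed)  # observed minus current fit (fit is 0 at start)
    for _ in range(n_iter):
        for j, template in enumerate(templates):
            norm_sq = sum(t * t for t in template)
            if norm_sq == 0.0:
                continue
            # Dot product of the template with the residual that excludes candidate j
            rho = sum(t * (r + coef[j] * t) for t, r in zip(template, residual))
            # Soft-threshold and clip at zero: weak candidates drop out entirely
            new = max(0.0, (rho - lam) / norm_sq)
            delta = new - coef[j]
            if delta != 0.0:
                residual = [r - delta * t for r, t in zip(residual, template)]
                coef[j] = new
    return coef


# Toy example: the spectrum is 2 x peptide A + 1 x peptide B; C is a distractor.
templates = [
    [1.0, 0.0, 0.5, 0.0],  # candidate A
    [0.0, 1.0, 0.0, 0.5],  # candidate B
    [0.5, 0.5, 0.0, 0.0],  # candidate C (shares bins with A and B)
]
observed = [2.0, 1.0, 1.0, 0.5]
coefficients = nonneg_lasso(templates, observed, lam=0.1)
print(coefficients)
```

In this toy case the coefficients for A and B come out slightly below 2 and 1 (the penalty shrinks them), while the distractor C is set to zero. A nonzero coefficient can then double as a rough abundance estimate for that peptide.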
Affiliation(s)
- Johanna Tüshaus
  - School of Life Sciences, Technical University of Munich, Freising, Germany
- Bernhard Kuster
  - School of Life Sciences, Technical University of Munich, Freising, Germany
  - Munich Data Science Institute (MDSI), Technical University of Munich, Garching b. München, Germany
- Mathias Wilhelm
  - School of Life Sciences, Technical University of Munich, Freising, Germany
  - Munich Data Science Institute (MDSI), Technical University of Munich, Garching b. München, Germany

2
Wang Z, Xiong X, Liu X. Proteoform identification using multiplexed top-down mass spectra. bioRxiv 2025:2025.02.05.636727. [PMID: 39975217 PMCID: PMC11839095 DOI: 10.1101/2025.02.05.636727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Top-down mass spectrometry (TDMS) is the method of choice for analyzing intact proteoforms, as well as their post-translational modifications and sequence variations. In TDMS experiments, multiple proteoforms are often co-fragmented in tandem mass spectrometry (MS/MS) analysis, resulting in multiplexed TD-MS/MS spectra. Since multiplexed TD-MS/MS spectra are more complex than common spectra generated from single proteoforms, these spectra pose a significant challenge for proteoform identification and quantification. Here we present TopMPI, a new computational tool specifically designed for the identification of multiplexed TD-MS/MS spectra. Experimental results demonstrate that TopMPI significantly increases proteoform identifications and reduces identification errors in multiplexed TD-MS/MS spectral analysis compared to existing tools.
Affiliation(s)
- Zhige Wang
  - Department of Computer Science, Tulane University, New Orleans, Louisiana, 70112, United States
- Xingzhao Xiong
  - Deming Department of Medicine, Tulane University, New Orleans, Louisiana, 70112, United States
- Xiaowen Liu
  - Deming Department of Medicine, Tulane University, New Orleans, Louisiana, 70112, United States

3
Zhan Z, Wang L. Proteoform identification and quantification based on alignment graphs. Bioinformatics 2024; 41:btaf007. [PMID: 39786854 PMCID: PMC11769674 DOI: 10.1093/bioinformatics/btaf007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Revised: 01/02/2025] [Accepted: 01/06/2025] [Indexed: 01/12/2025] [Open Access]
Abstract
MOTIVATION: Proteoforms are the different forms of a protein generated from the genome with various sequence variations, splice isoforms, and post-translational modifications. Proteoforms regulate protein structures and functions. A single protein can have multiple proteoforms due to different modification sites. Proteoform identification aims to find the proteoforms of a given protein that best fit the input spectrum. Proteoform quantification aims to find the corresponding abundances of the different proteoforms of a specific protein. RESULTS: We propose algorithms for proteoform identification and quantification based on the top-down tandem mass spectrum. In the combination alignments of the HomMTM spectrum and the reference protein, the mass of each matched peak must be corrected within the pre-defined error range. After the correction, we require that the mass between any two (not necessarily consecutive) matched nodes in the protein be identical to that of the corresponding two matched peaks in the HomMTM spectrum. We design a back-tracking graph to store this kind of information and find a combinatorial path (k paths) with the minimum sum of peak intensity error in this back-tracking graph. The obtained alignment also shows the relative abundance of these proteoforms (paths). Our experimental results demonstrate the algorithm's capability to identify and quantify proteoform combinations encompassing a greater number of peaks. This advancement holds promise for enhancing the accuracy and comprehensiveness of proteoform quantification, addressing a crucial need in the field of top-down MS-based proteomics. AVAILABILITY AND IMPLEMENTATION: The software package is available at https://github.com/Zeirdo/TopMGQuant.
Affiliation(s)
- Zhaohui Zhan
  - Department of Engineering, Shenzhen MSU-BIT University, Shenzhen, 518172, China
  - Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, China
- Lusheng Wang
  - Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, China
  - City University of Hong Kong Shenzhen Research Institution, 518057, China

4
Jeong K, Kaulich PT, Jung W, Kim J, Tholey A, Kohlbacher O. Precursor deconvolution error estimation: The missing puzzle piece in false discovery rate in top-down proteomics. Proteomics 2024; 24:e2300068. [PMID: 37997224 DOI: 10.1002/pmic.202300068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 11/09/2023] [Accepted: 11/13/2023] [Indexed: 11/25/2023]
Abstract
Top-down proteomics (TDP) directly analyzes intact proteins and thus provides more comprehensive qualitative and quantitative proteoform-level information than conventional bottom-up proteomics (BUP) that relies on digested peptides and protein inference. While significant advancements have been made in TDP in sample preparation, separation, instrumentation, and data analysis, reliable and reproducible data analysis still remains one of the major bottlenecks in TDP. A key step for robust data analysis is the establishment of an objective estimation of proteoform-level false discovery rate (FDR) in proteoform identification. The most widely used FDR estimation scheme is based on the target-decoy approach (TDA), which has primarily been established for BUP. We present evidence that the TDA-based FDR estimation may not work at the proteoform-level due to an overlooked factor, namely the erroneous deconvolution of precursor masses, which leads to incorrect FDR estimation. We argue that the conventional TDA-based FDR in proteoform identification is in fact protein-level FDR rather than proteoform-level FDR unless precursor deconvolution error rate is taken into account. To address this issue, we propose a formula to correct for proteoform-level FDR bias by combining TDA-based FDR and precursor deconvolution error rate.
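For readers unfamiliar with the target-decoy approach mentioned above, here is a minimal sketch of the conventional estimate (decoy hits divided by target hits above a score threshold). It shows only the standard TDA baseline. The abstract does not give the authors' correction formula, so it is not reproduced here. The function name, input layout, and scores are hypothetical.

```python
def tda_fdr(matches, threshold):
    """Conventional target-decoy FDR estimate at a score threshold.

    matches: iterable of (score, is_decoy) pairs, one per identification
             (hypothetical layout chosen for this sketch).
    Returns (#decoys passing) / (#targets passing), or 0.0 if no target passes.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Toy identifications: three targets and one decoy score at least 7.
matches = [(10, False), (9, False), (8, True), (7, False), (6, True)]
print(tda_fdr(matches, threshold=7))
```

The abstract's point is that an estimate of this kind does not account for wrongly deconvolved precursor masses, so on its own it behaves like a protein-level rather than a proteoform-level FDR.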
Affiliation(s)
- Kyowon Jeong
  - Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Philipp T Kaulich
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Wonhyeuk Jung
  - Department of Cell Biology, Yale School of Medicine, New Haven, Connecticut, USA
- Jihyung Kim
  - Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Andreas Tholey
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Oliver Kohlbacher
  - Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
  - Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany

5
Adair LR, Jones I, Cramer R. Utilizing Precursor Ion Connectivity of Different Charge States to Improve Peptide and Protein Identification in MS/MS Analysis. Anal Chem 2024; 96:985-990. [PMID: 38193749 PMCID: PMC10809226 DOI: 10.1021/acs.analchem.3c03061] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/12/2023] [Revised: 12/12/2023] [Accepted: 12/13/2023] [Indexed: 01/10/2024]
Abstract
Tandem mass spectrometry (MS/MS) has become a key method for the structural analysis of biomolecules such as peptides and proteins. A pervasive problem in MS/MS analyses, especially for top-down proteomics, is the occurrence of chimeric spectra, when two or more precursor ions are co-isolated and fragmented, thus leading to complex MS/MS spectra that are populated with fragment ions originating from different precursor ions. This type of convoluted data typically results in low sequence database search scores due to the vast number of mixed-source fragment ions, of which only a fraction originates from a specific precursor ion. Herein, we present a novel workflow that deconvolutes the data of chimeric MS/MS spectra, improving the protein search scores and sequence coverages in database searching and thus providing a more confident peptide and protein identification. Previously misidentified proteins or proteins with insignificant search scores can be correctly and significantly identified following the presented data acquisition and analysis workflow with search scores increasing by a factor of 3-4 for smaller precursor ions (peptides) and >6 for larger precursor ions such as intact ubiquitin and cytochrome C.
Affiliation(s)
- Lily R. Adair
  - Department of Chemistry, University of Reading, Whiteknights, Reading RG6 6DX, United Kingdom
- Ian Jones
  - School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AJ, United Kingdom
- Rainer Cramer
  - Department of Chemistry, University of Reading, Whiteknights, Reading RG6 6DX, United Kingdom

6
Zhang W, Liang Z, Chen X, Xin L, Shan B, Luo Z, Li M. ChimST: An Efficient Spectral Library Search Tool for Peptide Identification from Chimeric Spectra in Data-Dependent Acquisition. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:1416-1425. [PMID: 31603795 DOI: 10.1109/tcbb.2019.2945954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Accurate and sensitive identification of peptides from MS/MS spectra is a very challenging problem in computational shotgun proteomics. To tackle this problem, spectral library search has been one of the competitive solutions. However, most existing library search tools were developed on the basis of one peptide per spectrum, which prevents them from working properly on chimeric spectra where two or more peptides are co-fragmented. In this work, we present a new library search tool called ChimST, which is particularly capable of reliably identifying multiple peptides from a chimeric spectrum. It starts with associating each query MS/MS spectrum with MS precursor features. For each precursor feature, there is a list of peptide candidates extracted from an input spectral library. Then, it takes one peptide candidate from each associated feature and scores how well they could collectively interpret the query spectrum. The highest-scoring set of peptide candidates are finally reported as the identification of the query spectrum. Our experimental tests show that ChimST could significantly outperform the three state-of-the-art library search tools, SpectraST, reSpect, and MSPLIT, in terms of the numbers of both peptide-spectrum matches and unique peptides, especially when the acquisition isolation window is broad.
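The search strategy described above (one candidate per precursor feature, scored jointly against the query spectrum) can be sketched as an exhaustive search over candidate combinations. This is an illustration only. The abstract does not specify ChimST's scoring function, so the score used here (fraction of query intensity covered by the candidates' fragment bins) and the data layout, names, and numbers are all assumptions.

```python
from itertools import product


def best_candidate_set(query, features):
    """Pick one candidate per precursor feature to best explain the query.

    query:    dict mapping fragment bin -> observed intensity.
    features: list of candidate lists, one per precursor feature; each
              candidate is (peptide_name, set_of_fragment_bins).
    The score is the fraction of query intensity in bins covered by the
    chosen candidates (a stand-in score, not the published one).
    """
    total = sum(query.values())
    if total == 0:
        return (), 0.0
    best_names, best_score = (), -1.0
    for combo in product(*features):  # one candidate from each feature
        covered = set().union(*(bins for _, bins in combo))
        score = sum(i for b, i in query.items() if b in covered) / total
        if score > best_score:
            best_names = tuple(name for name, _ in combo)
            best_score = score
    return best_names, best_score


# Toy query with two associated precursor features (peptide names are made up).
query = {100: 5.0, 200: 3.0, 300: 4.0, 400: 2.0}
features = [
    [("PEPA", {100, 200}), ("PEPB", {100})],
    [("PEPC", {300, 400}), ("PEPD", {200, 300})],
]
print(best_candidate_set(query, features))
```

Here the pair PEPA and PEPC covers every bin of the toy query, so it wins over each combination that includes PEPB or PEPD. Exhaustive enumeration grows multiplicatively with the number of features and candidates, so a practical tool would need to prune the search.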

7
Abstract
Metaproteomics can provide critical information about biological systems, but peptides are found within a complex background of other peptides. This complex background can change across samples, in some cases drastically. Cofragmentation, the coelution of peptides with similar mass to charge ratios, is one factor that influences which peptides are identified in an LC-MS/MS experiment: it is dependent on the nature and complexity of this dynamic background. Metaproteomics applications are particularly susceptible to cofragmentation-induced bias; they have vast protein sequence diversity and the abundance of those proteins can span many orders of magnitude. We have developed a mechanistic model that determines the number of potentially cofragmenting peptides in a given sample (called cobia, https://github.com/bertrand-lab/cobia ). We then used previously published data sets to validate our model, showing that the resulting peptide-specific score reflects the cofragmentation "risk" of peptides. Using an Antarctic sea ice edge metatranscriptome case study, we found that more rare taxonomic and functional groups are associated with higher cofragmentation bias. We also demonstrate how cofragmentation scores can be used to guide the selection of protein- or peptide-based biomarkers. We illustrate potential consequences of cofragmentation for multiple metaproteomic approaches, and suggest practical paths forward to cope with cofragmentation-induced bias.
Affiliation(s)
- J Scott P McCain
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
- Erin M Bertrand
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada

8
Zhu K, Liu X. A graph-based approach for proteoform identification and quantification using top-down homogeneous multiplexed tandem mass spectra. BMC Bioinformatics 2018; 19:280. [PMID: 30367573 PMCID: PMC6101081 DOI: 10.1186/s12859-018-2273-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022] [Open Access]
Abstract
Background: Top-down homogeneous multiplexed tandem mass (HomMTM) spectra are generated from modified proteoforms of the same protein with different post-translational modification patterns. They are frequently observed in the analysis of ultramodified proteins, some proteoforms of which have similar molecular weights and cannot be well separated by liquid chromatography in mass spectrometry analysis. Results: We formulate the top-down HomMTM spectral identification problem as the minimum error k-splittable flow problem on graphs and propose a graph-based algorithm for the identification and quantification of proteoforms using top-down HomMTM spectra. Conclusions: Experiments on a top-down mass spectrometry data set of the histone H4 protein showed that the proposed method identified many proteoform pairs that better explain the query spectra than single proteoforms.
Affiliation(s)
- Kaiyuan Zhu
  - Department of Computer Science, Indiana University Bloomington, 700 N. Woodlawn Avenue, Bloomington, IN, 47408, USA
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, 719 Indiana Avenue, Indianapolis, IN, 46202, USA
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 W. 10th Street, Indianapolis, IN, 46202, USA

9
Dorfer V, Maltsev S, Winkler S, Mechtler K. CharmeRT: Boosting Peptide Identifications by Chimeric Spectra Identification and Retention Time Prediction. J Proteome Res 2018; 17:2581-2589. [PMID: 29863353 PMCID: PMC6079931 DOI: 10.1021/acs.jproteome.7b00836] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Indexed: 11/28/2022]
Abstract
Coeluting peptides are still a major challenge for the identification and validation of MS/MS spectra, but carry great potential. To tackle these problems, we have developed the CharmeRT workflow, which combines a chimeric spectra identification strategy, implemented as part of the MS Amanda algorithm, with the validation system Elutator, which incorporates a highly accurate retention time prediction algorithm. For high-resolution data sets, this workflow identifies 38-64% chimeric spectra, which results in up to 63% more unique peptides compared to a conventional single search strategy.
Affiliation(s)
- Viktoria Dorfer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
- Stephan Winkler
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria

10
Madar IH, Ko SI, Kim H, Mun DG, Kim S, Smith RD, Lee SW. Multiplexed Post-Experimental Monoisotopic Mass Refinement (mPE-MMR) to Increase Sensitivity and Accuracy in Peptide Identifications from Tandem Mass Spectra of Cofragmentation. Anal Chem 2017; 89:1244-1253. [PMID: 27966901 PMCID: PMC5627999 DOI: 10.1021/acs.analchem.6b03874] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022]
Abstract
Mass spectrometry (MS)-based proteomics, which uses high-resolution hybrid mass spectrometers such as the quadrupole-orbitrap mass spectrometer, can yield tens of thousands of tandem mass (MS/MS) spectra of high resolution during a routine bottom-up experiment. Despite being a fundamental and key step in MS-based proteomics, the accurate determination and assignment of precursor monoisotopic masses to the MS/MS spectra remains difficult. The difficulties stem from imperfect isotopic envelopes of precursor ions, inaccurate charge states for precursor ions, and cofragmentation. We describe a composite method of utilizing MS data to assign accurate monoisotopic masses to MS/MS spectra, including those subject to cofragmentation. The method, "multiplexed post-experiment monoisotopic mass refinement" (mPE-MMR), consists of the following: multiplexing of precursor masses to assign multiple monoisotopic masses of cofragmented peptides to the corresponding multiplexed MS/MS spectra, multiplexing of charge states to assign correct charges to the precursor ions of MS/MS spectra with no charge information, and mass correction for inaccurate monoisotopic peak picking. When combined with MS-GF+, a database search algorithm based on fragment mass difference, mPE-MMR effectively increases both sensitivity and accuracy in peptide identification from complex high-throughput proteomics data compared to conventional methods.
Affiliation(s)
- Inamul Hasan Madar
  - Laboratory of Gaseous Ion Chemistry, Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul 136-701, South Korea
- Seung-Ik Ko
  - Laboratory of Gaseous Ion Chemistry, Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul 136-701, South Korea
- Hokeun Kim
  - Laboratory of Gaseous Ion Chemistry, Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul 136-701, South Korea
- Dong-Gi Mun
  - Laboratory of Gaseous Ion Chemistry, Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul 136-701, South Korea
- Sangtae Kim
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States
- Richard D. Smith
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States
- Sang-Won Lee
  - Laboratory of Gaseous Ion Chemistry, Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul 136-701, South Korea

11
Na S, Payne SH, Bandeira N. Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks. Mol Cell Proteomics 2016; 15:3501-3512. [PMID: 27609420 PMCID: PMC5098046 DOI: 10.1074/mcp.o116.060913] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/10/2016] [Indexed: 11/25/2022] [Open Access]
Abstract
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
Affiliation(s)
- Seungjin Na
  - Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093
  - Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
- Samuel H Payne
  - Pacific Northwest National Laboratory, Richland, Washington 99354
- Nuno Bandeira
  - Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093
  - Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, 92093

12
Tessier D, Lollier V, Larré C, Rogniaux H. Origin of Disagreements in Tandem Mass Spectra Interpretation by Search Engines. J Proteome Res 2016; 15:3481-3488. [PMID: 27571036 DOI: 10.1021/acs.jproteome.6b00024] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023]
Abstract
Several proteomic database search engines that interpret LC-MS/MS data do not identify the same set of peptides. These disagreements occur even when the scores of the peptide-to-spectrum matches suggest good confidence in the interpretation. Our study shows that these disagreements observed for the interpretations of a given spectrum are almost exclusively due to the variation of what we call the "peptide space", i.e., the set of peptides that are actually compared to the experimental spectra. We discuss the potential difficulties of precisely defining the "peptide space." Indeed, although several parameters that are generally reported in publications can easily be set to the same values, many additional parameters (with much less straightforward user access) might impact the "peptide space" used by each program. Moreover, in a configuration where each search engine identifies the same candidates for each spectrum, the inference of the proteins may remain quite different depending on the false discovery rate selected.
Affiliation(s)
- Dominique Tessier
  - INRA, UR 1268 Biopolymères Interactions Assemblages, F-44300 Nantes, France
- Virginie Lollier
  - INRA, UR 1268 Biopolymères Interactions Assemblages, F-44300 Nantes, France
- Colette Larré
  - INRA, UR 1268 Biopolymères Interactions Assemblages, F-44300 Nantes, France
- Hélène Rogniaux
  - INRA, UR 1268 Biopolymères Interactions Assemblages, F-44300 Nantes, France

13
Gorshkov V, Hotta SYK, Verano-Braga T, Kjeldsen F. Peptide de novo sequencing of mixture tandem mass spectra. Proteomics 2016; 16:2470-9. [PMID: 27329701 PMCID: PMC5297990 DOI: 10.1002/pmic.201500549] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 12/26/2015] [Revised: 04/27/2016] [Accepted: 06/17/2016] [Indexed: 02/02/2023]
Abstract
The impact of mixture spectra deconvolution on the performance of four popular de novo sequencing programs was tested using artificially constructed mixture spectra as well as experimental proteomics data. Mixture fragmentation spectra are recognized as a limitation in proteomics because they decrease the identification performance using database search engines. De novo sequencing approaches are expected to be even more sensitive to the reduction in mass spectrum quality resulting from peptide precursor co‐isolation and thus prone to false identifications. The deconvolution approach matched complementary b‐, y‐ions to each precursor peptide mass, which allowed the creation of virtual spectra containing sequence specific fragment ions of each co‐isolated peptide. Deconvolution processing resulted in equally efficient identification rates but increased the absolute number of correctly sequenced peptides. The improvement was in the range of 20–35% additional peptide identifications for a HeLa lysate sample. Some correct sequences were identified only using unprocessed spectra; however, the number of these was lower than those where improvement was obtained by mass spectral deconvolution. Tight candidate peptide score distribution and high sensitivity to small changes in the mass spectrum introduced by the employed deconvolution method could explain some of the missing peptide identifications.
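The complementary-ion idea above rests on a simple mass relation: for singly charged b- and y-ions from the same peptide, m/z(b) + m/z(y) = M + 2 × 1.007276, where M is the neutral monoisotopic peptide mass. The sketch below uses that relation to split a mixture spectrum into per-precursor "virtual" peak lists. It is an illustration under assumptions, not the authors' procedure: all peaks are taken to be singly charged, and the tolerance, function names, and toy m/z values are hypothetical.

```python
PROTON = 1.007276  # proton mass in Da


def complementary_pairs(peaks, precursor_mass, tol=0.02):
    """Find peak pairs whose m/z values sum to precursor_mass + 2 protons.

    peaks: fragment m/z values, assumed singly charged (sketch assumption).
    precursor_mass: neutral monoisotopic mass of one co-isolated peptide.
    """
    target = precursor_mass + 2 * PROTON
    mzs = sorted(peaks)
    pairs = []
    lo, hi = 0, len(mzs) - 1
    while lo < hi:  # two-pointer scan over the sorted peak list
        total = mzs[lo] + mzs[hi]
        if abs(total - target) <= tol:
            pairs.append((mzs[lo], mzs[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


def virtual_spectra(peaks, precursor_masses, tol=0.02):
    """Map each precursor mass to the peaks that form complementary pairs for it."""
    return {
        mass: sorted(mz for pair in complementary_pairs(peaks, mass, tol) for mz in pair)
        for mass in precursor_masses
    }


# Toy mixture: two pairs belong to M = 1000.0, one pair to M = 1200.0, plus noise.
mixture = [123.4, 300.1, 701.914552, 450.2, 551.814552, 400.3, 801.714552]
print(virtual_spectra(mixture, [1000.0, 1200.0]))
```

Peaks without a complementary partner (the 123.4 peak in the toy data) end up in no virtual spectrum. That is one way a deconvolution step can lose real fragment ions, in line with the abstract's remark that some sequences were identified only from unprocessed spectra.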
Affiliation(s)
- Vladimir Gorshkov
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark Odense M, Odense, Denmark
- Thiago Verano-Braga
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark Odense M, Odense, Denmark
  - Department of Physiology and Biophysics, Federal University of Minas Gerais Belo Horizonte - MG, Belo Horizonte, Brazil
- Frank Kjeldsen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark Odense M, Odense, Denmark

14
Chang C, Zhang J, Xu C, Zhao Y, Ma J, Chen T, He F, Xie H, Zhu Y. Quantitative and In-Depth Survey of the Isotopic Abundance Distribution Errors in Shotgun Proteomics. Anal Chem 2016; 88:6844-51. [DOI: 10.1021/acs.analchem.6b01409] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Affiliation(s)
- Cheng Chang
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, Beijing, 102206, P.R. China
- Jiyang Zhang
  - Department of Automatic Control, College of Mechanical Engineering and Automation, National University of Defense Technology, Changsha, Hunan 410073, P.R. China
- Changming Xu
  - Department of Automatic Control, College of Mechanical Engineering and Automation, National University of Defense Technology, Changsha, Hunan 410073, P.R. China
- Yan Zhao
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, Beijing, 102206, P.R. China
- Jie Ma
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, Beijing, 102206, P.R. China
- Tao Chen
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, Beijing, 102206, P.R. China
- Fuchu He
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, Beijing, 102206, P.R. China
- Hongwei Xie
  - Department of Automatic Control, College of Mechanical Engineering and Automation, National University of Defense Technology, Changsha, Hunan 410073, P.R. China
- Yunping Zhu
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences (Beijing), Beijing Institute of Radiation Medicine, Beijing, 102206, P.R. China

15
Proteomics Is Analytical Chemistry: Fitness-for-Purpose in the Application of Top-Down and Bottom-Up Analyses. Proteomes 2015; 3:440-453. [PMID: 28248279 PMCID: PMC5217385 DOI: 10.3390/proteomes3040440] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 10/23/2015] [Revised: 11/21/2015] [Accepted: 11/26/2015] [Indexed: 11/17/2022] [Open Access]
Abstract
Molecular mechanisms underlying health and disease function at least in part based on the flexibility and fine-tuning afforded by protein isoforms and post-translational modifications. The ability to effectively and consistently resolve these protein species or proteoforms, as well as assess quantitative changes is therefore central to proteomic analyses. Here we discuss the pros and cons of currently available and developing analytical techniques from the perspective of the full spectrum of available tools and their current applications, emphasizing the concept of fitness-for-purpose in experimental design based on consideration of sample size and complexity; this necessarily also addresses analytical reproducibility and its variance. Data quality is considered the primary criterion, and we thus emphasize that the standards of Analytical Chemistry must apply throughout any proteomic analysis.
|
16
|
Abshiru N, Caron-Lizotte O, Rajan RE, Jamai A, Pomies C, Verreault A, Thibault P. Discovery of protein acetylation patterns by deconvolution of peptide isomer mass spectra. Nat Commun 2015; 6:8648. [PMID: 26468920 PMCID: PMC4667697 DOI: 10.1038/ncomms9648] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 06/17/2015] [Accepted: 09/16/2015] [Indexed: 01/29/2023]
Abstract
Protein post-translational modifications (PTMs) play important roles in the control of various biological processes including protein–protein interactions, epigenetics and cell cycle regulation. Mass spectrometry-based proteomics approaches enable comprehensive identification and quantitation of numerous types of PTMs. However, the analysis of PTMs is complicated by the presence of indistinguishable co-eluting isomeric peptides that result in composite spectra with overlapping features that prevent the identification of individual components. In this study, we present Iso-PeptidAce, a novel software tool that enables deconvolution of composite MS/MS spectra of isomeric peptides based on features associated with their characteristic fragment ion patterns. We benchmark Iso-PeptidAce using dilution series prepared from mixtures of known amounts of synthetic acetylated isomers. We also demonstrate its applicability to different biological problems such as the identification of site-specific acetylation patterns in histones bound to chromatin assembly factor-1 and profiling of histone acetylation in cells treated with different classes of HDAC inhibitors. Deciphering patterns of histone modifications that modulate chromatin structure and function is important, but remains challenging. Here the authors describe a method to uncover patterns of site-specific histone acetylation by deconvolution of overlapping peptide isomer mass spectra.
Affiliation(s)
- Nebiyu Abshiru
- Department of Chemistry, Université de Montréal, PO Box 6128, Station centre-ville, Montréal, Québec, Canada H3C 3J7; Institute for Research in Immunology and Cancer, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, Canada H3C 3J7
| | - Olivier Caron-Lizotte
- Institute for Research in Immunology and Cancer, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, Canada H3C 3J7
| | - Roshan Elizabeth Rajan
- Institute for Research in Immunology and Cancer, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, Canada H3C 3J7; Molecular Biology Programme, Université de Montréal, PO Box 6128, Station centre-ville, Montréal, Québec, Canada H3C 3J7
| | - Adil Jamai
- Institute for Research in Immunology and Cancer, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, Canada H3C 3J7
| | - Christelle Pomies
- Institute for Research in Immunology and Cancer, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, Canada H3C 3J7
| | - Alain Verreault
- Department of Chemistry, Université de Montréal, PO Box 6128, Station centre-ville, Montréal, Québec, Canada H3C 3J7; Molecular Biology Programme, Université de Montréal, PO Box 6128, Station centre-ville, Montréal, Québec, Canada H3C 3J7
| | - Pierre Thibault
- Department of Chemistry, Université de Montréal, PO Box 6128, Station centre-ville, Montréal, Québec, Canada H3C 3J7; Institute for Research in Immunology and Cancer, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, Canada H3C 3J7
| |
|
17
|
Ting YS, Egertson JD, Payne SH, Kim S, MacLean B, Käll L, Aebersold R, Smith RD, Noble WS, MacCoss MJ. Peptide-Centric Proteome Analysis: An Alternative Strategy for the Analysis of Tandem Mass Spectrometry Data. Mol Cell Proteomics 2015. [PMID: 26217018 DOI: 10.1074/mcp.o114.047035] [Citation(s) in RCA: 129] [Impact Index Per Article: 12.9] [Indexed: 11/06/2022]
Abstract
In mass spectrometry-based bottom-up proteomics, data-independent acquisition is an emerging technique because of its comprehensive and unbiased sampling of precursor ions. However, current data-independent acquisition methods use wide precursor isolation windows, resulting in cofragmentation and complex mixture spectra. Thus, conventional database searching tools that identify peptides by interpreting individual tandem MS spectra are inherently limited in analyzing data-independent acquisition data. Here we discuss an alternative approach, peptide-centric analysis, which tests directly for the presence and absence of query peptides. We discuss how peptide-centric analysis resolves some limitations of traditional spectrum-centric analysis, and we outline the unique characteristics of peptide-centric analysis in general.
Affiliation(s)
- Ying S Ting
- Department of Genome Sciences, University of Washington, Seattle, Washington
| | - Jarrett D Egertson
- Department of Genome Sciences, University of Washington, Seattle, Washington
| | - Samuel H Payne
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington
| | - Sangtae Kim
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington
| | - Brendan MacLean
- Department of Genome Sciences, University of Washington, Seattle, Washington
| | - Lukas Käll
- Science for Life Laboratory, Royal Institute of Technology (KTH), Stockholm, Sweden
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, Swiss Federal Institute of Technology (ETH) Zurich, Zurich, Switzerland; Faculty of Science, University of Zurich, Zurich, Switzerland
| | - Richard D Smith
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington
| | - William Stafford Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington; Department of Computer Science and Engineering, University of Washington, Seattle, Washington
| | - Michael J MacCoss
- Department of Genome Sciences, University of Washington, Seattle, Washington
| |
|
18
|
Gorshkov V, Verano-Braga T, Kjeldsen F. SuperQuant: A Data Processing Approach to Increase Quantitative Proteome Coverage. Anal Chem 2015; 87:6319-27. [PMID: 25978296 DOI: 10.1021/acs.analchem.5b01166] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 12/16/2022]
Abstract
SuperQuant is a quantitative proteomics data processing approach that uses complementary fragment ions to identify multiple coisolated peptides in tandem mass spectra allowing for their quantification. This approach can be applied to any shotgun proteomics data set acquired with high mass accuracy for quantification at the MS1 level. The SuperQuant approach was developed and implemented as a processing node within the Thermo Proteome Discoverer 2.x. The performance of the developed approach was tested using dimethyl-labeled HeLa lysate samples having a ratio between channels of 10(heavy):4(medium):1(light). Peptides were fragmented with collision-induced dissociation using isolation windows of 1, 2, and 4 Th while recording data both with high-resolution and low-resolution. The results obtained using SuperQuant were compared to those using the conventional ion trap-based approach (low mass accuracy MS2 spectra), which is known to achieve high identification performance. Compared to the common high-resolution approach, the SuperQuant approach identifies up to 70% more peptide-spectrum matches (PSMs), 40% more peptides, and 20% more proteins at the 0.01 FDR level. It identifies more PSMs and peptides than the ion trap-based approach. Improvements in identifications resulted in up to 10% more PSMs, 15% more peptides, and 10% more proteins quantified on the same raw data. The developed approach does not affect the accuracy of the quantification and observed coefficients of variation between replicates of the same proteins were close to the values typical for other precursor ion-based quantification methods. The raw data is deposited to ProteomeXchange (PXD001907). The developed node is available for testing at https://github.com/caetera/SuperQuantNode.
Affiliation(s)
- Vladimir Gorshkov
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark
| | - Thiago Verano-Braga
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark
| | - Frank Kjeldsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark
| |
|
19
|
Egertson JD, MacLean B, Johnson R, Xuan Y, MacCoss MJ. Multiplexed peptide analysis using data-independent acquisition and Skyline. Nat Protoc 2015; 10:887-903. [PMID: 25996789 DOI: 10.1038/nprot.2015.055] [Citation(s) in RCA: 156] [Impact Index Per Article: 15.6] [Indexed: 01/08/2023]
Abstract
Here we describe the use of data-independent acquisition (DIA) on a Q-Exactive mass spectrometer for the detection and quantification of peptides in complex mixtures using the Skyline Targeted Proteomics Environment (freely available online at http://skyline.maccosslab.org). The systematic acquisition of mass spectrometry (MS) or tandem MS (MS/MS) spectra by DIA is in contrast to DDA, in which the acquired MS/MS spectra are only suitable for the identification of a stochastically sampled set of peptides. Similarly to selected reaction monitoring (SRM), peptides can be quantified from DIA data using targeted chromatogram extraction. Unlike SRM, data acquisition is not constrained to a predetermined set of target peptides. In this protocol, a spectral library is generated using data-dependent acquisition (DDA), and chromatograms are extracted from the DIA data for all peptides in the library. As in SRM, quantification using DIA data is based on the area under the curve of extracted MS/MS chromatograms. In addition, a quality control (QC) method suitable for DIA based on targeted MS/MS acquisition is detailed. Not including time spent acquiring data, and time for database searching, the procedure takes ∼1-2 h to complete. Typically, data acquisition requires roughly 1-4 h per sample, and a database search will take 0.5-2 h to complete.
Affiliation(s)
- Jarrett D Egertson
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Brendan MacLean
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Richard Johnson
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Yue Xuan
- Thermo Fisher Scientific (Bremen) GmbH, Bremen, Germany
| | - Michael J MacCoss
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| |
|