1
Chen Y, Du Z, Zhao H, Fang W, Liu T, Zhang Y, Zhang W, Qin W. SPPUSM: An MS/MS spectra merging strategy for improved low-input and single-cell proteome identification. Anal Chim Acta 2023; 1279:341793. [PMID: 37827637 DOI: 10.1016/j.aca.2023.341793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Revised: 08/26/2023] [Accepted: 09/06/2023] [Indexed: 10/14/2023]
Abstract
Single and rare cell analysis provides unique insights into the investigation of biological processes and disease progress by resolving the cellular heterogeneity that is masked by bulk measurements. Although many efforts have been made, the techniques used to measure the proteome in trace amounts of samples or in single cells still lag behind those for DNA and RNA due to the inherent non-amplifiable nature of proteins and the sensitivity limitation of current mass spectrometry. Here, we report an MS/MS spectra merging strategy termed SPPUSM (same precursor-produced unidentified spectra merging) for improved low-input and single-cell proteome data analysis. In this method, all the unidentified MS/MS spectra from multiple test files are first extracted. Then, the corresponding MS/MS spectra produced by the same precursor ion from different files are matched according to their precursor mass and retention time (RT) and are merged into one new spectrum. The newly merged spectra with more fragment ions are next searched against the database to increase the MS/MS spectra identification and proteome coverage. Further improvement can be achieved by increasing the number of test files and spectra to be merged. Up to 18.2% improvement in protein identification was achieved for 1 ng HeLa peptides by SPPUSM. Reliability evaluation by the "entrapment database" strategy using merged spectra from human and E. coli revealed a marginal error rate for the proposed method. For application in single cell proteome (SCP) study, identification enhancement of 28%-61% was achieved for proteins for different SCP data. Furthermore, a lower abundance was found for the SPPUSM-identified peptides, indicating its potential for more sensitive low sample input and SCP studies.
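The matching and merging steps described in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the `Spectrum` class, the function names, the tolerance values (10 ppm, 30 s, 0.02 m/z) and the rule of summing intensities for coinciding fragment peaks are all assumptions, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple

Peak = Tuple[float, float]  # (fragment m/z, intensity)


@dataclass
class Spectrum:
    """Hypothetical container for one unidentified MS/MS spectrum."""
    file_id: str
    precursor_mz: float
    charge: int
    rt: float  # retention time in seconds
    peaks: List[Peak]


def same_precursor(a: Spectrum, b: Spectrum,
                   ppm_tol: float = 10.0, rt_tol: float = 30.0) -> bool:
    """Treat two spectra as coming from the same precursor ion if charge,
    precursor m/z (ppm) and RT (seconds) agree within assumed tolerances."""
    if a.charge != b.charge:
        return False
    ppm = abs(a.precursor_mz - b.precursor_mz) / a.precursor_mz * 1e6
    return ppm <= ppm_tol and abs(a.rt - b.rt) <= rt_tol


def merge_peaks(spectra: List[Spectrum], frag_tol: float = 0.02) -> List[Peak]:
    """Pool all fragment peaks; peaks within frag_tol of the running
    centroid are combined (intensity summed, m/z intensity-weighted)."""
    merged: List[Peak] = []
    for mz, inten in sorted(p for s in spectra for p in s.peaks):
        if merged and mz - merged[-1][0] <= frag_tol:
            last_mz, last_int = merged[-1]
            total = last_int + inten
            centroid = ((last_mz * last_int + mz * inten) / total
                        if total > 0 else last_mz)
            merged[-1] = (centroid, total)
        else:
            merged.append((mz, inten))
    return merged


def merge_unidentified(spectra: List[Spectrum], ppm_tol: float = 10.0,
                       rt_tol: float = 30.0,
                       frag_tol: float = 0.02) -> List[Spectrum]:
    """Greedily group spectra from different files by precursor, then
    merge each group of two or more into one new spectrum."""
    groups: List[List[Spectrum]] = []
    for s in spectra:
        for g in groups:
            files = {m.file_id for m in g}
            if s.file_id not in files and same_precursor(g[0], s, ppm_tol, rt_tol):
                g.append(s)
                break
        else:
            groups.append([s])  # no matching group: start a new one

    merged_spectra: List[Spectrum] = []
    for g in groups:
        if len(g) < 2:
            continue  # nothing to merge with
        n = len(g)
        merged_spectra.append(Spectrum(
            file_id="merged",
            precursor_mz=sum(m.precursor_mz for m in g) / n,
            charge=g[0].charge,
            rt=sum(m.rt for m in g) / n,
            peaks=merge_peaks(g, frag_tol),
        ))
    return merged_spectra
```

In such a sketch, the merged spectra would then be written out and searched against the database in the usual way; the greedy grouping against the first member of each group is a simplification chosen for brevity.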
Affiliation(s)
- Yongle Chen
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China
- Zhuokun Du
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China
- Hongxian Zhao
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China
- Wei Fang
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China
- Tong Liu
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China
- Yangjun Zhang
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China
- Wanjun Zhang
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China; College of Chemistry and Materials Science, Hebei University, Baoding, 071002, China
- Weijie Qin
- State Key Laboratory of Proteomics, Beijing Institute of Lifeomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing, 102206, PR China; College of Chemistry and Materials Science, Hebei University, Baoding, 071002, China.
2
Muth T, Renard BY. Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019; 19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 12/08/2016] [Indexed: 01/24/2023]
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments nowadays offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has been rarely evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation on the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types from collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation mode using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. Finally, we discuss the potential of de novo sequencing for now becoming more widely used in the field.
Affiliation(s)
- Thilo Muth
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
- Bernhard Y Renard
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
3
Guo C, Guo XF, Zhao L, Chen DD, Wang J, Sun J. A Study on Immonium Ions and Immonium-Related Ions Depending on Different Collision Energies as Assessed by Q-TOF MS. Russian Journal of Bioorganic Chemistry 2018. [DOI: 10.1134/s1068162018040088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
4
Berberich MJ, Paulo JA, Everley RA. MS3-IDQ: Utilizing MS3 Spectra beyond Quantification Yields Increased Coverage of the Phosphoproteome in Isobaric Tag Experiments. J Proteome Res 2018; 17:1741-1747. [PMID: 29461835 DOI: 10.1021/acs.jproteome.8b00006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Protein phosphorylation is critically important for many cellular processes, including progression through the cell cycle, cellular metabolism, and differentiation. Isobaric labeling, for example, tandem mass tags (TMT), in phosphoproteomics workflows enables both relative and absolute quantitation of these phosphorylation events. Traditional TMT workflows identify peptides using fragment ions at the MS2 level and quantify reporter ions at the MS3 level. However, in addition to the TMT reporter ions, MS3 spectra also include fragment ions that can be used to identify peptides. Here we describe using MS3 spectra for both phosphopeptide identification and quantification, a process that we term MS3-IDQ. To maximize quantified phosphopeptides, we optimize several instrument parameters, including the modality of mass analyzer (i.e., ion trap or Orbitrap), MS2 automatic gain control (AGC), and MS3 normalized collision energy (NCE), to achieve the best balance of identified and quantified peptides. Our optimized MS3-IDQ method included the following parameters for the MS3 scan: NCE = 37.5 and AGC target = 1.5 × 10^5, and scan range = 100-2000. Data from the MS3 scan were complementary to those of the MS2 scan, and the combination of these scans can increase phosphoproteome coverage by >50%, thereby yielding a greater number of quantified and accurately localized phosphopeptides.
Affiliation(s)
- Matthew J Berberich
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, United States
- Joao A Paulo
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts 02115, United States
- Robert A Everley
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, United States; Department of Cell Biology, Harvard Medical School, Boston, Massachusetts 02115, United States