1
Yang Y, Lin L, Qiao L. Deep learning approaches for data-independent acquisition proteomics. Expert Rev Proteomics 2021; 18:1031-1043. [PMID: 34918987 DOI: 10.1080/14789450.2021.2020654] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 01/30/2023]
Abstract
INTRODUCTION: Data-independent acquisition (DIA) is an emerging technology for large-scale proteomic studies. DIA data analysis methods are evolving rapidly, and deep learning has become prominent in this field. AREAS COVERED: This review gives an overview of the deep learning methods used for DIA data analysis, including spectral library prediction, feature scoring, and statistical control in peptide-centric analysis, as well as de novo peptide sequencing. Literature searches were performed for articles, including preprints, up to December 2021 in the PubMed, Scopus, and Web of Science databases. EXPERT OPINION: While spectral library prediction has overcome the limited proteome coverage of experimental libraries, the statistical burden caused by the large query space remains the main challenge in using proteome-wide predicted libraries. Analysis of post-translational modifications is another promising direction for deep learning-based DIA methods.
Affiliation(s)
- Yi Yang
  - Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
- Ling Lin
  - Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
- Liang Qiao
  - Department of Chemistry, Shanghai Stomatological Hospital, and Minhang Hospital, Fudan University, Shanghai, China
2
Parker R, Tailor A, Peng X, Nicastri A, Zerweck J, Reimer U, Wenschuh H, Schnatbaum K, Ternette N. The Choice of Search Engine Affects Sequencing Depth and HLA Class I Allele-Specific Peptide Repertoires. Mol Cell Proteomics 2021; 20:100124. [PMID: 34303857 PMCID: PMC8724928 DOI: 10.1016/j.mcpro.2021.100124] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 12/18/2020] [Revised: 07/09/2021] [Accepted: 07/12/2021] [Indexed: 12/26/2022] [Open Access]
Abstract
Standardization of immunopeptidomics experiments across laboratories is a pressing issue within the field, and currently a variety of different methods for sample preparation and data analysis tools are applied. Here, we compared different software packages to interrogate immunopeptidomics datasets and found that PEAKS reproducibly reports substantially more peptide sequences (~30-70%) compared with MaxQuant, Comet, and MS-GF+ at a global false discovery rate (FDR) of <1%. We noted that these differences are driven by search space and spectral ranking. Furthermore, we observed differences in the proportion of peptides binding the human leukocyte antigen (HLA) alleles present in the samples, indicating that sequence-related differences affected the performance of each tested engine. Utilizing data from single HLA allele expressing cell lines, we observed significant differences in amino acid frequency among the peptides reported, with a broadly higher representation of hydrophobic amino acids L, I, P, and V reported by PEAKS. We validated these results using data generated with a synthetic library of 2000 HLA-associated peptides from four common HLA alleles with distinct anchor residues. Our investigation highlights that search engines create a bias in peptide sequence depth and peptide amino acid composition, and resulting data should be interpreted with caution.
Affiliation(s)
- Robert Parker
  - Nuffield Department of Medicine, Centre for Cellular and Molecular Physiology, University of Oxford, Oxford, UK
- Arun Tailor
  - Nuffield Department of Medicine, Centre for Cellular and Molecular Physiology, University of Oxford, Oxford, UK
- Xu Peng
  - Nuffield Department of Medicine, Centre for Cellular and Molecular Physiology, University of Oxford, Oxford, UK
- Annalisa Nicastri
  - Nuffield Department of Medicine, Centre for Cellular and Molecular Physiology, University of Oxford, Oxford, UK
- Ulf Reimer
  - JPT Peptide Technologies GmbH, Berlin, Germany
- Nicola Ternette
  - Nuffield Department of Medicine, Centre for Cellular and Molecular Physiology, University of Oxford, Oxford, UK
3
Guan S, Bythell BJ. Size Dependent Fragmentation Chemistry of Short Doubly Protonated Tryptic Peptides. Journal of the American Society for Mass Spectrometry 2021; 32:1020-1032. [PMID: 33779179 DOI: 10.1021/jasms.1c00009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Tandem mass spectrometry of electrospray ionized multiply charged peptide ions is commonly used to identify the sequence of peptide(s) and infer the identity of source protein(s). Doubly protonated peptide ions are consistently the most efficiently sequenced ions following collision-induced dissociation of peptides generated by tryptic digestion. While the broad characteristics of longer (N ≥ 8 residue) doubly protonated peptides have been investigated, there is comparatively little data on shorter systems where charge repulsion should exhibit the greatest influence on the dissociation chemistry. To address this gap and further understand the chemistry underlying collisional-dissociation of doubly charged tryptic peptides, two series of analytes ([GxR+2H]2+ and [AxR+2H]2+, x = 2-5) were investigated experimentally and with theory. We find distinct differences in the preference of bond cleavage sites for these peptides as a function of size and to a lesser extent composition. Density functional calculations at two levels of theory predict that the threshold relative energies required for bond cleavages at the same site for peptides of different size are quite similar (for example, b2-yN-2). In isolation, this finding is inconsistent with experiment. However, the predicted extent of entropy change of these reactions is size dependent. Subsequent RRKM rate constant calculations provide a far clearer picture of the kinetics of the competing bond cleavage reactions enabling rationalization of experimental findings. The M06-2X data were substantially more consistent with experiment than were the B3LYP data.
Affiliation(s)
- Shanshan Guan
  - Department of Chemistry and Biochemistry, Ohio University, 307 Chemistry Building, Athens, Ohio 45701, United States
  - Department of Chemistry and Biochemistry, University of Missouri-St. Louis, 1 University Boulevard, St. Louis, Missouri 63121, United States
- Benjamin J Bythell
  - Department of Chemistry and Biochemistry, Ohio University, 307 Chemistry Building, Athens, Ohio 45701, United States
  - Department of Chemistry and Biochemistry, University of Missouri-St. Louis, 1 University Boulevard, St. Louis, Missouri 63121, United States
4
5
Wen B, Zeng W, Liao Y, Shi Z, Savage SR, Jiang W, Zhang B. Deep Learning in Proteomics. Proteomics 2020; 20:e1900335. [PMID: 32939979 PMCID: PMC7757195 DOI: 10.1002/pmic.201900335] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 05/27/2020] [Revised: 09/14/2020] [Indexed: 12/17/2022]
Abstract
Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.
Affiliation(s)
- Bo Wen
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen‐Feng Zeng
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Chinese Academy of Sciences, Institute of Computing Technology, Beijing 100190, China
- Yuxing Liao
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Zhiao Shi
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Sara R. Savage
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen Jiang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
6
Gabriels R, Martens L, Degroeve S. Updated MS²PIP web server delivers fast and accurate MS² peak intensity prediction for multiple fragmentation methods, instruments and labeling techniques. Nucleic Acids Res 2019; 47:W295-W299. [PMID: 31028400 PMCID: PMC6602496 DOI: 10.1093/nar/gkz299] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/08/2019] [Revised: 04/14/2019] [Accepted: 04/24/2019] [Indexed: 12/13/2022] [Open Access]
Abstract
MS²PIP is a data-driven tool that accurately predicts peak intensities for a given peptide's fragmentation mass spectrum. Since the release of the MS²PIP web server in 2015, we have brought significant updates to both the tool and the web server. In addition to the original models for CID and HCD fragmentation, we have added specialized models for the TripleTOF 5600+ mass spectrometer, for TMT-labeled peptides, for iTRAQ-labeled peptides, and for iTRAQ-labeled phosphopeptides. Because the fragmentation pattern is heavily altered in each of these cases, these additional models greatly improve the prediction accuracy for their corresponding data types. We have also substantially reduced the computational resources required to run MS²PIP, and have completely rebuilt the web server, which now allows predictions of up to 100 000 peptide sequences in a single request. The MS²PIP web server is freely available at https://iomics.ugent.be/ms2pip/.
Affiliation(s)
- Ralf Gabriels
  - VIB-UGent Center for Medical Biotechnology, VIB, A. Baertsoenkaai 3, B9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, A. Baertsoenkaai 3, B9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Sven Degroeve
  - VIB-UGent Center for Medical Biotechnology, VIB, A. Baertsoenkaai 3, B9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
7
Chen J, Shiyanov P, Green KB. Top-down mass spectrometry of intact phosphorylated β-casein: Correlation between the precursor charge state and internal fragments. Journal of Mass Spectrometry: JMS 2019; 54:527-539. [PMID: 30997701 PMCID: PMC6779312 DOI: 10.1002/jms.4364] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/03/2019] [Revised: 03/25/2019] [Accepted: 04/11/2019] [Indexed: 05/12/2023]
Abstract
Phosphorylated proteins play essential roles in many cellular processes, and identification and characterization of the relevant phosphoproteins can help to understand underlying mechanisms. Herein, we report a collision-induced dissociation top-down approach for characterizing phosphoproteins on a quadrupole time-of-flight mass spectrometer. β-casein, a protein with two major isoforms and five phosphorylatable serine residues, was used as a model. Peaks corresponding to intact β-casein ions with charge states up to 36+ were detected. Tandem mass spectrometry was performed on β-casein ions of different charge states (12+, and 15+ to 28+) in order to determine the effects of charge state on dissociation of this protein. Most of the abundant fragments corresponded to y ions, b ions, and internal fragments caused by cleavage of the N-terminal amide bond adjacent to proline residues (Xxx-Pro). The abundance of internal fragments increased with the charge state of the protein precursor ion; these internal fragments predominantly arose from one or two Xxx-Pro cleavage events and were difficult to accurately assign. The presence of abundant sodium adducts of β-casein further complicated the spectra. Our results suggest that when interpreting top-down mass spectra of phosphoproteins and other proteins, researchers should consider the potential formation of internal fragments and sodium adducts for reliable characterization.
Affiliation(s)
- Jianzhong Chen
  - Department of Optometry and Vision Science, University of Alabama at Birmingham, Birmingham, AL 35294
  - Applied Biotechnology Branch, Air Force Research Laboratory, Dayton, OH 45433, USA
  - Mass Spectrometry and Proteomics Facility, The Ohio State University, Columbus, OH 43210, USA
  - Corresponding author: Jianzhong Chen, Ph.D., Department of Optometry and Vision Science, University of Alabama at Birmingham, Birmingham, AL, USA; Phone: 205.934.8230
- Pavel Shiyanov
  - Applied Biotechnology Branch, Air Force Research Laboratory, Dayton, OH 45433, USA
- Kari B Green
  - Mass Spectrometry and Proteomics Facility, The Ohio State University, Columbus, OH 43210, USA
8
Lee H, Cuthbertson DJ, Otter DE, Barile D. Rapid Screening of Bovine Milk Oligosaccharides in a Whey Permeate Product and Domestic Animal Milks by Accurate Mass Database and Tandem Mass Spectral Library. Journal of Agricultural and Food Chemistry 2016; 64:6364-74. [PMID: 27428379 PMCID: PMC5832056 DOI: 10.1021/acs.jafc.6b02039] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 05/21/2023]
Abstract
A bovine milk oligosaccharide (BMO) library, prepared from cow colostrum, with 34 structures was generated and used to rapidly screen oligosaccharides in domestic animal milks and a whey permeate powder. The novel library was entered into a custom Personal Compound Database and Library (PCDL) and included accurate mass, retention time, and tandem mass spectra. Oligosaccharides in minute-sized samples were separated using nanoliquid chromatography (nanoLC) coupled to a high resolution and sensitive quadrupole-Time of Flight (Q-ToF) MS system. Using the PCDL, 18 oligosaccharides were found in a BMO-enriched product obtained from whey permeate processing. The usefulness of the analytical system and BMO library was further validated using milks from domestic sheep and buffaloes. Through BMO PCDL searching, 15 and 13 oligosaccharides in the BMO library were assigned in sheep and buffalo milks, respectively, thus demonstrating significant overlap between oligosaccharides in bovine (cow and buffalo) and ovine (sheep) milks. This method was shown to be an efficient, reliable, and rapid tool to identify oligosaccharide structures using automated spectral matching.
Affiliation(s)
- Hyeyoung Lee
  - Department of Food Science and Technology, University of California-Davis, Davis, California 95616, United States
- Don E. Otter
  - School of Chemical Sciences, University of Auckland, Auckland 1142, New Zealand
- Daniela Barile
  - Department of Food Science and Technology, University of California-Davis, Davis, California 95616, United States
  - Foods for Health Institute, University of California-Davis, Davis, California 95616, United States
  - Corresponding Author: Tel: +1-530-752-0976. Fax: +1-530-752-4759.
9
Koehbach J, Gruber CW, Becker C, Kreil DP, Jilek A. MALDI TOF/TOF-Based Approach for the Identification of d-Amino Acids in Biologically Active Peptides and Proteins. J Proteome Res 2016; 15:1487-96. [PMID: 26985971 PMCID: PMC4861975 DOI: 10.1021/acs.jproteome.5b01067] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 12/12/2022]
Abstract
Several biologically active peptides contain a d-amino acid in a well-defined position, which is position 2 in all peptide epimers isolated to date from vertebrates and also some from invertebrates. The detection of such d-residues by standard analytical techniques is challenging. In tandem mass spectrometric (MS) analysis, although fragment masses are the same for all stereoisomers, peak intensities are known to depend on chirality. Here, we observe that the effect of a d-amino acid in the second N-terminal position on the fragmentation pattern in matrix-assisted laser desorption/ionization time-of-flight spectrometry (MALDI-TOF/TOF MS) strongly depends on the peptide sequence. Stereosensitive fragmentation (SF) is correlated to a neighborhood effect, but the d-residue also exerts an overall effect influencing distant bonds. In a fingerprint analysis, multiple peaks can thus serve to identify the chirality of a sample in a short time and potentially at high throughput. Problematic variations between individual spots could be successfully suppressed by cospotting deuterated analogues of the epimers. By identifying the [d-Leu2] isomer of the predicted peptide GH-2 (gene derived bombinin H) in skin secretions of the toad Bombina orientalis, we demonstrated the analytical power of SF-MALDI-TOF/TOF measurements. In conclusion, SF-MALDI-TOF/TOF MS combines high sensitivity, versatility, and the ability to complement other methods.
Affiliation(s)
- Johannes Koehbach
  - Centre for Physiology and Pharmacology, Medical University of Vienna, Schwarzspanierstraße 17, A-1090 Vienna, Austria
  - School of Biomedical Sciences, The University of Queensland, Brisbane, QLD, 4072 Australia
- Christian W Gruber
  - Centre for Physiology and Pharmacology, Medical University of Vienna, Schwarzspanierstraße 17, A-1090 Vienna, Austria
- Christian Becker
  - Institute of Biological Chemistry, Department of Chemistry, University of Vienna, Währinger Straße 38, A-1090 Vienna, Austria
- David P Kreil
  - Chair of Bioinformatics, University of Natural Resources and Life Sciences, Muthgasse 18, A-1190 Vienna, Austria
- Alexander Jilek
  - Institute of Biological Chemistry, Department of Chemistry, University of Vienna, Währinger Straße 38, A-1090 Vienna, Austria
  - Chair of Bioinformatics, University of Natural Resources and Life Sciences, Muthgasse 18, A-1190 Vienna, Austria
10
Extracting high confidence protein interactions from affinity purification data: at the crossroads. J Proteomics 2015; 118:63-80. [PMID: 25782749 DOI: 10.1016/j.jprot.2015.03.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 07/30/2014] [Revised: 02/27/2015] [Accepted: 03/09/2015] [Indexed: 02/06/2023]
Abstract
Deriving protein-protein interactions from data generated by affinity-purification and mass spectrometry (AP-MS) techniques requires application of scoring methods to measure the reliability of detected putative interactions. Choosing the appropriate scoring method has become a major challenge. Here we apply six popular scoring methods to the same AP-MS dataset and compare their performance. The comparison was carried out for six distinct datasets from human, fly and yeast, which focus on different biological processes and differ in their coverage of the proteome. Results show that the performance of a given scoring method may vary substantially depending on the dataset. Disturbingly, we find that the high confidence (HC) PPI networks built by applying the six scoring methods to the same raw AP-MS dataset display very poor overlap, with only 1.7-4.1% of the HC interactions present in all the networks built, respectively, from the proteome-wide human, fly or yeast datasets. Various properties of the shared versus unique interactions in each network, including biases in protein abundance, suggest that current scoring methods are able to eliminate only the most obvious contaminants, but still fail to reliably single out specific interactions from the large body of spurious associations detected in the AP-MS experiments. BIOLOGICAL SIGNIFICANCE: The fast progress in AP-MS techniques has prompted the development of a multitude of scoring methods, which are relied upon to remove contaminants and non-specific binders. Choosing the appropriate scoring scheme for a given AP-MS dataset has become a major challenge. The comparative analysis of 6 of the most popular scoring methods, presented here, reveals that overall these methods do not perform as expected. Evidence is provided that this is due to 3 closely related issues: the high 'noise' levels of the raw AP-MS data, the limited capacity of current scoring methods to deal with such high noise levels, and the biases introduced using Gold Standard datasets to benchmark the scoring functions and threshold the networks. For the field to move forward, all three issues will have to be addressed. This article is part of a Special Issue entitled: Protein dynamics in health and disease. Guest Editors: Pierre Thibault and Anne-Claude Gingras.
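As a rough illustration of the overlap figure quoted in this abstract, the sketch below computes the fraction of interactions shared by every network. The abstract does not state the denominator used, so dividing by the union of all high-confidence interactions is an assumption, and the protein names and the `shared_fraction` helper are hypothetical.

```python
def shared_fraction(networks):
    """Fraction of interactions present in every network.

    Each network is an iterable of (protein_a, protein_b) pairs.
    Pairs are treated as undirected, so (A, B) equals (B, A).
    Assumption: the denominator is the union of all interactions.
    """
    edge_sets = [{frozenset(pair) for pair in net} for net in networks]
    if not edge_sets:
        return 0.0
    union = set().union(*edge_sets)
    common = set.intersection(*edge_sets)
    return len(common) / len(union) if union else 0.0


# Toy example: three hypothetical high-confidence networks
# produced by three scoring methods from the same raw data.
toy_networks = [
    [("P1", "P2"), ("P2", "P3"), ("P3", "P4")],
    [("P2", "P1"), ("P3", "P4"), ("P5", "P6")],
    [("P1", "P2"), ("P4", "P3"), ("P7", "P8")],
]

toy_overlap = shared_fraction(toy_networks)
print(f"Shared by all networks: {toy_overlap:.1%}")
```

Treating each interaction as an unordered pair avoids counting the same interaction twice when different methods report the partners in a different order.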
11
Pipil S, Rawat VS, Sharma L, Sehgal N. Characterization of incomplete vitellogenin (VgC) in the Indian freshwater murrel, Channa punctatus (Bloch). Fish Physiology and Biochemistry 2015; 41:107-117. [PMID: 25389068 DOI: 10.1007/s10695-014-0009-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/02/2014] [Accepted: 10/31/2014] [Indexed: 06/04/2023]
Abstract
A novel incomplete vitellogenin (VgC) was purified from the plasma of estradiol-treated male murrel, Channa punctatus, by gel filtration chromatography. The native mass of the VgC protein was 180 kDa, and it resolved as a single peptide of 100 kDa on SDS-PAGE. When subjected to matrix-assisted laser desorption/ionization-time of flight analysis, the peptide produced a peptide mass fingerprint. On tandem mass spectrometry, some of these peptides showed mass to charge (m/z) ratio and amino acid sequence similarity with VgC peptides of other teleosts. Phylogenetic analysis revealed a similarity of murrel VgC with fish species of the order Perciformes. A semi-quantitative RT-PCR assay was developed to study expression of the vgc gene at variable levels of estradiol exposure. The presence of VgC in males indicates that the fish have been exposed to estrogens; hence, it can be used as a biomarker for estrogenic exposure.
Affiliation(s)
- S Pipil
  - Department of Zoology, University of Delhi, Delhi, 110007, India
12
Wang B, Yu J, Wang H, Wei Z, Guo X, Xiao Z, Zeng Z, Kong W. Investigation of bn-44 peptide fragments using high resolution mass spectrometry and isotope labeling. Journal of the American Society for Mass Spectrometry 2014; 25:2116-2124. [PMID: 25280401 DOI: 10.1007/s13361-014-0994-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2014] [Revised: 08/20/2014] [Accepted: 08/25/2014] [Indexed: 06/03/2023]
Abstract
An N-terminal deuterohemin-containing hexapeptide (DhHP-6) was designed as a short peptide cytochrome c (Cyt c) mimetic to study the effect of N-terminal charge on peptide fragmentation pathways. This peptide gave different dissociation patterns than normal tryptic peptides. Upon collision-induced dissociation (CID) with an ion trap mass spectrometer, the singly charged peptide ion containing no added proton generated abundant and characteristic bn-44 ions instead of bn-28 (an) ions. Studies by high resolution mass spectrometry (HRMS) and isotope labeling indicate that elimination of 44 Da fragments from b ions occurs via two different pathways: (1) loss of CH3CHO (44.0262) from a Thr side chain; (2) loss of CO2 (43.9898) from the oxazolone structure in the C-terminus. A series of analogues were designed and analyzed. The experimental results combined with Density Functional Theory (DFT) calculations on the proton affinity of the deuteroporphyrin demonstrate that the production of these novel bn-44 ions is related to the N-terminal charge via a charge-remote rather than radical-directed fragmentation pathway.
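The two neutral losses named in this abstract share a nominal mass of 44 Da and differ only in the second decimal place, which is why high resolution measurements are needed to tell them apart. The sketch below recomputes both monoisotopic masses from standard atomic masses; the `mono_mass` and `required_resolving_power` helpers and the m/z of 500 in the usage line are illustrative assumptions, not values from the paper.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "O": 15.99491461956,
}


def mono_mass(composition):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items())


acetaldehyde = mono_mass({"C": 2, "H": 4, "O": 1})  # CH3CHO
carbon_dioxide = mono_mass({"C": 1, "O": 2})        # CO2
delta = acetaldehyde - carbon_dioxide


def required_resolving_power(mz, mass_difference):
    """Approximate m/(delta m) needed to separate two singly charged
    ions near `mz` that differ by `mass_difference` Da."""
    return mz / mass_difference


print(f"CH3CHO: {acetaldehyde:.4f} Da")
print(f"CO2:    {carbon_dioxide:.4f} Da")
print(f"Difference: {delta:.4f} Da")
# Hypothetical fragment at m/z 500, chosen only for illustration.
print(f"Resolving power at m/z 500: {required_resolving_power(500.0, delta):.0f}")
```

The computed values round to the 44.0262 and 43.9898 quoted in the abstract, a difference of roughly 0.036 Da.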
Affiliation(s)
- Bing Wang
  - College of Chemistry, Jilin University, Changchun, 130012, China
13
Dong NP, Liang YZ, Xu QS, Mok DKW, Yi LZ, Lu HM, He M, Fan W. Prediction of Peptide Fragment Ion Mass Spectra by Data Mining Techniques. Anal Chem 2014; 86:7446-54. [DOI: 10.1021/ac501094m] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 12/19/2022]
Affiliation(s)
- Daniel K. W. Mok
  - Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
  - State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), Shenzhen, 518000, P. R. China
- Lun-zhao Yi
  - Yunnan Food Safety Research Institute, Kunming University of Science and Technology, Kunming, 650500, P. R. China
- Min He
  - Department of Pharmaceutical Engineering, School of Chemical Engineering, Xiangtan University, Xiangtan, 411105, P. R. China
- Wei Fan
  - College of Bioscience and Biotechnology, Hunan Agricultural University, Changsha, 410083, P. R. China
14
Kelchtermans P, Bittremieux W, De Grave K, Degroeve S, Ramon J, Laukens K, Valkenborg D, Barsnes H, Martens L. Machine learning applications in proteomics research: how the past can boost the future. Proteomics 2014; 14:353-66. [PMID: 24323524 DOI: 10.1002/pmic.201300289] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 07/12/2013] [Revised: 09/24/2013] [Accepted: 10/14/2013] [Indexed: 01/22/2023]
Abstract
Machine learning is a subdiscipline within artificial intelligence that focuses on algorithms that allow computers to learn to solve a (complex) problem from existing data. This ability can be used to generate a solution to a particularly intractable problem, given that enough data are available to train and subsequently evaluate an algorithm on. Since MS-based proteomics has no shortage of complex problems, and since public data are becoming available in ever growing amounts, machine learning is fast becoming a very popular tool in the field. We therefore present here an overview of the different applications of machine learning in proteomics that together cover nearly the entire wet- and dry-lab workflow, and that address key bottlenecks in experiment planning and design, as well as in data processing and analysis.
Affiliation(s)
- Pieter Kelchtermans
  - Department of Medical Protein Research, VIB, Ghent, Belgium; Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium; Flemish Institute for Technological Research (VITO), Boeretang, Mol, Belgium
15
Takayama M, Sekiya S, Iimuro R, Iwamoto S, Tanaka K. Selective and nonselective cleavages in positive and negative CID of the fragments generated from in-source decay of intact proteins in MALDI-MS. Journal of the American Society for Mass Spectrometry 2014; 25:120-131. [PMID: 24135807 DOI: 10.1007/s13361-013-0756-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/20/2013] [Revised: 09/13/2013] [Accepted: 09/16/2013] [Indexed: 06/02/2023]
Abstract
Selective and nonselective cleavages in ion trap low-energy collision-induced dissociation (CID) experiments of the fragments generated from in-source decay (ISD) with matrix-assisted laser desorption/ionization mass spectrometry (MALDI MS) of intact proteins are described in both positive and negative ion modes. The MALDI-ISD spectra of the proteins demonstrate common, discontinuous, abundant c- and z'-ions originating from cleavage at the N-Cα bond of Xxx-Asp/Asn and Gly-Xxx residues in both positive- and negative-ion modes. The positive ion CID of the c- and z'-ions resulted in product ions originating from selective cleavage at Asp-Xxx, Glu-Xxx and Cys-Xxx residues. Nonselective cleavage product ions rationalized by the mechanism of a "mobile proton" are also observed in positive ion CID spectra. Negative ion CID of the ISD fragments results in complex product ions accompanied by the loss of neutrals from b-, c-, and y-ions. The most characteristic feature of negative ion CID is selective cleavage of the peptide bonds of acidic residues, Xxx-Asp/Glu/Cys. A definite influence of α-helix on the CID product ions was not obtained. However, the results from positive ion and negative ion CID of the MALDI-ISD fragments that may have long α-helical domains suggest that acidic residues in helix-free regions tend to degrade more than those in helical regions.
Affiliation(s)
- Mitsuo Takayama
  - Graduate School in Nanobioscience, Mass Spectrometry Laboratory, Yokohama City University, Kanazawa-ku, Yokohama, Japan
16
Schliekelman P, Liu S. Quantifying the effect of competition for detection between coeluting peptides on detection probabilities in mass-spectrometry-based proteomics. J Proteome Res 2013; 13:348-61. [PMID: 24313442 DOI: 10.1021/pr400034z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 02/06/2023]
Abstract
There are many factors that contribute to the variation in detection probabilities of proteins in LC-MS/MS experiments, and currently little is known about their relative importance. In this study, we analyze the effect of competition for detection between coeluting peptides on peptide detection probability. Using a novel method for estimating peptide detection probabilities, we show that these probabilities can vary by an order of magnitude between peptides that elute from the liquid chromatograph at the same time as many other peptides and those that elute with fewer other peptides. To explore these results, we use a mathematical model to show that competition for detection between peptides is expected to be a major source of missed detections in complex mixtures because there will be many MS/MS scanning intervals that contain more coeluting peptides than can be subjected to MS/MS analysis. Our data and simulation results show that the number of coeluting peptides is a primary determinant of whether a peptide will be detected. In our data, this had a several-fold larger effect on peptide detection probability than did peptide abundance. Furthermore, the distribution of elution times for the most frequently detected peptides was strongly shifted toward values where there were few coeluting peptides, indicating that the number of coeluting peptides is a major determinant of whether a peptide is proteotypic.
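The competition argument in this abstract can be made concrete with a deliberately simple toy model. It is not the authors' mathematical model: it assumes that the instrument can fragment at most `capacity` precursors per scanning interval and that precursors are picked uniformly at random among those coeluting, which ignores intensity-based selection. All function and parameter names are hypothetical.

```python
def selection_probability(n_coeluting, capacity):
    """Chance that one peptide is picked for MS/MS in a single interval.

    Assumes uniform random selection of `capacity` precursors out of
    `n_coeluting` candidates (a simplification of real instruments).
    """
    if n_coeluting <= 0 or capacity <= 0:
        raise ValueError("n_coeluting and capacity must be positive")
    if n_coeluting <= capacity:
        return 1.0
    return capacity / n_coeluting


def detection_probability(n_coeluting, capacity, n_intervals=1):
    """Chance of being picked at least once over independent intervals."""
    p = selection_probability(n_coeluting, capacity)
    return 1.0 - (1.0 - p) ** n_intervals


# With room for 10 precursors per interval, a peptide in a sparse
# region is always picked, while one among 100 coeluting peptides
# is picked far less often.
for crowd in (5, 20, 100):
    print(crowd, round(detection_probability(crowd, capacity=10), 3))
```

Even this crude sketch shows a tenfold gap between sparse and crowded elution windows, in line with the order-of-magnitude variation the abstract describes.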
Affiliation(s)
- Paul Schliekelman
- Department of Statistics, University of Georgia, 204 Statistics Building, Athens, Georgia 30602, United States
|
17
|
Guerrero A, Lebrilla CB. New strategies for resolving oligosaccharide isomers by exploiting mechanistic and thermochemical aspects of fragment ion formation. International Journal of Mass Spectrometry 2013; 354-355:10.1016/j.ijms.2013.05.002. [PMID: 24273436 PMCID: PMC3835204 DOI: 10.1016/j.ijms.2013.05.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/15/2023]
Abstract
Three complementary experimental approaches for elucidating human milk oligosaccharide (HMO) isomers by Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry are described: tandem-MS disruption by double resonance to distinguish different fragmentation pathways, examination of fragment intensity ratios arising from differential alkali metal ion affinities, and monitoring of competitive fragmentation rates. Interpreting the fragmentation pattern from a mechanistic and thermochemical point of view permits the assignment not only of pure isomers but, in some cases, of mixtures of them. Methodologically, the procedures are simple, reliable, and rapid, making both prior separation techniques and tedious chemical modifications of the HMOs unnecessary. In principle, the rationale can be expanded to resolve other isomeric mixtures of biological nature.
Affiliation(s)
- Andres Guerrero
- Department of Chemistry, University of California Davis, CA 95616, United States
- Carlito B. Lebrilla
- Department of Chemistry, University of California Davis, CA 95616, United States
- Corresponding author
|
18
|
Abstract
MOTIVATION Tandem mass spectrometry provides the means to match mass spectrometry signal observations with the chemical entities that generated them. The technology produces signal spectra that contain information about the chemical dissociation pattern of a peptide that was forced to fragment using methods like collision-induced dissociation. The ability to predict these MS(2) signals and to understand this fragmentation process is important for sensitive high-throughput proteomics research. RESULTS We present a new tool called MS(2)PIP for predicting the intensity of the most important fragment ion signal peaks from a peptide sequence. MS(2)PIP pre-processes a large dataset with confident peptide-to-spectrum matches to facilitate data-driven model induction using a random forest regression learning algorithm. The intensity predictions of MS(2)PIP were evaluated on several independent evaluation sets and found to correlate significantly better with the observed fragment-ion intensities as compared with the current state-of-the-art PeptideART tool. AVAILABILITY MS(2)PIP code is available for both training and predicting at http://compomics.com/.
Affiliation(s)
- Sven Degroeve
- Department of Medical Protein Research, VIB, Ghent 9000, Belgium and Department of Biochemistry, Ghent University, Ghent 9000, Belgium
|
19
|
Abstract
Motivation: Mass spectrometry (MS) instruments and experimental protocols are rapidly advancing, but de novo peptide sequencing algorithms to analyze tandem mass (MS/MS) spectra are lagging behind. Although existing de novo sequencing tools perform well on certain types of spectra [e.g. Collision Induced Dissociation (CID) spectra of tryptic peptides], their performance often deteriorates on other types of spectra, such as Electron Transfer Dissociation (ETD), Higher-energy Collisional Dissociation (HCD) spectra or spectra of non-tryptic digests. Thus, rather than developing a new algorithm for each type of spectra, we develop a universal de novo sequencing algorithm called UniNovo that works well for all types of spectra or even for spectral pairs (e.g. CID/ETD spectral pairs). UniNovo uses an improved scoring function that captures the dependencies between different ion types, where such dependencies are learned automatically using a modified offset frequency function. Results: The performance of UniNovo is compared with PepNovo+, PEAKS and pNovo using various types of spectra. The results show that the performance of UniNovo is superior to other tools for ETD spectra and superior or comparable with others for CID and HCD spectra. UniNovo also estimates the probability that each reported reconstruction is correct, using simple statistics that are readily obtained from a small training dataset. We demonstrate that the estimation is accurate for all tested types of spectra (including CID, HCD, ETD, CID/ETD and HCD/ETD spectra of trypsin, LysC or AspN digested peptides). Availability: UniNovo is implemented in JAVA and tested on Windows, Ubuntu and OS X machines. UniNovo is available at http://proteomics.ucsd.edu/Software/UniNovo.html along with the manual. Contact: kwj@ucsd.edu or ppevzner@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kyowon Jeong
- Department of Electrical and Computer Engineering and Department of Computer Science and Engineering, University of California-San Diego, CA 92093, USA.
|
20
|
Armean IM, Lilley KS, Trotter MWB. Popular computational methods to assess multiprotein complexes derived from label-free affinity purification and mass spectrometry (AP-MS) experiments. Mol Cell Proteomics 2012; 12:1-13. [PMID: 23071097 DOI: 10.1074/mcp.r112.019554] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Indexed: 11/06/2022] Open
Abstract
Advances in sensitivity, resolution, mass accuracy, and throughput have considerably increased the number of protein identifications made via mass spectrometry. Despite these advances, state-of-the-art experimental methods for the study of protein-protein interactions yield more candidate interactions than may be expected biologically owing to biases and limitations in the experimental methodology. In silico methods, which distinguish between true and false interactions, have been developed and applied successfully to reduce the number of false positive results yielded by physical interaction assays. Such methods may be grouped according to: (1) the type of data used: methods based on experiment-specific measurements (e.g., spectral counts or identification scores) versus methods that extract knowledge encoded in external annotations (e.g., public interaction and functional categorisation databases); (2) the type of algorithm applied: the statistical description and estimation of physical protein properties versus predictive supervised machine learning or text-mining algorithms; (3) the type of protein relation evaluated: direct (binary) interaction of two proteins in a cocomplex versus probability of any functional relationship between two proteins (e.g., co-occurrence in a pathway, subcellular compartment); and (4) initial motivation: elucidation of experimental data by evaluation versus prediction of novel protein-protein interaction, to be experimentally validated a posteriori. This work reviews several popular computational scoring methods and software platforms for protein-protein interaction evaluation according to their methodology, comparative strengths and weaknesses, data representation, accessibility, and availability. The scoring methods and platforms described include: CompPASS, SAINT, Decontaminator, MINT, IntAct, STRING, and FunCoup. References to related work are provided throughout in order to provide a concise but thorough introduction to a rapidly growing interdisciplinary field of investigation.
Affiliation(s)
- Irina M Armean
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, CB2 1GA, UK
|
21
|
Pechan T, Gwaltney SR. Calculations of relative intensities of fragment ions in the MSMS spectra of a doubly charged penta-peptide. BMC Bioinformatics 2012; 13 Suppl 15:S13. [PMID: 23046347 PMCID: PMC3439735 DOI: 10.1186/1471-2105-13-s15-s13] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Currently, the tandem mass spectrometry (MSMS) of peptides is a dominant technique used to identify peptides and consequently proteins. The peptide fragmentation inside the mass analyzer typically offers a spectrum containing several different groups of ions. The mass to charge (m/z) values of these ions can be exactly calculated following simple rules based on the possible peptide fragmentation reactions. But the (relative) intensities of the particular ions cannot be simply predicted from the amino-acid sequence of the peptide. This study presents initial work towards developing a theoretical fundamental approach to ion intensity elucidation by utilizing quantum mechanical computations. METHODS MSMS spectra of the doubly charged GAVLK peptide were collected on electrospray ion trap mass spectrometers using low energy modes of fragmentation. Density functional theory (DFT) calculations were performed on the population of ion precursors to determine the fragment ion intensities corresponding to a Boltzmann distribution of the protonation of nitrogens in the peptide backbone amide bonds. RESULTS We were able to a) predict the y and b ions intensities order in concert with the experimental observation; b) predict relative intensities of y ions with errors not exceeding the experimental variation. CONCLUSIONS These results suggest that the GAVLK peptide fragmentation process in the ion trap mass spectrometer is predominantly driven by the thermodynamic stability of the precursor ions formed upon ionization of the sample. The computational approach presented in this manuscript successfully calculated ion intensities in the mass spectra of this doubly charged tryptic peptide, based solely on its amino acid sequence. As such, this work indicates a potential of incorporating quantum mechanical calculations into mass spectrometry based algorithms for molecular identification.
Affiliation(s)
- Tibor Pechan
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi Agricultural and Forestry Experiment Station, High Performance Computing Collaboratory, Mississippi State University, Mississippi State, MS 39762, USA.
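The Pechan and Gwaltney abstract above relates fragment ion intensities to a Boltzmann distribution over protonation sites. A minimal sketch of how Boltzmann populations follow from relative energies is given here. It is not the authors' DFT workflow: the site energies are hypothetical placeholders, the temperature of 298.15 K is an assumption, and the function name is illustrative only.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def boltzmann_populations(relative_energies_kj_mol, temperature_k=298.15):
    """Return the fractional population of each state from its relative energy."""
    rt = R_KJ_PER_MOL_K * temperature_k
    # Shift by the lowest energy so the largest weight is exactly 1.0
    lowest = min(relative_energies_kj_mol)
    weights = [math.exp(-(e - lowest) / rt) for e in relative_energies_kj_mol]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative energies (kJ/mol) for four protonation sites;
# these are NOT values from the study.
site_energies = [0.0, 2.5, 5.0, 12.0]
populations = boltzmann_populations(site_energies)
```

Under this sketch, the lowest-energy site receives the largest population and the populations sum to one; how such populations map onto observed ion intensities is the subject of the paper itself.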
|
22
|
Abstract
Selected reaction monitoring mass spectrometry is an emerging targeted proteomics technology that allows for the investigation of complex protein samples with high sensitivity and efficiency. It requires extensive knowledge about the sample for the many parameters needed to carry out the experiment to be set appropriately. Most studies today rely on parameter estimation from prior studies, public databases, or from measuring synthetic peptides. This is efficient and sound, but in absence of prior data, de novo parameter estimation is necessary. Computational methods can be used to create an automated framework to address this problem. However, the number of available applications is still small. This review aims at giving an orientation on the various bioinformatical challenges. To this end, we state the problems in classical machine learning and data mining terms, give examples of implemented solutions and provide some room for alternatives. This will hopefully lead to an increased momentum for the development of algorithms and serve the needs of the community for computational methods. We note that the combination of such methods in an assisted workflow will ease both the usage of targeted proteomics in experimental studies as well as the further development of computational approaches.
Affiliation(s)
- Daniel Reker
- ETH Zurich, Wolfgang-Pauli-Strasse 16, 8093 Zurich, Switzerland
|
23
|
Spivak M, Bereman MS, Maccoss MJ, Noble WS. Learning score function parameters for improved spectrum identification in tandem mass spectrometry experiments. J Proteome Res 2012; 11:4499-508. [PMID: 22866926 DOI: 10.1021/pr300234m] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
The identification of proteins from spectra derived from a tandem mass spectrometry experiment involves several challenges: matching each observed spectrum to a peptide sequence, ranking the resulting collection of peptide-spectrum matches, assigning statistical confidence estimates to the matches, and identifying the proteins. The present work addresses algorithms to rank peptide-spectrum matches. Many of these algorithms, such as PeptideProphet, IDPicker, or Q-ranker, follow a similar methodology that includes representing peptide-spectrum matches as feature vectors and using optimization techniques to rank them. We propose a richer and more flexible feature set representation that is based on the parametrization of the SEQUEST XCorr score and that can be used by all of these algorithms. This extended feature set allows a more effective ranking of the peptide-spectrum matches based on the target-decoy strategy, in comparison to a baseline feature set devoid of these XCorr-based features. Ranking using the extended feature set gives 10-40% improvement in the number of distinct peptide identifications relative to a range of q-value thresholds. While this work is inspired by the model of the theoretical spectrum and the similarity measure between spectra used specifically by SEQUEST, the method itself can be applied to the output of any database search. Further, our approach can be trivially extended beyond XCorr to any linear operator that can serve as similarity score between experimental spectra and peptide sequences.
Affiliation(s)
- Marina Spivak
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
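The Spivak et al. abstract above evaluates rankings with the target-decoy strategy at a range of q-value thresholds. The sketch below shows one common way to estimate q-values from a ranked list of target and decoy matches. It is a generic illustration, not the paper's implementation: the scores are made up, the function name is hypothetical, and ties and decoy-fraction corrections are ignored.

```python
def target_decoy_qvalues(psms):
    """Estimate q-values for target PSMs.

    psms: list of (score, is_decoy) pairs, where a higher score is better.
    Returns a list of (score, q_value) for target PSMs, best score first.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR at each rank: decoys seen so far / targets seen so far
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the minimum FDR at this rank or any lower-scoring rank
    qvalues = [0.0] * len(ranked)
    running_min = float("inf")
    for i in range(len(ranked) - 1, -1, -1):
        running_min = min(running_min, fdrs[i])
        qvalues[i] = running_min

    return [(s, q) for (s, is_decoy), q in zip(ranked, qvalues) if not is_decoy]


# Made-up scores for illustration: four target and two decoy matches
example = [(10, False), (9, False), (8, True), (7, False), (6, True), (5, False)]
result = target_decoy_qvalues(example)
```

Counting the targets whose q-value falls at or below a chosen threshold gives the number of accepted identifications at that threshold, which is the quantity the abstract reports as improving.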
|
24
|
Abstract
High-throughput proteomics experiments involving tandem mass spectrometry produce large volumes of complex data that require sophisticated computational analyses. As such, the field offers many challenges for computational biologists. In this article, we briefly introduce some of the core computational and statistical problems in the field and then describe a variety of outstanding problems that readers of PLoS Computational Biology might be able to help solve.
Affiliation(s)
- William Stafford Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America.
|
25
|
Cottrell JS. Protein identification using MS/MS data. J Proteomics 2011; 74:1842-51. [PMID: 21635977 DOI: 10.1016/j.jprot.2011.05.014] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Received: 02/02/2011] [Revised: 05/04/2011] [Accepted: 05/09/2011] [Indexed: 12/28/2022]
Abstract
The subject of this tutorial is protein identification and characterisation by database searching of MS/MS Data. Peptide Mass Fingerprinting is excluded because it is covered in a separate tutorial. Practical aspects of database searching are emphasised, such as choice of sequence database, effect of mass tolerance, and how to identify post-translational modifications. The relationship between sensitivity and specificity is discussed, as is the challenge of using peptide match information to infer which proteins were present in the sample. Since these tutorials are introductory in nature, most references are to reviews, rather than primary research papers. Some familiarity with mass spectrometry and protein chemistry is assumed. There is an accompanying slide presentation, including speaker notes, and a collection of web-based, practical exercises, designed to reinforce key points. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP 6).
|
26
|
Neta P, Stein SE. Charge states of y ions in the collision-induced dissociation of doubly charged tryptic peptide ions. Journal of the American Society for Mass Spectrometry 2011; 22:898-905. [PMID: 21472524 DOI: 10.1007/s13361-011-0089-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/09/2010] [Revised: 01/20/2011] [Accepted: 01/22/2011] [Indexed: 05/30/2023]
Abstract
Bonds that break in collision-induced dissociation (CID) are often weakened by a nearby proton, which can, in principle, be carried away by either of the product fragments. Since peptide backbone dissociation is commonly charge-directed, relative intensities of charge states of product y- and b-ions depend on the final location of that proton. This study examines y-ion charge distributions for dissociation of doubly charged peptide ions, using a large reference library of peptide ion fragmentation generated from ion-trap CID of peptide ions from tryptic digests. Trends in relative intensities of y(2+) and y(1+) ions are examined as a function of bond cleavage position, peptide length (n), residues on either side of the bond and effects of residues remote from the bond. It is found that y(n-2)/b(2) dissociation is the most sensitive to adjacent amino acids, that y(2+)/y(1+) steadily increase with increasing peptide length, that the N-terminal amino acid can have a major influence in all dissociations, and in some cases other residues remote from the bond cleavage exert significant effects. Good correlation is found between the values of y(2+)/y(1+) for the peptide and the proton affinities of the amino acids present at the dissociating peptide bond. A few deviations from this correlation are rationalized by specific effects of the amino acid residues. These correlations can be used to estimate trends in y(2+)/y(1+) ratios for peptide ions from amino acid proton affinities.
Affiliation(s)
- Pedatsur Neta
- Chemical and Biochemical Reference Data Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
|
27
|
Miskevich F, Davis A, Leeprapaiwong P, Giganti V, Kostić NM, Angel LA. Metal complexes as artificial proteases in proteomics: A palladium(II) complex cleaves various proteins in solutions containing detergents. J Inorg Biochem 2011; 105:675-83. [DOI: 10.1016/j.jinorgbio.2011.01.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 09/21/2010] [Revised: 01/14/2011] [Accepted: 01/18/2011] [Indexed: 11/15/2022]
|
28
|
Escobar H, Reyes-Vargas E, Jensen PE, Delgado JC, Crockett DK. Utility of characteristic QTOF MS/MS fragmentation for MHC class I peptides. J Proteome Res 2011; 10:2494-507. [PMID: 21413816 DOI: 10.1021/pr101272k] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
Systematic investigation of cellular processes by mass spectrometric detection of peptides obtained from protein digestion or directly from immuno-purification can be a powerful tool when used appropriately. The true sequence of these peptides is defined by the interpretation of spectral data using a variety of available algorithms. However, peptide match algorithm scoring is typically based on some, but not all, of the mechanisms of peptide fragmentation. Although algorithm rules for soft ionization techniques generally fit very well to tryptic peptides, manual validation of spectra is often required for endogenous peptides such as MHC class I molecules where traditional trypsin digest techniques are not used. This study summarizes data mining and manual validation of hundreds of peptide sequences from MHC class I molecules in publicly available data files. We herein describe several important features to improve and quantify manual validation for these endogenous peptides after automated algorithm searching. Important fragmentation patterns are discussed for the studied MHC Class I peptides. These findings lead to practical rules that are helpful when performing manual validation. Furthermore, these observations may be useful to improve current peptide search algorithms or to develop novel software tools.
Affiliation(s)
- Hernando Escobar
- ARUP Institute for Clinical and Experimental Pathology, Department of Pathology, University of Utah School of Medicine, Salt Lake City, Utah 84112, United States
|
29
|
Data processing pipelines for comprehensive profiling of proteomics samples by label-free LC–MS for biomarker discovery. Talanta 2011; 83:1209-24. [DOI: 10.1016/j.talanta.2010.10.029] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 05/15/2010] [Revised: 10/18/2010] [Accepted: 10/21/2010] [Indexed: 01/30/2023]
|
30
|
Li S, Arnold RJ, Tang H, Radivojac P. On the accuracy and limits of peptide fragmentation spectrum prediction. Anal Chem 2010; 83:790-6. [PMID: 21175207 DOI: 10.1021/ac102272r] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 11/29/2022]
Abstract
We estimated the reproducibility of tandem mass spectra for the widely used collision-induced dissociation (CID) of peptide ions. Using the Pearson correlation coefficient as a measure of spectral similarity, we found that the within-experiment reproducibility of fragment ion intensities is very high (about 0.85). However, across different experiments and instrument types/setups, the correlation decreases by more than 15% (to about 0.70). We further investigated the accuracy of current predictors of peptide fragmentation spectra and found that they are more accurate than the ad-hoc models generally used by search engines (e.g., SEQUEST) and, surprisingly, approaching the empirical upper limit set by the average across-experiment spectral reproducibility (especially for charge +1 and charge +2 precursor ions). These results provide evidence that, in terms of accuracy of modeling, predicted peptide fragmentation spectra provide a viable alternative to spectral libraries for peptide identification, with a higher coverage of peptides and lower storage requirements. Furthermore, using five data sets of proteome digests by two different proteases, we find that PeptideART (a data-driven machine learning approach) is generally more accurate than MassAnalyzer (an approach based on a kinetic model for peptide fragmentation) in predicting fragmentation spectra but that both models are significantly more accurate than the ad-hoc models.
Affiliation(s)
- Sujun Li
- School of Informatics and Computing, Indiana University, Bloomington, Indiana 47408, USA
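The Li et al. abstract above uses the Pearson correlation coefficient as its measure of spectral similarity. A minimal sketch of that measure on two aligned fragment-intensity vectors follows. The intensity values are hypothetical, and the sketch assumes the two spectra have already been matched peak by peak, a step the abstract does not describe. The last line works through the reported figures: a fall from about 0.85 to about 0.70 is a relative decrease of roughly 17.6%, consistent with "more than 15%".

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length intensity vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intensities of the same five fragment ions in two spectra
spectrum_a = [100.0, 40.0, 75.0, 10.0, 55.0]
spectrum_b = [90.0, 50.0, 70.0, 20.0, 45.0]
similarity = pearson(spectrum_a, spectrum_b)

# Relative decrease between the two correlations reported in the abstract
relative_drop = (0.85 - 0.70) / 0.85
```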
|
31
|
Lam AKY, Ryzhov V, O'Hair RAJ. Mobile protons versus mobile radicals: gas-phase unimolecular chemistry of radical cations of cysteine-containing peptides. Journal of the American Society for Mass Spectrometry 2010; 21:1296-1312. [PMID: 20189828 DOI: 10.1016/j.jasms.2010.01.027] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 12/03/2009] [Revised: 01/22/2010] [Accepted: 01/28/2010] [Indexed: 05/28/2023]
Abstract
A combination of electrospray ionization (ESI), multistage, and high-resolution mass spectrometry experiments is used to examine the gas-phase fragmentation reactions of radical cations of cysteine-containing di- and tripeptides. Two different chemical methods were used to form initial populations of radical cations in which the radical sites were located at different positions: (1) sulfur-centered cysteinyl radicals via bond homolysis of protonated S-nitrosocysteine-containing peptides; and (2) alpha-carbon backbone-centered radicals via Siu's sequence of reactions (J. Am. Chem. Soc. 2008, 130, 7862). Comparison of the fragmentation reactions of these regiospecifically generated radicals suggests that hydrogen atom transfer (HAT) between the alpha C-H of adjacent residues and the cysteinyl radical can occur. In addition, using accurate mass measurements, deuterium labeling, and comparison with an authentic sample, a novel loss of part of the N-terminal cysteine residue was shown to give rise to the protonated, truncated N-formyl peptide (an even-electron x(n) ion). DFT calculations were performed on the radical cation [GCG]*(+) to examine: the relative stabilities of isomers with different radical and protonation sites; the barriers associated with radical migration between four possible radical sites, [G*CG](+), [GC*G](+), [GCG*](+), and [GC(S*)G](+); and for dissociation from these sites to yield b(2)-type ions.
Affiliation(s)
- Adrian K Y Lam
- School of Chemistry, The University of Melbourne, Victoria, Australia
|
32
|
Gucinski AC, Dodds ED, Li W, Wysocki VH. Understanding and exploiting Peptide fragment ion intensities using experimental and informatic approaches. Methods Mol Biol 2010; 604:73-94. [PMID: 20013365 DOI: 10.1007/978-1-60761-444-9_6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 05/28/2023]
Abstract
Tandem mass spectrometry is a widely used tool in proteomics. This section will address the properties that describe how protonated peptides fragment when activated by collisions in a mass spectrometer and how that information can be used to identify proteins. A review of the mobile proton model is presented, along with a summary of commonly observed peptide cleavage enhancements, including the proline effect. The methods used to elucidate peptide dissociation chemistry by using both small groups of model peptides and large datasets are also discussed. Finally, the role of peak intensity in commercially available and developmental peptide identification algorithms is examined.
Affiliation(s)
- Ashley C Gucinski
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, AZ, USA
|
33
|
An experimental design approach using response surface techniques to obtain optimal liquid chromatography and mass spectrometry conditions to determine the alkaloids in Meconopsis species. J Chromatogr A 2009; 1216:7013-23. [DOI: 10.1016/j.chroma.2009.08.058] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.9] [Received: 06/23/2009] [Revised: 08/14/2009] [Accepted: 08/25/2009] [Indexed: 11/21/2022]
|
34
|
Jagannadham MV. Article Commentary: Identifying the Sequence and Distinguishing the Oxidized-Methionine from Phenylalanine Peptides by MALDI TOF/TOF Mass Spectrometry in an Antarctic Bacterium Pseudomonas Syringae. Proteomics Insights 2009. [DOI: 10.4137/pri.s3158] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/05/2022]
Abstract
This short note highlights a procedure for distinguishing peptides that contain residues of similar mass, oxidized methionine and phenylalanine, using MALDI TOF/TOF. The isotope intensities give a preliminary indication of peptides containing oxidized methionine. In peptides with partially oxidized methionine, a mass difference of 16 Da can be observed in the mass fingerprint of the peptide. Neutral loss of methane sulphenate (CH3SOH) gives the most abundant ion in the MS/MS spectra of peptides containing oxidized methionine, whereas this fragment ion is not produced from phenylalanine-containing peptides. The mass spectra of methionine, oxidized methionine and phenylalanine containing peptides were examined from the proteins of Pseudomonas syringae Lz4W, whose genome sequence is not known.
Affiliation(s)
- M. V. Jagannadham
- Centre for Cellular and Molecular Biology (CSIR), Hyderabad-500 007, India
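The Jagannadham abstract above turns on the near-identical masses of oxidized methionine and phenylalanine residues. The arithmetic can be sketched from elemental compositions as below. The atomic masses are standard monoisotopic values, and the residue formulas are the usual ones for amino acid residues within a peptide chain; neither is taken from the note itself, and the helper name is illustrative.

```python
# Monoisotopic atomic masses in Da (standard values, not from the note)
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}


def mono_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(ATOM[element] * count for element, count in formula.items())


# Residue compositions (amino acid minus water, as found inside a peptide)
PHE_RESIDUE = {"C": 9, "H": 9, "N": 1, "O": 1}
MET_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1}
MET_OX_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 2, "S": 1}  # methionine + one oxygen
CH3SOH = {"C": 1, "H": 4, "O": 1, "S": 1}  # neutral loss named in the abstract

phe = mono_mass(PHE_RESIDUE)
met = mono_mass(MET_RESIDUE)
met_ox = mono_mass(MET_OX_RESIDUE)
neutral_loss = mono_mass(CH3SOH)

oxidation_shift = met_ox - met  # the "16 Da" difference on partial oxidation
phe_vs_met_ox = phe - met_ox    # small gap between the two 147 Da residues
```

Both residues have a nominal mass of 147 Da and differ by only about 0.033 Da, which is why the abstract relies on isotope intensities and the CH3SOH neutral loss (about 64 Da) rather than on precursor mass alone.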
|
35
|
Prokai L. Misidentification of nitrated peptides: comments on Hong, S.J., Gokulrangan, G., Schöneich, C., 2007. Proteomic analysis of age-dependent nitration of rat cardiac proteins by solution isoelectric focusing coupled to nanoHPLC tandem mass spectrometry. Exp. Gerontol. 42, 639-651. Exp Gerontol 2009; 44:367-9. [PMID: 19285127 DOI: 10.1016/j.exger.2009.02.014] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/09/2009] [Revised: 02/13/2009] [Accepted: 02/18/2009] [Indexed: 10/21/2022]
|