1
|
Chen Z, Wei J, Tang Y, Lin C, Costello CE, Hong P. GlycoDeNovo2: An Improved MS/MS-Based De Novo Glycan Topology Reconstruction Algorithm. Journal of the American Society for Mass Spectrometry 2022; 33:436-445. [PMID: 35157458 PMCID: PMC9149727 DOI: 10.1021/jasms.1c00288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Glycan structure identification is essential to understanding the roles of glycans in various biological processes. Previously, we developed GlycoDeNovo, a de novo algorithm for reconstructing glycan topologies from tandem mass spectra (MS/MS). In this work, we introduce GlycoDeNovo2 that contains two major improvements to GlycoDeNovo. First, we use the precursor mass measured for a peak that likely corresponds to a glycan to determine its potential compositions, which are used to constrain the search space, enable parallel computation, and hence speed up topology reconstruction. Second, we developed a procedure to calculate the empirical p-value of a reconstructed topology candidate. Experimental results are provided to demonstrate the effectiveness of GlycoDeNovo2.
Affiliation(s)
- Zizhang Chen
- Department of Computer Science, Brandeis University, Waltham, Massachusetts 02453, United States
- Juan Wei
- Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
- Yang Tang
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Cheng Lin
- Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
- Catherine E Costello
- Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Pengyu Hong
- Department of Computer Science, Brandeis University, Waltham, Massachusetts 02453, United States
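The first improvement described in the GlycoDeNovo2 abstract above (using the precursor mass to determine potential compositions) can be illustrated with a small Python sketch. This is not the GlycoDeNovo2 implementation. It assumes native (underivatized) glycans with a free reducing end, a neutral monoisotopic mass as input, and only four residue types; the function name `candidate_compositions` and the parameters `tol_ppm` and `max_count` are hypothetical.

```python
# Monoisotopic residue masses (Da) for four common monosaccharide classes.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "dHex": 146.05791,
    "Neu5Ac": 291.09542,
}
WATER = 18.01056  # one water accounts for the free reducing end (assumption)


def candidate_compositions(neutral_mass, tol_ppm=5.0, max_count=12):
    """Enumerate residue counts whose summed mass matches neutral_mass.

    Hypothetical helper: a brute-force search with pruning, meant only to
    show how a precursor mass constrains the composition search space.
    """
    names = list(RESIDUE_MASS)
    tol = neutral_mass * tol_ppm * 1e-6
    hits = []

    def extend(index, counts, mass):
        if index == len(names):
            if abs(mass - neutral_mass) <= tol:
                hits.append(dict(zip(names, counts)))
            return
        for n in range(max_count + 1):
            new_mass = mass + n * RESIDUE_MASS[names[index]]
            if new_mass > neutral_mass + tol:
                break  # more residues of this type only increase the mass
            extend(index + 1, counts + [n], new_mass)

    extend(0, [], WATER)
    return hits
```

Under these assumptions, Hex5HexNAc2 has a neutral monoisotopic mass of 5 × 162.05282 + 2 × 203.07937 + 18.01056 ≈ 1234.4334 Da, so the list returned by `candidate_compositions(1234.4334)` includes that composition. Each returned composition could then seed an independent topology search, which is one way such a constraint lends itself to parallel computation.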
|
2
|
Wei J, Tang Y, Bai Y, Zaia J, Costello CE, Hong P, Lin C. Toward Automatic and Comprehensive Glycan Characterization by Online PGC-LC-EED MS/MS. Anal Chem 2020; 92:782-791. [PMID: 31829560 PMCID: PMC7082718 DOI: 10.1021/acs.analchem.9b03183] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/18/2022]
Abstract
Despite the recent advances in mass spectrometry (MS)-based methods for glycan structural analysis, characterization of glycomes remains a significant analytical challenge, in part due to the widespread presence of isomeric structures and the need to define the many structural variables for each glycan. Interpretation of the complex tandem mass spectra of glycans is often laborious and requires substantial expertise. Broad adoption of MS methods for glycomics, within and outside the glycoscience community, has been hindered by the shortage of bioinformatics tools for rapid and accurate glycan sequencing. Here, we developed an online porous graphitic carbon liquid chromatography (PGC-LC)-electronic excitation dissociation (EED) MS/MS method that takes advantage of the superior isomer resolving power of PGC and the structural details provided by EED MS/MS for characterization of glycan mixtures. We also made improvements to GlycoDeNovo, our de novo glycan sequencing algorithm, so that it can automatically and accurately identify glycan topologies from EED tandem mass spectra acquired online. The majority of linkages can also be determined de novo, although in some cases, biological insight may be needed to fully define the glycan structure. Application of this method to the analysis of N-glycans released from ribonuclease B not only revealed the presence of 18 high-mannose structures, including new isomers not previously reported, but also provided relative quantification for each isomeric structure. With fully automated data acquisition and topology analysis, the approach presented here holds great potential for automated and comprehensive glycan characterization.
Affiliation(s)
- Juan Wei
- Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
- Yang Tang
- Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Yu Bai
- Beijing National Laboratory for Molecular Sciences, Institute of Analytical Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Joseph Zaia
- Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
- Catherine E. Costello
- Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
- Department of Chemistry, Boston University, Boston, Massachusetts 02215, United States
- Pengyu Hong
- Department of Computer Science, Brandeis University, Waltham, Massachusetts 02454, United States
- Cheng Lin
- Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts 02118, United States
|
3
|
De novo glycan structural identification from mass spectra using tree merging strategy. Comput Biol Chem 2019; 80:217-224. [DOI: 10.1016/j.compbiolchem.2019.03.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/09/2019] [Accepted: 03/23/2019] [Indexed: 11/19/2022]
|
4
|
Sun W, Liu Y, Lajoie GA, Ma B, Zhang K. An Improved Approach for N-Linked Glycan Structure Identification from HCD MS/MS Spectra. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:388-395. [PMID: 28489544 DOI: 10.1109/tcbb.2017.2701819] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/07/2023]
Abstract
Glycosylation is a frequently observed post-translational modification on proteins. Currently, tandem mass spectrometry (MS/MS) serves as an efficient analytical technique for characterizing structures of oligosaccharides. However, developing effective computational approaches for identifying glycan structures from mass spectra is still a great challenge in glycoproteomics research. In this study, we proposed an approach for matching the input spectra with glycan structures acquired from a glycan structure database by incorporating a de novo sequencing assisted ranking scheme. The proposed approach is implemented as a software tool, GlycoNovoDB, for automated glycan structure identification from HCD MS/MS of glycopeptides. Experimental results showed that GlycoNovoDB can identify glycans effectively and has better performance than our previously proposed de novo sequencing algorithm as well as another software GlycoMaster DB.
|
5
|
Muth T, Hartkopf F, Vaudel M, Renard BY. A Potential Golden Age to Come-Current Tools, Recent Use Cases, and Future Avenues for De Novo Sequencing in Proteomics. Proteomics 2018; 18:e1700150. [PMID: 29968278 DOI: 10.1002/pmic.201700150] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 03/23/2018] [Revised: 05/23/2018] [Indexed: 01/15/2023]
Abstract
In shotgun proteomics, peptide and protein identification is most commonly conducted using database search engines, the method of choice when reference protein sequences are available. Despite its widespread use the database-driven approach is limited, mainly because of its static search space. In contrast, de novo sequencing derives peptide sequence information in an unbiased manner, using only the fragment ion information from the tandem mass spectra. In recent years, with the improvements in MS instrumentation, various new methods have been proposed for de novo sequencing. This review article provides an overview of existing de novo sequencing algorithms and software tools ranging from peptide sequencing to sequence-to-protein mapping. Various use cases are described for which de novo sequencing was successfully applied. Finally, limitations of current methods are highlighted and new directions are discussed for a wider acceptance of de novo sequencing in the community.
Affiliation(s)
- Thilo Muth
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
- Felix Hartkopf
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
- Marc Vaudel
- K.G. Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, 5020, Bergen, Norway
- Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, 5020, Bergen, Norway
- Bernhard Y Renard
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
|
6
|
Sun W, Liu Y, Zhang K. An approach for N-linked glycan identification from MS/MS spectra by target-decoy strategy. Comput Biol Chem 2018; 74:391-398. [PMID: 29580737 DOI: 10.1016/j.compbiolchem.2018.03.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/08/2018] [Accepted: 03/13/2018] [Indexed: 12/28/2022]
Abstract
Glycan structure determination serves as an essential step for the thorough investigation of the structure and function of proteins. Currently, appropriate sample preparation followed by tandem mass spectrometry has emerged as the dominant technique for the characterization of glycans and glycopeptides. Although extensive efforts have been devoted to the development of computational approaches for the automated interpretation of glycopeptide spectra, previously published methods lack a reasonable quality control strategy for the statistical validation of reported results. In this manuscript, we introduced a novel method that constructed a decoy glycan database based on the glycan structures in the target database, and searched the experimental spectra against both the target and decoy databases to find the best matched glycans. Specifically, a two-layer scoring scheme for calculating a normalized matching score is applied in the search procedure, which enables the unbiased ranking of the matched glycans. Experimental analysis showed that our proposed method can report more structures with high confidence compared with previous approaches.
Affiliation(s)
- Weiping Sun
- Department of Computer Science, University of Western Ontario, London, ON N6A5B7, Canada
- Yi Liu
- Department of Computer Science, University of Western Ontario, London, ON N6A5B7, Canada
- Kaizhong Zhang
- Department of Computer Science, University of Western Ontario, London, ON N6A5B7, Canada
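The target-decoy idea summarized in the abstract above can be illustrated with a generic false discovery rate (FDR) estimate. The Python sketch below is not the authors' two-layer scoring scheme. It assumes that each spectrum's best match has already been scored and labeled as coming from the target or the decoy database, and it uses the common decoys-over-targets estimator; the function name `estimate_fdr`, the tuple layout, and the example scores are hypothetical.

```python
def estimate_fdr(matches, threshold):
    """Estimate the FDR among matches scoring at or above threshold.

    matches: list of (score, is_decoy) tuples, one per best match.
    Uses the simple estimator (#decoy hits) / (#target hits).
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted target hits: report 0 if nothing passed, else 1.
        return 0.0 if decoys == 0 else 1.0
    return min(1.0, decoys / targets)


# Made-up scores for illustration only.
example_matches = [(0.9, False), (0.8, False), (0.7, True),
                   (0.6, False), (0.5, True)]
fdr_loose = estimate_fdr(example_matches, 0.6)    # 1 decoy vs 3 targets
fdr_strict = estimate_fdr(example_matches, 0.75)  # 0 decoys vs 2 targets
```

Raising the score threshold until the estimated FDR falls below a chosen level is the usual way such an estimate is turned into a quality-control filter.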
|
7
|
Hong P, Sun H, Sha L, Pu Y, Khatri K, Yu X, Tang Y, Lin C. GlycoDeNovo - an Efficient Algorithm for Accurate de novo Glycan Topology Reconstruction from Tandem Mass Spectra. Journal of the American Society for Mass Spectrometry 2017; 28:2288-2301. [PMID: 28786094 PMCID: PMC5647224 DOI: 10.1007/s13361-017-1760-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 04/07/2017] [Revised: 07/03/2017] [Accepted: 07/09/2017] [Indexed: 05/15/2023]
Abstract
A major challenge in glycomics is the characterization of complex glycan structures that are essential for understanding their diverse roles in many biological processes. We present a novel efficient computational approach, named GlycoDeNovo, for accurate elucidation of the glycan topologies from their tandem mass spectra. Given a spectrum, GlycoDeNovo first builds an interpretation-graph specifying how to interpret each peak using preceding interpreted peaks. It then reconstructs the topologies of peaks that contribute to interpreting the precursor ion. We theoretically prove that GlycoDeNovo is highly efficient. A major innovative feature added to GlycoDeNovo is a data-driven IonClassifier which can be used to effectively rank candidate topologies. IonClassifier is automatically learned from experimental spectra of known glycans to distinguish B- and C-type ions from all other ion types. Our results showed that GlycoDeNovo is robust and accurate for topology reconstruction of glycans from their tandem mass spectra.
Affiliation(s)
- Pengyu Hong
- Department of Computer Science, Brandeis University, Waltham, MA, 02453, USA
- Hui Sun
- Department of Computer Science, Brandeis University, Waltham, MA, 02453, USA
- Long Sha
- Department of Computer Science, Brandeis University, Waltham, MA, 02453, USA
- Yi Pu
- Department of Chemistry, Boston University, Boston, MA, 02215, USA
- Kshitij Khatri
- Department of Biochemistry, Boston University School of Medicine, Boston, MA, 02118, USA
- Xiang Yu
- Department of Biochemistry, Boston University School of Medicine, Boston, MA, 02118, USA
- Yang Tang
- Department of Chemistry, Boston University, Boston, MA, 02215, USA
- Cheng Lin
- Department of Biochemistry, Boston University School of Medicine, Boston, MA, 02118, USA
|
8
|
Hu H, Khatri K, Zaia J. Algorithms and design strategies towards automated glycoproteomics analysis. Mass Spectrometry Reviews 2017; 36:475-498. [PMID: 26728195 PMCID: PMC4931994 DOI: 10.1002/mas.21487] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 07/10/2015] [Accepted: 11/30/2015] [Indexed: 05/09/2023]
Abstract
Glycoproteomics involves the study of glycosylation events on protein sequences ranging from purified proteins to whole proteome scales. Understanding these complex post-translational modification (PTM) events requires elucidation of the glycan moieties (monosaccharide sequences and glycosidic linkages between residues), protein sequences, as well as site-specific attachment of glycan moieties onto protein sequences, in a spatial and temporal manner in a variety of biological contexts. Compared with proteomics, bioinformatics for glycoproteomics is immature and many researchers still rely on tedious manual interpretation of glycoproteomics data. As sample preparation protocols and analysis techniques have matured, the number of publications on glycoproteomics and bioinformatics has increased substantially; however, the lack of consensus on tool development and code reuse limits the dissemination of bioinformatics tools because it requires significant effort to migrate a computational tool tailored for one method design to alternative methods. This review discusses algorithms and methods in glycoproteomics, and refers to the general proteomics field for potential solutions. It also introduces general strategies for tool integration and pipeline construction in order to better serve the glycoproteomics community. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:475-498, 2017.
Affiliation(s)
- Han Hu
- Bioinformatics Program, Boston University, Boston, Massachusetts 02215, USA
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
- Kshitij Khatri
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
- Joseph Zaia
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
|
9
|
Sun W, Kuljanin M, Pittock P, Ma B, Zhang K, Lajoie GA. An Effective Approach for Glycan Structure De Novo Sequencing From HCD Spectra. IEEE Trans Nanobioscience 2016; 15:177-84. [PMID: 26800543 DOI: 10.1109/tnb.2016.2519861] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
Abstract
Mass spectrometry has become a widely used analytical technique for proteomics studies because of its high throughput and sensitivity. One specific application is the characterization of glycan structures. Glycosylation is a frequently occurring post-translational modification of proteins that is relevant to human health. Therefore, it is important to develop effective computational methods to automate the identification of glycan structures from mass spectral data. In our research, we mathematically formulated the glycan de novo sequencing problem and proposed a heuristic algorithm for glycan de novo sequencing from HCD MS/MS spectra of N-linked glycopeptides. The algorithm proceeds along a carefully designed pathway to construct the best-matched tree structure from an MS/MS spectrum. Experimental results showed that our proposed approach can effectively identify glycan structures from HCD MS/MS spectra.
|
10
|
Kumozaki S, Sato K, Sakakibara Y. A Machine Learning Based Approach to de novo Sequencing of Glycans from Tandem Mass Spectrometry Spectrum. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:1267-1274. [PMID: 26671799 DOI: 10.1109/tcbb.2015.2430317] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]
Abstract
Recently, glycomics has been actively studied and various technologies for glycomics have been rapidly developed. Currently, tandem mass spectrometry (MS/MS) is one of the key experimental tools for identification of structures of oligosaccharides. MS/MS can observe peaks of fragmented glycan ions, including cross-ring ions resulting from internal cleavages, which provide valuable information to infer glycan structures. Thus, the aim of de novo sequencing of glycans is to find the most probable assignments of observed MS/MS peaks to glycan substructures without databases. However, there are few satisfactory algorithms for glycan de novo sequencing from MS/MS spectra. We present a machine learning based approach to de novo sequencing of glycans from MS/MS spectra. First, we build a suitable model for the fragmentation of glycans including cross-ring ions, and implement a solver that employs Lagrangian relaxation with a dynamic programming technique. Then, to optimize scores for the algorithm, we introduce a machine learning technique called structured support vector machines that enables us to learn parameters, including scores for cross-ring ions, from training data, i.e., known glycan mass spectra. Furthermore, we implement additional constraints for core structures of well-known glycan types including N-linked glycans and O-linked glycans. This enables us to predict more accurate glycan structures if the glycan type of given spectra is known. Computational experiments show that our algorithm performs accurate de novo sequencing of glycans. The implementation of our algorithm and the datasets are available at http://glyfon.dna.bio.keio.ac.jp/.
|
11
|
Dong L, Shi B, Tian G, Li Y, Wang B, Zhou M. An Accurate de novo Algorithm for Glycan Topology Determination from Mass Spectra. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:568-578. [PMID: 26357268 DOI: 10.1109/tcbb.2014.2368981] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 06/05/2023]
Abstract
Determining the glycan topology automatically from mass spectra represents a great challenge. Existing methods fall into approximate and exact ones. The former, including greedy and heuristic methods, reduce the computational complexity but suffer from information loss during glycan interpretation. The latter, including dynamic programming and exhaustive enumeration, are much slower than the former. In the past years, nearly all emerging methods adopted a tree structure to represent a glycan. They share problems such as repetitive peak counting when reconstructing a candidate structure. In addition, tree-based glycan representation methods often have to give different computational formulas for binary and ternary glycans. We propose a new directed acyclic graph structure for glycan representation. Based on it, this work develops a de novo algorithm that accurately reconstructs the tree structure iteratively from mass spectra, using logical constraints and some known biosynthesis rules, with a single computational formula. The experiments on multiple complex glycans extracted from human serum show that the proposed algorithm can determine a glycan topology with higher accuracy than prior methods without increasing the computational burden.
|
12
|
Sun W, Lajoie GA, Ma B, Zhang K. A Novel Algorithm for Glycan de novo Sequencing Using Tandem Mass Spectrometry. Bioinformatics Research and Applications 2015. [DOI: 10.1007/978-3-319-19048-8_27] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/21/2023]
|
13
|
Woodin CL, Maxon M, Desaire H. Software for automated interpretation of mass spectrometry data from glycans and glycopeptides. Analyst 2013; 138:2793-803. [PMID: 23293784 DOI: 10.1039/c2an36042j] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Indexed: 12/23/2022]
Abstract
The purpose of this review is to provide those interested in glycosylation analysis with the most updated information on the availability of automated tools for MS characterization of N-linked and O-linked glycosylation types. Specifically, this review describes software tools that facilitate elucidation of glycosylation from MS data on the basis of mass alone, as well as software designed to speed the interpretation of glycan and glycopeptide fragmentation from MS/MS data. This review focuses equally on software designed to interpret the composition of released glycans and on tools to characterize N-linked and O-linked glycopeptides. Several websites have been compiled and described that will be helpful to the reader who is interested in further exploring the described tools.
Affiliation(s)
- Carrie L Woodin
- Department of Chemistry, University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA
|
14
|
Wu SW, Liang SY, Pu TH, Chang FY, Khoo KH. Sweet-Heart - an integrated suite of enabling computational tools for automated MS2/MS3 sequencing and identification of glycopeptides. J Proteomics 2013; 84:1-16. [PMID: 23568021 DOI: 10.1016/j.jprot.2013.03.026] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Received: 11/09/2012] [Revised: 02/12/2013] [Accepted: 03/10/2013] [Indexed: 11/26/2022]
Abstract
High efficiency identification of intact glycopeptides from a shotgun glycoproteomic LC-MS2 dataset remains problematic. The prevalent mode of identifying the de-N-glycosylated peptides is littered with false positives and addresses only the issue of site occupancy. Here, we present Sweet-Heart, a computational tool set developed to tackle the heart of the problems in MS2 sequencing of glycopeptide. It accepts low resolution and low accuracy ion trap MS2 data, filters for glycopeptides, couples knowledge-based de novo interpretation of glycosylation-dependent fragmentation pattern with protein database search, and uses machine-learning algorithm to score the computed glyco and peptide combinations. Higher ranking candidates are then compiled into a list of MS2/MS3 entries to drive subsequent rounds of targeted MS3 sequencing of putative peptide backbone, allowing its validation by database search in a fully automated fashion. With additional fishing out of all related glycoforms and final data integration, the platform proves to be sufficiently sensitive and selective, conducive to novel glycosylation discovery, and robust enough to discriminate, among others, N-glycolyl neuraminic acid/fucose from N-acetyl neuraminic acid/hexose. A critical appraisal of its computing performance shows that Sweet-Heart allows high sensitivity comprehensive mapping of site-specific glycosylation for isolated glycoproteins and facilitates analysis of glycoproteomic data.
Biological significance: The biological relevance of protein site-specific glycosylation cannot be meaningfully addressed without first defining its pattern by direct analysis of glycopeptides. Sweet-Heart is a novel suite of computational tools allowing for automated analysis of mass spectrometry-based glycopeptide sequencing data. It is developed to accept ion trap MS2/MS3 data and uses a machine learning algorithm to score and rank the candidate peptide core and glycosyl substituent combinations. By eliminating the need for manual, labor-intensive, and subjective data interpretation, it facilitates high throughput shotgun glycoproteomic data analysis and is conducive to identification of unanticipated glycosylation, as demonstrated here with a recombinant EGFR.
Affiliation(s)
- Sz-Wei Wu
- Institute of Biochemical Sciences, National Taiwan University, Taiwan
|
15
|
Li F, Glinskii OV, Glinsky VV. Glycobioinformatics: Current strategies and tools for data mining in MS-based glycoproteomics. Proteomics 2012; 13:341-54. [DOI: 10.1002/pmic.201200149] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 04/10/2012] [Revised: 10/06/2012] [Accepted: 11/06/2012] [Indexed: 12/18/2022]
|
16
|
Woodin CL, Hua D, Maxon M, Rebecchi KR, Go EP, Desaire H. GlycoPep grader: a web-based utility for assigning the composition of N-linked glycopeptides. Anal Chem 2012; 84:4821-9. [PMID: 22540370 DOI: 10.1021/ac300393t] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.3] [Indexed: 12/20/2022]
Abstract
GlycoPep grader (GPG) is a freely available software tool designed to accelerate the process of accurately determining glycopeptide composition from tandem mass spectrometric data. GPG relies on the identification of unique dissociation patterns shown for high mannose, hybrid, and complex N-linked glycoprotein types, including patterns specific to those structures containing fucose or sialic acid residues. The novel GPG scoring algorithm scores potential candidate compositions of the same nominal mass against MS/MS data through evaluation of the Y(1) ion and other peptide-containing product ions, across multiple charge states, when applicable. In addition to evaluating the peptide portion of a given glycopeptide, the GPG algorithm predicts and scores product ions that result from unique neutral losses of terminal glycans. GPG has been applied to a variety of glycoproteins, including RNase B, asialofetuin, and transferrin, and the HIV envelope glycoprotein, CON-S gp140ΔCFI. The GPG software is implemented predominantly in PostgreSQL, with PHP as the presentation tier, and is publicly accessible online. Thus far, the algorithm has identified the correct compositional assignment from multiple candidate N-glycopeptides in all tests performed.
Affiliation(s)
- Carrie L Woodin
- Department of Chemistry, University of Kansas, Lawrence, Kansas 66047, United States
|
17
|
Hart-Smith G, Raftery MJ. Detection and characterization of low abundance glycopeptides via higher-energy C-trap dissociation and orbitrap mass analysis. Journal of the American Society for Mass Spectrometry 2012; 23:124-140. [PMID: 22083589 DOI: 10.1007/s13361-011-0273-y] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 07/27/2011] [Revised: 10/05/2011] [Accepted: 10/06/2011] [Indexed: 05/31/2023]
Abstract
Broad-scale mass spectrometric analyses of glycopeptides are constrained by the considerable complexity inherent to glycoproteomics, and techniques are still being actively developed to address the associated analytical difficulties. Here we apply Orbitrap mass analysis and higher-energy C-trap dissociation (HCD) to facilitate detailed insights into the compositions and heterogeneity of complex mixtures of low abundance glycopeptides. By generating diagnostic oxonium product ions at mass measurement errors of <5 ppm, highly selective glycopeptide precursor ion detections are made at sub-fmol limits of detection: analyses of proteolytic digests of a hen egg glycoprotein mixture detect 88 previously uncharacterized glycopeptides from 666 precursor ions selected for MS/MS, with only one false positive due to co-fragmentation of a non-glycosylated peptide with a glycopeptide. We also demonstrate that by (1) identifying multiple series of glycoforms using high mass accuracy single stage MS spectra, and (2) performing product ion scans at optimized HCD collision energies, the identification of peptide + N-acetylhexosamine (HexNAc) ions (Y1 ions) can be readily achieved at <5 ppm mass measurement errors. These data allow base peptide sequences and glycan compositional information to be attained with high confidence, even for glycopeptides that produce weak precursor ion signals and/or low quality MS/MS spectra. The glycopeptides characterized from low fmol abundances using these methods allow two previously unreported glycosylation sites on the Gallus gallus protein ovoglycoprotein (amino acids 82 and 90) to be confirmed; considerable glycan heterogeneities at amino acid 90 of ovoglycoprotein, and amino acids 34 and 77 of Gallus gallus ovomucoid are also revealed.
Affiliation(s)
- Gene Hart-Smith
- NSW Systems Biology Initiative, University of New South Wales, Sydney, New South Wales 2052, Australia
|
18
|
Mayampurath AM, Wu Y, Segu ZM, Mechref Y, Tang H. Improving confidence in detection and characterization of protein N-glycosylation sites and microheterogeneity. Rapid Communications in Mass Spectrometry 2011; 25:2007-2019. [PMID: 21698683 DOI: 10.1002/rcm.5059] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Indexed: 05/31/2023]
Abstract
Protein glycosylation is one of the most common post-translational modifications, estimated to occur in over 50% of human proteins. Mass spectrometry (MS)-based approaches involving different fragmentation mechanisms have been frequently used to detect and characterize protein N-linked glycosylations. In addition to the popular Collision-Induced Dissociation (CID), high-energy C-trap dissociation (HCD) fragmentation, which is a feature of a linear ion trap orbitrap hybrid mass spectrometer (LTQ Orbitrap), has been recently used for the fragmentation of tryptic N-linked glycopeptides in glycoprotein analysis. The oxonium ions observed with high mass accuracy in the HCD spectrum of glycopeptides can be combined with characteristic fragmentation patterns in the CID spectrum resulting from consecutive glycosidic bond cleavages, to improve the detection and characterization of N-linked glycopeptides. As a means of automating this process, we describe here GlypID 2.0, a software tool that implements several algorithmic approaches to utilize MS information including accurate precursor mass and spectral patterns from both HCD and CID spectra, thus allowing for an unequivocal and accurate characterization of N-linked glycosylation sites of proteins.
Affiliation(s)
- Anoop M Mayampurath
- School of Informatics & Computing, Indiana University, Bloomington, IN 4708, USA
|
19
|
Böcker S, Kehr B, Rasche F. Determination of glycan structure from tandem mass spectra. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:976-986. [PMID: 21173459 DOI: 10.1109/tcbb.2010.129] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 05/30/2023]
Abstract
Glycans are molecules made from simple sugars that form complex tree structures. Glycans constitute one of the most important protein modifications and identification of glycans remains a pressing problem in biology. Unfortunately, the structure of glycans is hard to predict from the genome sequence of an organism. In this paper, we consider the problem of deriving the topology of a glycan solely from tandem mass spectrometry (MS) data. We study how to generate glycan tree candidates that sufficiently match the sample mass spectrum, avoiding the combinatorial explosion of glycan structures. Unfortunately, the resulting problem is known to be computationally hard. We present an efficient exact algorithm for this problem based on fixed-parameter algorithmics that can process a spectrum in a matter of seconds. We also report some preliminary results of our method on experimental data, combining it with a preliminary candidate evaluation scheme. We show that our approach is fast in applications, and that we can achieve very good de novo identification results. Finally, we show how to count the number of glycan topologies for a fixed size or a fixed mass. We generalize this result to count the number of (labeled) trees with bounded out-degree, improving on results obtained using Pólya's enumeration theorem.
Affiliation(s)
- Sebastian Böcker
- Faculty for Mathematics and Computer Science, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, Jena 07743, Germany.
|
20
|
Tousi F, Hancock WS, Hincapie M. Technologies and strategies for glycoproteomics and glycomics and their application to clinical biomarker research. ANALYTICAL METHODS : ADVANCING METHODS AND APPLICATIONS 2011; 3:20-32. [PMID: 32938106 DOI: 10.1039/c0ay00413h] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 06/11/2023]
Abstract
Several approaches and technologies are currently available to study the glycosylated proteome (glycoproteomics) or the entire repertoire of glycans in a biological system (glycomics). The biological importance of glycosylation has driven the development of novel, sensitive separation and detection methods. New and improved methodologies, such as high throughput array systems and liquid chromatography-mass spectrometry for glycan profiling and sequencing, are emerging and are being applied in clinical research. A major thrust of glycoproteomics and glycomic clinical research is the application of these analytical tools to cancer research and is aimed at the discovery of glycan-based biomarkers for diagnosis of early stage human cancers, monitoring disease progression, measuring response to therapy, and detecting recurrence. The identification of cancer biomarkers requires a multidisciplinary approach and therefore this review discusses the strategies, technologies and methods currently used for N-glycoprotein/glycan biomarker research.
Affiliation(s)
- Fateme Tousi
- Barnett Institute and Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, USA.
- William S Hancock
- Barnett Institute and Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, USA.
- Marina Hincapie
- Barnett Institute and Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, USA.
|
21
|
Ma B. Challenges in Computational Analysis of Mass Spectrometry Data for Proteomics. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 2010; 25:107-123. [DOI: 10.1007/s11390-010-9309-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 09/01/2023]
|
22
|
Peltoniemi H, Joenväärä S, Renkonen R. De novo glycan structure search with the CID MS/MS spectra of native N-glycopeptides. Glycobiology 2009; 19:707-14. [PMID: 19270074 DOI: 10.1093/glycob/cwp034] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022]
Abstract
The aim of our study is to automatically analyze the glycan and peptide structures of N-glycopeptides without a need to release glycans from the glycopeptides. Our wet laboratory raw data represent a series of MS/MS mass spectra obtained from a reverse-phase liquid chromatography run of size-exclusion-enriched tryptic-digested glycopeptides from glycoproteins. The MS/MS spectra are first analyzed in order to identify glycosylated peptides and N-glycan monosaccharide compositions present on each glycopeptide. We further developed a Branch-and-Bound algorithm to search de novo N-glycan structures, i.e., monosaccharide compositions and their ordered sequences from native glycopeptides. Our de novo algorithm is based on iterative growth and selection of a population of glycan structures and it does not use databases of known glycan structures. We validate the algorithm with (i) in silico-generated spectra, with or without deteriorating deletions, (ii) a purified glycoprotein, transferrin, and (iii) a complex mixture of N-glycopeptides enriched from human plasma. Our Branch-and-Bound algorithm depicted glycan structures from all three of the above-mentioned input data types. Due to the large diversity of glycan structures, the results typically contained several proposed structures matching the spectra almost equally well. In conclusion, this algorithm automatically identifies glycopeptides and their structures from the MS/MS spectra and thus greatly reduces the number of possible glycan structures from the vast number of potential ones.
Affiliation(s)
- Hannu Peltoniemi
- Transplantation Laboratory & Infection Biology Research Program, Haartman Institute, University of Helsinki, Finland
|