1. Zhong J, Luo Y, Yang C, Yuan M, Wang S. ResNeXt-Based Rescoring Model for Proteoform Characterization in Top-Down Mass Spectra. Interdiscip Sci 2025:10.1007/s12539-025-00701-x. [PMID: 40381130 DOI: 10.1007/s12539-025-00701-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2024] [Revised: 03/02/2025] [Accepted: 03/04/2025] [Indexed: 05/19/2025]
Abstract
In top-down proteomics, accurately identifying and characterizing proteoforms by mass spectrometry is a critical objective, which makes the accuracy of identification results essential. Multiple primary structure alterations in proteins generate a diverse range of proteoforms, so the number of potential proteoforms grows exponentially. Moreover, the absence of a definitive reference set complicates the standardization of results. Enhancing the accuracy of proteoform characterization therefore remains a significant challenge. We introduce PrSMBooster, a ResNeXt-based deep learning model for rescoring proteoform spectrum matches (PrSMs) during proteoform characterization. As an ensemble method, PrSMBooster integrates four machine learning models (logistic regression, XGBoost, decision tree, and support vector machine) as weak learners to obtain PrSM features. The basic and latent features of each PrSM are then input into the ResNeXt model for final rescoring. To verify the effect and accuracy of PrSMBooster, it was compared with the characterization algorithm TopPIC across 47 independent mass spectrometry datasets from various species. The experimental results indicate that, in most datasets, the number of PrSMs obtained at a false discovery rate (FDR) of 1% increases after rescoring with PrSMBooster. Further analysis confirmed that PrSMBooster improves the accuracy of PrSM scoring, generates more mass spectrometry characterization results, and demonstrates strong generalization ability.
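The abstract compares the number of PrSMs retained at a 1% FDR before and after rescoring. A minimal sketch of how such a count is commonly obtained follows, assuming a target-decoy strategy and that higher scores mean better matches; the abstract does not state how the FDR was estimated, and the function name `count_at_fdr` and the toy scores are hypothetical.

```python
from typing import Iterable, Tuple


def count_at_fdr(prsms: Iterable[Tuple[float, bool]], fdr: float = 0.01) -> int:
    """Return the largest number of target PrSMs accepted at the given FDR.

    prsms: (score, is_decoy) pairs; higher score = better match (assumption).
    The FDR at each rank is estimated as decoys / targets seen so far
    (a simple target-decoy estimate, assumed here for illustration).
    """
    ranked = sorted(prsms, key=lambda p: p[0], reverse=True)
    targets = decoys = best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimate is still acceptable.
        if targets and decoys / targets <= fdr:
            best = targets
    return best


# Toy data: rescoring pushes the single decoy below all target matches.
original = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, False)]
rescored = [(9.0, False), (8.0, False), (4.0, True), (6.0, False), (5.0, False)]
print(count_at_fdr(original), count_at_fdr(rescored))
```

In this toy case a rescoring model that ranks the decoy lower lets more target PrSMs pass the same FDR threshold, which is the kind of gain the abstract describes.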
Affiliation(s)
- Jiancheng Zhong
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Yicheng Luo
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Chen Yang
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Maoqi Yuan
  - College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Shaokai Wang
  - Department of Mathematics, Hong Kong University of Science and Technology, 999077, Hong Kong SAR, China
2. Khristenko NA, Nagornov KO, Garcia C, Gasilova N, Gant M, Druart K, Kozhinov AN, Menin L, Chamot-Rooke J, Tsybin YO. Top-Down and Middle-Down Mass Spectrometry of Antibodies. Mol Cell Proteomics 2025:100989. [PMID: 40368137 DOI: 10.1016/j.mcpro.2025.100989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2025] [Revised: 04/19/2025] [Accepted: 04/22/2025] [Indexed: 05/16/2025] Open
Abstract
Therapeutic antibodies, primarily immunoglobulin G-based monoclonal antibodies, are developed to treat cancer, autoimmune disorders, and infectious diseases. Their large size, structural complexity, and heterogeneity pose significant analytical challenges, requiring the use of advanced characterization techniques. This review traces the 30-year evolution of top-down (TD) and middle-down (MD) mass spectrometry (MS) for antibody analysis, beginning with their initial applications and highlighting key advances and challenges throughout this period. TD MS allows for the analysis of intact antibodies, and MD MS performs analysis of the antibody subunits, even in complex biological samples. Both approaches preserve critical quality attributes such as sequence integrity, post-translational modifications (PTMs), disulfide bonds, and glycosylation patterns. Key milestones in TD and MD MS of antibodies include the use of structure-specific enzymes for subunit generation, the implementation of high-resolution mass spectrometers, and the adoption of non-ergodic ion activation methods such as electron transfer dissociation (ETD), electron capture dissociation (ECD), ultraviolet photodissociation (UVPD), and matrix-assisted laser desorption/ionization in-source decay (MALDI-ISD). The combination of complementary dissociation methods and the use of consecutive ion activation approaches has further enhanced TD/MD MS performance. The current TD MS record of antibody sequencing with terminal product ions is about 60% sequence coverage obtained using the activated ion-ETD approach on a high-resolution MS platform. Current MD MS analyses with about 95% sequence coverage were achieved using combinations of ion activation and dissociation techniques. The review explores TD and MD MS analysis of novel mAb modalities, including antibody-drug conjugates, bispecific antibodies, and endogenous antibodies from biofluids as well as immunoglobulin A and M-type classes.
Affiliation(s)
- Camille Garcia
  - Institut Pasteur, Université Paris Cité, and CNRS UAR2024, Paris, France
- Natalia Gasilova
  - Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Megan Gant
  - Institut Pasteur, Université Paris Cité, and CNRS UAR2024, Paris, France
- Karen Druart
  - Institut Pasteur, Université Paris Cité, and CNRS UAR2024, Paris, France
- Laure Menin
  - Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Julia Chamot-Rooke
  - Institut Pasteur, Université Paris Cité, and CNRS UAR2024, Paris, France
- Yury O Tsybin
  - Spectrotech, 69006 Lyon, France; Spectroswiss, 1015 Lausanne, Switzerland
3. Su T, Hollas MAR, Fellers RT, Kelleher NL. Identification of Splice Variants and Isoforms in Transcriptomics and Proteomics. Annu Rev Biomed Data Sci 2023; 6:357-376. [PMID: 37561601 PMCID: PMC10840079 DOI: 10.1146/annurev-biodatasci-020722-044021] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 08/12/2023]
Abstract
Alternative splicing is pivotal to the regulation of gene expression and protein diversity in eukaryotic cells. The detection of alternative splicing events requires specific omics technologies. Although short-read RNA sequencing has successfully supported a plethora of investigations on alternative splicing, the emerging technologies of long-read RNA sequencing and top-down mass spectrometry open new opportunities to identify alternative splicing and protein isoforms with less ambiguity. Here, we summarize improvements in short-read RNA sequencing for alternative splicing analysis, including percent splicing index estimation and differential analysis. We also review the computational methods used in top-down proteomics analysis regarding proteoform identification, including the construction of databases of protein isoforms and statistical analyses of search results. While many improvements in sequencing and computational methods will result from emerging technologies, there should be future endeavors to increase the effectiveness, integration, and proteome coverage of alternative splicing events.
Affiliation(s)
- Taojunfeng Su
  - Department of Molecular Biosciences, Northwestern University, Evanston, Illinois, USA
- Michael A R Hollas
  - Proteomics Center of Excellence, Northwestern University, Evanston, Illinois, USA
- Ryan T Fellers
  - Proteomics Center of Excellence, Northwestern University, Evanston, Illinois, USA
- Neil L Kelleher
  - Department of Molecular Biosciences, Northwestern University, Evanston, Illinois, USA
  - Proteomics Center of Excellence, Northwestern University, Evanston, Illinois, USA
  - Department of Chemistry, Northwestern University, Evanston, Illinois, USA
4. Validation of De Novo Peptide Sequences with Bottom-Up Tag Convolution. Proteomes 2021; 10:proteomes10010001. [PMID: 35076636 PMCID: PMC8788492 DOI: 10.3390/proteomes10010001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/09/2021] [Revised: 12/22/2021] [Accepted: 12/23/2021] [Indexed: 11/16/2022] Open
Abstract
De novo sequencing is indispensable for the analysis of proteins from organisms with unknown genomes, novel splice variants, and antibodies. However, despite a variety of methods developed to this end, distinguishing between the correct interpretation of a mass spectrum and a number of incorrect alternatives often remains a challenge. Tag convolution is computed for a set of peptide sequence tags of a fixed length k generated from the input tandem mass spectra and can be viewed as a generalization of the well-known spectral convolution. We demonstrate its utility for validating de novo peptide sequences by using a set of those generated by the algorithm PepNovo+ from high-resolution bottom-up data sets for carbonic anhydrase 2 and the Fab region of alemtuzumab and indicate its further potential applications.
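The abstract describes tag convolution as a generalization of spectral convolution. As background, here is a minimal sketch of the classical spectral convolution only: the multiset of pairwise mass differences between two peak lists, binned so that a frequently occurring difference stands out. It does not implement the tag convolution of the paper, whose details are not given in the abstract; the function name, bin width, and toy peak lists are hypothetical.

```python
from collections import Counter
from typing import Iterable


def spectral_convolution(a: Iterable[float], b: Iterable[float],
                         bin_width: float = 0.01) -> Counter:
    """Count pairwise mass differences (y - x) for x in a, y in b.

    Differences are grouped into integer bins of size bin_width (in Da),
    so the returned Counter maps bin index -> number of peak pairs.
    """
    b_list = list(b)
    conv: Counter = Counter()
    for x in a:
        for y in b_list:
            conv[round((y - x) / bin_width)] += 1
    return conv


# Toy peak lists: the second is the first shifted by 16 Da.
spectrum_a = [100.0, 200.0, 300.0]
spectrum_b = [116.0, 216.0, 316.0]
conv = spectral_convolution(spectrum_a, spectrum_b)
top_bin, top_count = conv.most_common(1)[0]
print(top_bin * 0.01, top_count)  # most frequent mass difference and its support
```

A dominant bin at a nonzero difference suggests that the two peak lists are related by a constant mass shift, which is the intuition that convolution-based validation builds on.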
5. Srzentić K, Fornelli L, Tsybin YO, Loo JA, Seckler H, Agar JN, Anderson LC, Bai DL, Beck A, Brodbelt JS, van der Burgt YEM, Chamot-Rooke J, Chatterjee S, Chen Y, Clarke DJ, Danis PO, Diedrich JK, D'Ippolito RA, Dupré M, Gasilova N, Ge Y, Goo YA, Goodlett DR, Greer S, Haselmann KF, He L, Hendrickson CL, Hinkle JD, Holt MV, Hughes S, Hunt DF, Kelleher NL, Kozhinov AN, Lin Z, Malosse C, Marshall AG, Menin L, Millikin RJ, Nagornov KO, Nicolardi S, Paša-Tolić L, Pengelley S, Quebbemann NR, Resemann A, Sandoval W, Sarin R, Schmitt ND, Shabanowitz J, Shaw JB, Shortreed MR, Smith LM, Sobott F, Suckau D, Toby T, Weisbrod CR, Wildburger NC, Yates JR, Yoon SH, Young NL, Zhou M. Interlaboratory Study for Characterizing Monoclonal Antibodies by Top-Down and Middle-Down Mass Spectrometry. J Am Soc Mass Spectrom 2020; 31:1783-1802. [PMID: 32812765 PMCID: PMC7539639 DOI: 10.1021/jasms.0c00036] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Indexed: 05/02/2023]
Abstract
The Consortium for Top-Down Proteomics (www.topdownproteomics.org) launched the present study to assess the current state of top-down mass spectrometry (TD MS) and middle-down mass spectrometry (MD MS) for characterizing monoclonal antibody (mAb) primary structures, including their modifications. To meet the needs of the rapidly growing therapeutic antibody market, it is important to develop analytical strategies to characterize the heterogeneity of a therapeutic product's primary structure accurately and reproducibly. The major objective of the present study is to determine whether current TD/MD MS technologies and protocols can add value to the more commonly employed bottom-up (BU) approaches with regard to confirming protein integrity, sequencing variable domains, avoiding artifacts, and revealing modifications and their locations. We also aim to gather information on the common TD/MD MS methods and practices in the field. A panel of three mAbs was selected and centrally provided to 20 laboratories worldwide for the analysis: Sigma mAb standard (SiLuLite), NIST mAb standard, and the therapeutic mAb Herceptin (trastuzumab). Various MS instrument platforms and ion dissociation techniques were employed. The present study confirms that TD/MD MS tools are available in laboratories worldwide and provide complementary information to the BU approach that can be crucial for comprehensive mAb characterization. The current limitations, as well as possible solutions to overcome them, are also outlined. A primary limitation revealed by the results of the present study is that the expert knowledge in both experiment and data analysis is indispensable to practice TD/MD MS.
Affiliation(s)
- Kristina Srzentić
  - Northwestern University, Evanston, Illinois 60208-0001, United States
- Luca Fornelli
  - Northwestern University, Evanston, Illinois 60208-0001, United States
- Yury O Tsybin
  - Spectroswiss, EPFL Innovation Park, Building I, 1015 Lausanne, Switzerland
- Joseph A Loo
  - University of California-Los Angeles, Los Angeles, California 90095, United States
- Henrique Seckler
  - Northwestern University, Evanston, Illinois 60208-0001, United States
- Jeffrey N Agar
  - Northeastern University, Boston, Massachusetts 02115, United States
- Lissa C Anderson
  - National High Magnetic Field Laboratory, Tallahassee, Florida 32310, United States
- Dina L Bai
  - University of Virginia, Charlottesville, Virginia 22901, United States
- Alain Beck
  - Centre d'immunologie Pierre Fabre, 74160 Saint-Julien-en-Genevois, France
- Yunqiu Chen
  - Biogen, Inc., Cambridge, Massachusetts 02142-1031, United States
- David J Clarke
  - The University of Edinburgh, EH9 3FJ Edinburgh, United Kingdom
- Paul O Danis
  - Consortium for Top-Down Proteomics, Cambridge, Massachusetts 02142, United States
- Jolene K Diedrich
  - The Scripps Research Institute, La Jolla, California 92037, United States
- Natalia Gasilova
  - Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Ying Ge
  - University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Young Ah Goo
  - University of Maryland, Baltimore, Maryland 21201, United States
- David R Goodlett
  - University of Maryland, Baltimore, Maryland 21201, United States
- Sylvester Greer
  - University of Texas at Austin, Austin, Texas 78712-1224, United States
- Lidong He
  - National High Magnetic Field Laboratory, Tallahassee, Florida 32310, United States
- Joshua D Hinkle
  - University of Virginia, Charlottesville, Virginia 22901, United States
- Matthew V Holt
  - Baylor College of Medicine, Houston, Texas 77030-3411, United States
- Sam Hughes
  - The University of Edinburgh, EH9 3FJ Edinburgh, United Kingdom
- Donald F Hunt
  - University of Virginia, Charlottesville, Virginia 22901, United States
- Neil L Kelleher
  - Northwestern University, Evanston, Illinois 60208-0001, United States
- Anton N Kozhinov
  - Spectroswiss, EPFL Innovation Park, Building I, 1015 Lausanne, Switzerland
- Ziqing Lin
  - University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Alan G Marshall
  - National High Magnetic Field Laboratory, Tallahassee, Florida 32310, United States
  - Florida State University, Tallahassee, Florida 32310-4005, United States
- Laure Menin
  - Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Robert J Millikin
  - University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Simone Nicolardi
  - Leiden University Medical Centre, 2300 RC Leiden, The Netherlands
- Ljiljana Paša-Tolić
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Neil R Quebbemann
  - University of California-Los Angeles, Los Angeles, California 90095, United States
- Wendy Sandoval
  - Genentech, Inc., South San Francisco, California 94080-4990, United States
- Richa Sarin
  - Biogen, Inc., Cambridge, Massachusetts 02142-1031, United States
- Jared B Shaw
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Lloyd M Smith
  - University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Frank Sobott
  - University of Antwerp, 2000 Antwerp, Belgium
  - University of Leeds, LS2 9JT Leeds, United Kingdom
- Timothy Toby
  - Northwestern University, Evanston, Illinois 60208-0001, United States
- Chad R Weisbrod
  - National High Magnetic Field Laboratory, Tallahassee, Florida 32310, United States
- Norelle C Wildburger
  - Washington University School of Medicine, St. Louis, Missouri 63110, United States
- John R Yates
  - The Scripps Research Institute, La Jolla, California 92037, United States
- Sung Hwan Yoon
  - University of Maryland, Baltimore, Maryland 21201, United States
- Nicolas L Young
  - Baylor College of Medicine, Houston, Texas 77030-3411, United States
- Mowei Zhou
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
6. Zhong J, Sun Y, Xie M, Peng W, Zhang C, Wu FX, Wang J. Proteoform characterization based on top-down mass spectrometry. Brief Bioinform 2020; 22:1729-1750. [PMID: 32118252 DOI: 10.1093/bib/bbaa015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/01/2019] [Revised: 01/23/2020] [Indexed: 12/16/2022] Open
Abstract
Proteins are dominant executors of living processes. Compared to genetic variations, changes in the molecular structure and state of a protein (i.e. proteoforms) are more directly related to pathological changes in diseases. Characterizing proteoforms involves identifying and locating primary structure alterations (PSAs) in proteoforms, which is of practical importance for the advancement of the medical profession. With the development of mass spectrometry (MS) technology, the characterization of proteoforms based on top-down MS technology has become possible. This type of method is relatively new and faces many challenges. Since the proteoform identification is the most important process in characterizing proteoforms, we comprehensively review the existing proteoform identification methods in this study. Before identifying proteoforms, the spectra need to be preprocessed, and protein sequence databases can be filtered to speed up the identification. Therefore, we also summarize some popular deconvolution algorithms, various filtering algorithms for improving the proteoform identification performance and various scoring methods for localizing proteoforms. Moreover, commonly used methods were evaluated and compared in this review. We believe our review could help researchers better understand the current state of the development in this field and design new efficient algorithms for the proteoform characterization.
Affiliation(s)
- Jiancheng Zhong
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Yusui Sun
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Minzhu Xie
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Wei Peng
  - Kunming University of Science and Technology, Kunming, Yunnan, China
- Chushu Zhang
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Fang-Xiang Wu
  - College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
- Jianxin Wang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering at Central South University, Changsha, Hunan, China
7. Shaw JB, Liu W, Vasil′ev YV, Bracken CC, Malhan N, Guthals A, Beckman JS, Voinov VG. Direct Determination of Antibody Chain Pairing by Top-down and Middle-down Mass Spectrometry Using Electron Capture Dissociation and Ultraviolet Photodissociation. Anal Chem 2020; 92:766-773. [PMID: 31769659 PMCID: PMC7819135 DOI: 10.1021/acs.analchem.9b03129] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Indexed: 12/13/2022]
Abstract
One challenge associated with the discovery and development of monoclonal antibody (mAb) therapeutics is the determination of heavy chain and light chain pairing. Advances in MS instrumentation and MS/MS methods have greatly enhanced capabilities for the analysis of large intact proteins yielding much more detailed and accurate proteoform characterization. Consequently, direct interrogation of intact antibodies or F(ab')2 and Fab fragments has the potential to significantly streamline therapeutic mAb discovery processes. Here, we demonstrate for the first time the ability to efficiently cleave disulfide bonds linking heavy and light chains of mAbs using electron capture dissociation (ECD) and 157 nm ultraviolet photodissociation (UVPD). The combination of intact mAb, Fab, or F(ab')2 mass, intact LC and Fd masses, and CDR3 sequence coverage enabled determination of heavy chain and light chain pairing from a single experiment and experimental condition. These results demonstrate the potential of top-down and middle-down proteomics to significantly streamline therapeutic antibody discovery.
Affiliation(s)
- Jared B. Shaw
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
- Weijing Liu
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
- Yury V. Vasil′ev
  - e-MSion Inc., 2121 NE Jack London Drive, Corvallis, Oregon 97330, United States
  - Linus Pauling Institute and the Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
- Carter C. Bracken
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
- Neha Malhan
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
- Adrian Guthals
  - Mapp Biopharmaceutical Inc., 6160 Lusk Boulevard #105, San Diego, California 92121, United States
- Joseph S. Beckman
  - e-MSion Inc., 2121 NE Jack London Drive, Corvallis, Oregon 97330, United States
  - Linus Pauling Institute and the Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
- Valery G. Voinov
  - e-MSion Inc., 2121 NE Jack London Drive, Corvallis, Oregon 97330, United States
  - Linus Pauling Institute and the Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
8. Muth T, Renard BY. Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019; 19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 12/08/2016] [Indexed: 01/24/2023] Open
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments nowadays offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has been rarely evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation on the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types from collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation mode using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. 
Finally, we discuss the potential of de novo sequencing for now becoming more widely used in the field.
Affiliation(s)
- Thilo Muth
  - Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
- Bernhard Y Renard
  - Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
9. Srzentić K, Zhurov KO, Lobas AA, Nikitin G, Fornelli L, Gorshkov MV, Tsybin YO. Chemical-Mediated Digestion: An Alternative Realm for Middle-down Proteomics? J Proteome Res 2018; 17:2005-2016. [PMID: 29722266 DOI: 10.1021/acs.jproteome.7b00834] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/26/2022]
Abstract
Protein digestion in mass spectrometry (MS)-based bottom-up proteomics targets mainly lysine and arginine residues, yielding primarily 0.6-3 kDa peptides for the proteomes of organisms of all major kingdoms. Recent advances in MS technology enable analysis of complex mixtures of increasingly longer (>3 kDa) peptides in a high-throughput manner supporting the development of a middle-down proteomics (MDP) approach. Generating longer peptides is a paramount step in launching an MDP pipeline, but the quest for the selection of a cleaving agent that would provide the desired 3-15 kDa peptides remains open. Recent bioinformatics studies have shown that cleavage at the rarely occurring amino acid residues such as methionine (Met), tryptophan (Trp), or cysteine (Cys) would be suitable for MDP approach. Interestingly, chemical-mediated proteolytic cleavages uniquely allow targeting these rare amino acids, for which no specific proteolytic enzymes are known. Herein, as potential candidates for MDP-grade proteolysis, we have investigated the performance of chemical agents previously reported to target primarily Met, Trp, and Cys residues: CNBr, BNPS-Skatole (3-bromo-3-methyl-2-(2-nitrophenyl)sulfanylindole), and NTCB (2-nitro-5-thiobenzoic acid), respectively. Figures of merit such as digestion reproducibility, peptide size distribution, and occurrence of side reactions are discussed. The NTCB-based MDP workflow has demonstrated particularly attractive performance, and NTCB is put forward here as a potential cleaving agent for further MDP development.
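The abstract frames reagent selection as a question of which cleavage site yields peptides in the 3-15 kDa range. A rough in-silico sketch of that check follows, assuming complete cleavage at every occurrence of one residue and unmodified monoisotopic masses; it ignores the residue modifications that chemical cleavage introduces and any missed cleavages. All function names and the `n_terminal` flag are hypothetical and are not taken from the paper.

```python
from typing import List

# Monoisotopic masses (Da) of unmodified amino acid residues.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def cleave(seq: str, residue: str, n_terminal: bool = False) -> List[str]:
    """Split seq at every `residue`: after it by default, before it if n_terminal."""
    pieces, start = [], 0
    for i, aa in enumerate(seq):
        cut = i if n_terminal else i + 1
        if aa == residue and start < cut < len(seq):
            pieces.append(seq[start:cut])
            start = cut
    pieces.append(seq[start:])
    return pieces


def mono_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def fraction_in_window(seq: str, residue: str, lo: float = 3000.0,
                       hi: float = 15000.0, n_terminal: bool = False) -> float:
    """Fraction of resulting peptides whose mass falls within [lo, hi] Da."""
    pieces = cleave(seq, residue, n_terminal)
    hits = sum(1 for p in pieces if lo <= mono_mass(p) <= hi)
    return hits / len(pieces)


# Hypothetical toy sequence: cutting after Met gives one long and one short peptide.
toy = "G" * 60 + "M" + "G" * 10
print(cleave("GAMSPVMTC", "M"), fraction_in_window(toy, "M"))
```

Running `fraction_in_window` over a whole proteome for each candidate residue would give a first estimate of how well a cleavage site fits the middle-down size range, before experimental factors such as side reactions are considered.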
Affiliation(s)
- Kristina Srzentić
  - Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Anna A Lobas
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Leninsky Prospect 38, Moscow 119334, Russia
- Gennady Nikitin
  - Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Luca Fornelli
  - Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Leninsky Prospect 38, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology (State University), 9 Institutskiy per., Dolgoprudny, Moscow 141707, Russia
- Yury O Tsybin
  - Spectroswiss, EPFL Innovation Park, Lausanne 1015, Switzerland
10.
Affiliation(s)
- Bifan Chen
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Kyle A. Brown
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Ziqing Lin
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Ying Ge
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
11. Vyatkina K, Dekker LJM, Wu S, VanDuijn MM, Liu X, Tolić N, Luider TM, Paša-Tolić L. De Novo Sequencing of Peptides from High-Resolution Bottom-Up Tandem Mass Spectra using Top-Down Intended Methods. Proteomics 2017; 17. [PMID: 29110399 DOI: 10.1002/pmic.201600321] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/31/2016] [Revised: 09/15/2017] [Indexed: 11/10/2022]
Abstract
Although high-resolution mass spectrometers are becoming accessible to more and more laboratories, tandem (MS/MS) mass spectra are still often collected at low resolution. Even when spectra are acquired at high resolution, the software tools used to process them rarely take full advantage of it; the ability to specify a relative mass tolerance often remains the only feature the respective algorithms exploit. We argue that high-resolution MS/MS spectra are analyzed more efficiently with methods that account for the precision level more explicitly, and support this claim by demonstrating that a de novo sequencing framework originally developed for (high-resolution) top-down MS/MS data is well suited to processing high-resolution bottom-up datasets, even though a top-down-like deconvolution performed as the first step leaves at most a few peaks in many spectra.
Affiliation(s)
- Kira Vyatkina
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
  - Department of Mathematical and Information Technologies, Saint Petersburg Academic University, Russian Academy of Sciences, Saint Petersburg, Russia
  - Department of Information Technologies and Programming, ITMO University, Saint Petersburg, Russia
  - Department of Computer Technologies and Informatics, Saint Petersburg Electrotechnical University LETI, Saint Petersburg, Russia
- Lennard J M Dekker
  - Department of Neurology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Si Wu
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Martijn M VanDuijn
  - Department of Neurology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
- Nikola Tolić
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
- Theo M Luider
  - Department of Neurology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Ljiljana Paša-Tolić
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
|
12
|
Vyatkina K. De Novo Sequencing of Top-Down Tandem Mass Spectra: A Next Step towards Retrieving a Complete Protein Sequence. Proteomes 2017; 5:E6. [PMID: 28248257 PMCID: PMC5372227 DOI: 10.3390/proteomes5010006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/10/2016] [Revised: 01/30/2017] [Accepted: 02/04/2017] [Indexed: 11/16/2022] Open
Abstract
De novo sequencing of tandem (MS/MS) mass spectra represents the only way to determine the sequence of proteins from organisms with unknown genomes, or of those not directly encoded in a genome, such as antibodies or novel splice variants. Top-down mass spectrometry provides new opportunities for analyzing such proteins; however, retrieving a complete protein sequence from top-down MS/MS spectra remains a distant goal. In this paper, we review the state of the art on this subject, and enhance our previously developed Twister algorithm for de novo sequencing of peptides from top-down MS/MS spectra to derive longer sequence fragments of a target protein.
Affiliation(s)
- Kira Vyatkina
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, 7-9 Universitetskaya nab., St. Petersburg 199034, Russia.
  - Department of Mathematical and Information Technologies, Saint Petersburg Academic University, 8/3 Khlopina st., St. Petersburg 194021, Russia.
|
13
|
Guan X, Brownstein NC, Young NL, Marshall AG. Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry and tandem mass spectrometry for peptide de novo amino acid sequencing for a seven-protein mixture by paired single-residue transposed Lys-N and Lys-C digestion. Rapid Communications in Mass Spectrometry 2017; 31:207-217. [PMID: 27813191 DOI: 10.1002/rcm.7783] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/29/2016] [Revised: 10/29/2016] [Accepted: 10/30/2016] [Indexed: 06/06/2023]
Abstract
RATIONALE: Bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics to identify proteins from a sequence database. De novo sequencing is also available for sequencing peptides with relatively short sequence lengths. We recently showed that paired Lys-C and Lys-N proteases produce peptides of identical mass and similar retention time, but different tandem mass spectra. Such parallel experiments provide complementary information, and allow for up to 100% MS/MS sequence coverage. METHODS: Here, we report digestion by paired Lys-C and Lys-N proteases of a seven-protein mixture: human hemoglobin alpha, bovine carbonic anhydrase 2, horse skeletal muscle myoglobin, hen egg white lysozyme, bovine pancreatic ribonuclease, bovine rhodanese, and bovine serum albumin, followed by reversed-phase nanoflow liquid chromatography, collision-induced dissociation, and 14.5 T Fourier transform ion cyclotron resonance mass spectrometry. RESULTS: Matched pairs of product peptide ions of equal precursor mass and similar retention times from each digestion are compared, leveraging single-residue transposed information with independent interferences to confidently identify fragment ion types, residues, and peptides. Selected pairs of product ion mass spectra for de novo sequenced protein segments from each member of the mixture are presented. CONCLUSIONS: Pairs of the transposed product ions as well as complementary information from the parallel experiments allow for both high MS/MS coverage for long peptide sequences and high confidence in the amino acid identification. Moreover, the parallel experiments in the de novo sequencing reduce false-positive matches of product ions from the single-residue transposed peptides from the same segment, and thereby further improve the confidence in protein identification. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Xiaoyan Guan
  - Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Florida State University, 1800 East Paul Dirac Drive, Tallahassee, FL, 32310, USA
- Naomi C Brownstein
  - Department of Behavioral Sciences and Social Medicine, College of Medicine, Florida State University, 1115 W. Call St., Tallahassee, FL, 32306, USA
  - Department of Statistics, Florida State University, 117 N. Woodward Ave., Tallahassee, FL, 32306, USA
- Nicolas L Young
  - Verna & Marrs McLean Department of Biochemistry & Molecular Biology, Baylor College of Medicine, One Baylor Plaza, MS-125, Houston, TX, 77030-3411, USA
- Alan G Marshall
  - Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Florida State University, 1800 East Paul Dirac Drive, Tallahassee, FL, 32310, USA
  - Department of Chemistry and Biochemistry, Florida State University, 95 Chieftain Way, Tallahassee, FL, 32303, USA
|