1
Luo Z, Wang Q, Xia Y, Zhu X, Yang S, Xu Z, Gu L. DLBWE-Cys: a deep-learning-based tool for identifying cysteine S-carboxyethylation sites using binary-weight encoding. Front Genet 2025; 15:1464976. [PMID: 39845187 PMCID: PMC11751040 DOI: 10.3389/fgene.2024.1464976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Accepted: 12/23/2024] [Indexed: 01/24/2025] [Open access]
Abstract
Cysteine S-carboxyethylation, a novel post-translational modification (PTM), plays a critical role in the pathogenesis of autoimmune diseases, particularly ankylosing spondylitis. Accurate identification of S-carboxyethylation modification sites is essential for elucidating their functional mechanisms. Unfortunately, there are currently no computational tools that can accurately predict these sites, posing a significant challenge to this area of research. In this study, we developed a new deep learning model, DLBWE-Cys, which integrates CNN, BiLSTM, Bahdanau attention mechanisms, and a fully connected neural network (FNN), using Binary-Weight encoding specifically designed for the accurate identification of cysteine S-carboxyethylation sites. Our experimental results show that our model architecture outperforms other machine learning and deep learning models in 5-fold cross-validation and independent testing. Feature comparison experiments confirmed the superiority of our proposed Binary-Weight encoding method over other encoding techniques. t-SNE visualization further validated the model's effective classification capabilities. Additionally, we confirmed the similarity between the distribution of positional weights in our Binary-Weight encoding and the allocation of weights in attention mechanisms. Further experiments proved the effectiveness of our Binary-Weight encoding approach. Thus, this model paves the way for predicting cysteine S-carboxyethylation modification sites in protein sequences. The source code of DLBWE-Cys and the experimental data are available at: https://github.com/ztLuo-bioinfo/DLBWE-Cys.
Affiliation(s)
- Zhengtao Luo
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, China
- Anhui Provincial Engineering Research Center for Agricultural Information Perception and Intelligent Computing, Anhui Agricultural University, Hefei, Anhui, China
- Qingyong Wang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, China
- Anhui Provincial Engineering Research Center for Agricultural Information Perception and Intelligent Computing, Anhui Agricultural University, Hefei, Anhui, China
- Yingchun Xia
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, China
- Anhui Provincial Engineering Research Center for Agricultural Information Perception and Intelligent Computing, Anhui Agricultural University, Hefei, Anhui, China
- Xiaolei Zhu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, China
- Anhui Provincial Engineering Research Center for Agricultural Information Perception and Intelligent Computing, Anhui Agricultural University, Hefei, Anhui, China
- Shuai Yang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, China
- Anhui Provincial Engineering Research Center for Agricultural Information Perception and Intelligent Computing, Anhui Agricultural University, Hefei, Anhui, China
- Zhaochun Xu
- Computer Department, Jingdezhen Ceramic University, Jingdezhen, China
- School for Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, China
- Lichuan Gu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, China
- Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Hefei, Anhui, China
- Anhui Provincial Engineering Research Center for Agricultural Information Perception and Intelligent Computing, Anhui Agricultural University, Hefei, Anhui, China
2
Na S, Paek E. Demystifying PTM Identification Using MODplus: Best Practices and Pitfalls. Methods Mol Biol 2024; 2836:37-55. [PMID: 38995534 DOI: 10.1007/978-1-0716-4007-4_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
Tandem mass spectrometry (MS/MS) facilitates the rapid identification of posttranslational modifications (PTMs), which play a pivotal role in regulating numerous biological processes. This chapter explores recent advancements that expand the types of detectable PTMs and enhance the speed of PTM searches. We also delve into the computational challenges associated with searching for a multitude of PTMs simultaneously. The latter section introduces an automated procedure to identify an extensive range of PTMs using MODplus, a free PTM analysis software tool. We guide the reader through the preparation of the modification search, the determination of optional search parameters, the execution of the search, and the analysis of results, exemplified by a case study using a specific MS/MS dataset.
Affiliation(s)
- Seungjin Na
- Digital Omics Research Center, Korea Basic Science Institute, Cheongju, South Korea
- Eunok Paek
- Department of Computer Science, Hanyang University, Seoul, South Korea.
- Department of Artificial Intelligence, Hanyang University, Seoul, South Korea.
- Institute for Artificial Intelligence Research, Hanyang University, Seoul, South Korea.
3
Zhai Y, Chen L, Zhao Q, Zheng ZH, Chen ZN, Bian H, Yang X, Lu HY, Lin P, Chen X, Chen R, Sun HY, Fan LN, Zhang K, Wang B, Sun XX, Feng Z, Zhu YM, Zhou JS, Chen SR, Zhang T, Chen SY, Chen JJ, Zhang K, Wang Y, Chang Y, Zhang R, Zhang B, Wang LJ, Li XM, He Q, Yang XM, Nan G, Xie RH, Yang L, Yang JH, Zhu P. Cysteine carboxyethylation generates neoantigens to induce HLA-restricted autoimmunity. Science 2023; 379:eabg2482. [PMID: 36927018 DOI: 10.1126/science.abg2482] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Indexed: 03/18/2023]
Abstract
Autoimmune diseases such as ankylosing spondylitis (AS) can be driven by emerging neoantigens that disrupt immune tolerance. Here, we developed a workflow to profile posttranslational modifications involved in neoantigen formation. Using mass spectrometry, we identified a panel of cysteine residues differentially modified by carboxyethylation that required 3-hydroxypropionic acid to generate neoantigens in patients with AS. The lysosomal degradation of integrin αIIb [ITGA2B (CD41)] carboxyethylated at Cys96 (ITGA2B-ceC96) generated carboxyethylated peptides that were presented by HLA-DRB1*04 to stimulate CD4+ T cell responses and induce autoantibody production. Immunization of HLA-DR4 transgenic mice with the ITGA2B-ceC96 peptide promoted colitis and vertebral bone erosion. Thus, metabolite-induced cysteine carboxyethylation can give rise to pathogenic neoantigens that lead to autoreactive CD4+ T cell responses and autoantibody production in autoimmune diseases.
Affiliation(s)
- Yue Zhai
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Liang Chen
- School of Medicine, Shanghai University, Shanghai 200444, China
- Qian Zhao
- Clinical Systems Biology Laboratories, Translational Medicine Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450001, China
- Zhao-Hui Zheng
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Zhi-Nan Chen
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Huijie Bian
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Xu Yang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Huan-Yu Lu
- Department of Occupational and Environmental Health and the Ministry of Education Key Lab of Hazard Assessment and Control in Special Operational Environment, School of Public Health, Fourth Military Medical University, Xi'an 710032, China
- Peng Lin
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Xi Chen
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Ruo Chen
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Hao-Yang Sun
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Lin-Ni Fan
- State Key Laboratory of Cancer Biology, Department of Pathology, Xijing Hospital and School of Basic Medicine, Fourth Military Medical University, Xi'an 710032, China
- Kun Zhang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Bin Wang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Xiu-Xuan Sun
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Zhuan Feng
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Yu-Meng Zhu
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Jian-Sheng Zhou
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Shi-Rui Chen
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Tao Zhang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Si-Yu Chen
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Jun-Jie Chen
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Kui Zhang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Yan Wang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Yang Chang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Rui Zhang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Bei Zhang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Li-Juan Wang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Xiao-Min Li
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Qian He
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Xiang-Min Yang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Gang Nan
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Rong-Hua Xie
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Liu Yang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Jing-Hua Yang
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
- Clinical Systems Biology Laboratories, Translational Medicine Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450001, China
- Ping Zhu
- Department of Clinical Immunology, Xijing Hospital, and Department of Cell Biology of National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an 710032, China
4
Arab I, Fondrie WE, Laukens K, Bittremieux W. Semisupervised Machine Learning for Sensitive Open Modification Spectral Library Searching. J Proteome Res 2023; 22:585-593. [PMID: 36688569 DOI: 10.1021/acs.jproteome.2c00616] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 01/24/2023]
Abstract
A key analysis task in mass spectrometry proteomics is matching the acquired tandem mass spectra to their originating peptides by sequence database searching or spectral library searching. Machine learning is an increasingly popular postprocessing approach to maximize the number of confident spectrum identifications that can be obtained at a given false discovery rate threshold. Here, we have integrated semisupervised machine learning in the ANN-SoLo tool, an efficient spectral library search engine that is optimized for open modification searching to identify peptides with any type of post-translational modification. We show that machine learning rescoring boosts the number of spectra that can be identified for both standard searching and open searching, and we provide insights into relevant spectrum characteristics harnessed by the machine learning model. The semisupervised machine learning functionality has now been fully integrated into ANN-SoLo, which is available as open source under the permissive Apache 2.0 license on GitHub at https://github.com/bittremieux/ANN-SoLo.
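To illustrate what "the number of confident spectrum identifications at a given false discovery rate threshold" means in practice, here is a minimal sketch. It assumes a target-decoy setup in which FDR is estimated as decoys divided by targets above a score cutoff; the abstract does not state how ANN-SoLo estimates FDR, and the function name, input format, and scores below are hypothetical.

```python
def accepted_targets(psms, fdr_threshold=0.01):
    """Count target matches accepted at an estimated FDR threshold.

    psms is an iterable of (score, is_decoy) pairs (hypothetical format).
    FDR at each rank is estimated as decoys / targets seen so far, which
    is an assumed target-decoy estimate, not ANN-SoLo's own code.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the largest accepted set whose estimated FDR is within bounds.
        if targets > 0 and decoys / targets <= fdr_threshold:
            best = targets
    return best


# Illustrative scores only: True marks a decoy match.
example = [(10, False), (9, False), (8, False), (7, True), (6, False)]
print(accepted_targets(example, fdr_threshold=0.01))
print(accepted_targets(example, fdr_threshold=0.25))
```

Under this estimate, a rescoring step helps when it pushes decoy matches lower in the ranking, so that more target matches fit under the same threshold.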
Affiliation(s)
- Issar Arab
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- Kris Laukens
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
5
Kacen A, Javitt A, Kramer MP, Morgenstern D, Tsaban T, Shmueli MD, Teo GC, da Veiga Leprevost F, Barnea E, Yu F, Admon A, Eisenbach L, Samuels Y, Schueler-Furman O, Levin Y, Nesvizhskii AI, Merbl Y. Post-translational modifications reshape the antigenic landscape of the MHC I immunopeptidome in tumors. Nat Biotechnol 2023; 41:239-251. [PMID: 36203013 PMCID: PMC11197725 DOI: 10.1038/s41587-022-01464-2] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Received: 11/02/2021] [Accepted: 08/09/2022] [Indexed: 11/08/2022]
Abstract
Post-translational modification (PTM) of antigens provides an additional source of specificities targeted by immune responses to tumors or pathogens, but identifying antigen PTMs and assessing their role in shaping the immunopeptidome is challenging. Here we describe the Protein Modification Integrated Search Engine (PROMISE), an antigen discovery pipeline that enables the analysis of 29 different PTM combinations from multiple clinical cohorts and cell lines. We expanded the antigen landscape, uncovering human leukocyte antigen class I binding motifs defined by specific PTMs with haplotype-specific binding preferences and revealing disease-specific modified targets, including thousands of new cancer-specific antigens that can be shared between patients and across cancer types. Furthermore, we uncovered a subset of modified peptides that are specific to cancer tissue and driven by post-translational changes that occurred in the tumor proteome. Our findings highlight principles of PTM-driven antigenicity, which may have broad implications for T cell-mediated therapies in cancer and beyond.
Affiliation(s)
- Assaf Kacen
- Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Aaron Javitt
- Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Matthias P Kramer
- Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- David Morgenstern
- De Botton Institute for Protein Profiling, Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot, Israel
- Tomer Tsaban
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Merav D Shmueli
- Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Eilon Barnea
- Faculty of Biology, Technion-Israel Institute of Technology, Haifa, Israel
- Fengchao Yu
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Arie Admon
- Faculty of Biology, Technion-Israel Institute of Technology, Haifa, Israel
- Lea Eisenbach
- Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Yardena Samuels
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Yishai Levin
- De Botton Institute for Protein Profiling, Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot, Israel
- Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Yifat Merbl
- Department of Immunology, Weizmann Institute of Science, Rehovot, Israel.
6
Robusti G, Vai A, Bonaldi T, Noberini R. Investigating pathological epigenetic aberrations by epi-proteomics. Clin Epigenetics 2022; 14:145. [PMID: 36371348 PMCID: PMC9652867 DOI: 10.1186/s13148-022-01371-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Accepted: 11/04/2022] [Indexed: 11/13/2022] [Open access]
Abstract
Epigenetics includes a complex set of processes that alter gene activity without modifying the DNA sequence, which ultimately determines how the genetic information common to all the cells of an organism is used to generate different cell types. Dysregulation in the deposition and maintenance of epigenetic features, which include histone posttranslational modifications (PTMs) and histone variants, can result in the inappropriate expression or silencing of genes, often leading to diseased states, including cancer. The investigation of histone PTMs and variants in the context of clinical samples has highlighted their importance as biomarkers for patient stratification and as key players in aberrant epigenetic mechanisms potentially targetable for therapy. Mass spectrometry (MS) has emerged as the most powerful and versatile tool for the comprehensive, unbiased and quantitative analysis of histone proteoforms. In recent years, these approaches-which we refer to as "epi-proteomics"-have demonstrated their usefulness for the investigation of epigenetic mechanisms in pathological conditions, offering a number of advantages compared with the antibody-based methods traditionally used to profile clinical samples. In this review article, we will provide a critical overview of the MS-based approaches that can be employed to study histone PTMs and variants in clinical samples, with a strong focus on the latest advances in this area, such as the analysis of uncommon modifications and the integration of epi-proteomics data into multi-OMICs approaches, as well as the challenges to be addressed to fully exploit the potential of this novel field of research.
Affiliation(s)
- Giulia Robusti
- Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, 20139 Milan, Italy (GRID: grid.15667.33; ISNI: 0000 0004 1757 0843)
- Alessandro Vai
- Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, 20139 Milan, Italy (GRID: grid.15667.33; ISNI: 0000 0004 1757 0843)
- Tiziana Bonaldi
- Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, 20139 Milan, Italy (GRID: grid.15667.33; ISNI: 0000 0004 1757 0843); Department of Oncology and Hematology-Oncology, University of Milan, 20122 Milan, Italy (GRID: grid.4708.b; ISNI: 0000 0004 1757 2822)
- Roberta Noberini
- Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, 20139 Milan, Italy (GRID: grid.15667.33; ISNI: 0000 0004 1757 0843)
7
Kawai T, Matsumori N, Otsuka K. Recent advances in microscale separation techniques for lipidome analysis. Analyst 2021; 146:7418-7430. [PMID: 34787600 DOI: 10.1039/d1an00967b] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
This review paper highlights the recent research on liquid-phase microscale separation techniques for lipidome analysis over the last 10 years, mainly focusing on capillary liquid chromatography (LC) and capillary electrophoresis (CE) coupled with mass spectrometry (MS). Lipids are one of the most important classes of biomolecules which are involved in the cell membrane, energy storage, signal transduction, and so on. Since lipids include a variety of hydrophobic compounds including numerous structural isomers, lipidomes are a challenging target in bioanalytical chemistry. MS is the key technology that comprehensively identifies lipids; however, separation techniques like LC and CE are necessary prior to MS detection in order to avoid ionization suppression and resolve structural isomers. Separation techniques using μm-scale columns, such as a fused silica capillary and microfluidic device, are effective at realizing high-resolution separation. Microscale separation usually employs a nL-scale flow, which is also compatible with nanoelectrospray ionization-MS that achieves high sensitivity. Owing to such analytical advantages, microscale separation techniques like capillary/microchip LC and CE have been employed for more than 100 lipidome studies. Such techniques are still being evolved and achieving further higher resolution and wider coverage of lipidomes. Therefore, microscale separation techniques are promising as the fundamental technology in next-generation lipidome analysis.
Affiliation(s)
- Takayuki Kawai
- Department of Chemistry, Faculty of Science, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan.
- Nobuaki Matsumori
- Department of Chemistry, Faculty of Science, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan.
- Koji Otsuka
- Department of Material Chemistry, Graduate School of Engineering, Kyoto University, Katsura, Nishikyo-ku, Kyoto 615-8510, Japan.
8
Xu L, Lu Z, Yu S, Li G, Chen Y. Quantitative global proteome and phosphorylome analyses reveal potential biomarkers in kidney cancer. Oncol Rep 2021; 46:237. [PMID: 34528699 PMCID: PMC8453689 DOI: 10.3892/or.2021.8188] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/24/2019] [Accepted: 08/05/2021] [Indexed: 12/12/2022] [Open access]
Abstract
Currently, high‑throughput quantitative proteomic and transcriptomic approaches have been widely used for exploring the molecular mechanisms and acquiring biomarkers for cancers. Our study aimed to illuminate the multi-dimensional molecular mechanisms underlying renal cell carcinoma (RCC) via investigating the quantitative global proteome and the profile of phosphorylation. A total of 5,428 proteins and 8,632 phosphorylation sites were quantified in RCC tissues, with 709 proteins and 649 phosphorylation sites found to be altered in expression compared with the matched adjacent non‑tumor tissues. These differentially expressed proteins were mainly involved in metabolic process terms involving the glycolysis pathway, oxidative phosphorylation and fatty acid metabolism which have been considered to be a potential mechanism of RCC progression. Moreover, phosphorylation analysis indicated that these upregulated phosphorylated proteins are implicated in the glucagon signaling pathway and cholesterol metabolism, while the downregulated phosphorylated proteins were found to be predominantly involved in glycolysis, the pentose phosphate pathway, carbon metabolism and biosynthesis of amino acids. In addition, several new candidate proteins, CD14, MPO, NCF2, SOD2, PARP1, were found to be upregulated and MUT, ACADM, PCK1 were downregulated in RCC. These proteins may be recognized as new biomarkers for RCC. These findings could broaden our insight into the underlying molecular mechanisms of RCC and identify candidate biomarkers for the treatment of RCC.
Affiliation(s)
- Liwei Xu
- Department of Urology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310016, P.R. China
- Zeyi Lu
- Department of Urology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310016, P.R. China
- Shicheng Yu
- Department of Urology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310016, P.R. China
- Gonghui Li
- Department of Urology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310016, P.R. China
- Yuanlei Chen
- Department of Urology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310016, P.R. China
9
Keenan EK, Zachman DK, Hirschey MD. Discovering the landscape of protein modifications. Mol Cell 2021; 81:1868-1878. [PMID: 33798408 PMCID: PMC8106652 DOI: 10.1016/j.molcel.2021.03.015] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 10/19/2020] [Revised: 02/21/2021] [Accepted: 03/10/2021] [Indexed: 02/08/2023]
Abstract
Protein modifications modulate nearly every aspect of cell biology in organisms, ranging from Archaea to Eukaryotes. The earliest evidence of covalent protein modifications was found in the early 20th century by studying the amino acid composition of proteins by chemical hydrolysis. These discoveries challenged what defined a canonical amino acid. The advent and rapid adoption of mass-spectrometry-based proteomics in the latter part of the 20th century enabled a veritable explosion in the number of known protein modifications, with more than 500 discrete modifications counted today. Now, new computational tools in data science, machine learning, and artificial intelligence are poised to allow researchers to make significant progress in discovering new protein modifications and determining their function. In this review, we take an opportunity to revisit the historical discovery of key post-translational modifications, quantify the current landscape of covalent protein adducts, and assess the role that new computational tools will play in the future of this field.
Affiliation(s)
- E Keith Keenan
  - Duke Molecular Physiology Institute and Sarah W. Stedman Nutrition and Metabolism Center, Duke University Medical Center, Durham, NC 27701, USA; Department of Pharmacology & Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Derek K Zachman
  - Duke Molecular Physiology Institute and Sarah W. Stedman Nutrition and Metabolism Center, Duke University Medical Center, Durham, NC 27701, USA
- Matthew D Hirschey
  - Duke Molecular Physiology Institute and Sarah W. Stedman Nutrition and Metabolism Center, Duke University Medical Center, Durham, NC 27701, USA; Department of Pharmacology & Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA; Division of Endocrinology, Metabolism, & Nutrition, Department of Medicine, Duke University Medical Center, Durham, NC 27710, USA.
10
|
Bugyi F, Szabó D, Szabó G, Révész Á, Pape VFS, Soltész-Katona E, Tóth E, Kovács O, Langó T, Vékey K, Drahos L. Influence of Post-Translational Modifications on Protein Identification in Database Searches. ACS OMEGA 2021; 6:7469-7477. [PMID: 33778259 PMCID: PMC7992065 DOI: 10.1021/acsomega.0c05997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2020] [Accepted: 03/02/2021] [Indexed: 06/12/2023]
Abstract
Comprehensive analysis of post-translation modifications (PTMs) is an important mission of proteomics. However, the consideration of PTMs increases the search space and may therefore impair the efficiency of protein identification. Using thousands of proteomic searches, we investigated the practical aspects of considering multiple PTMs in Byonic searches for the maximization of protein and peptide hits. The inclusion of all PTMs, which occur with at least 2% frequency in the sample, has an advantageous effect on protein and peptide identification. A linear relationship was established between the number of considered PTMs and the number of reliably identified peptides and proteins. Even though they handle multiple modifications less efficiently, the results of MASCOT (using the Percolator function) and Andromeda (the search engine included in MaxQuant) became comparable to those of Byonic, in the case of a few PTMs.
Affiliation(s)
- Fanni Bugyi
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, Magyar Tudósok krt 2, H-1117 Budapest, Hungary
  - Hevesy György PhD School of Chemistry, Eötvös Loránd University, Pázmány Péter sétány 1/A, H-1117 Budapest, Hungary
- Dániel Szabó
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, Magyar Tudósok krt 2, H-1117 Budapest, Hungary
  - Hevesy György PhD School of Chemistry, Eötvös Loránd University, Pázmány Péter sétány 1/A, H-1117 Budapest, Hungary
- Győző Szabó
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, Magyar Tudósok krt 2, H-1117 Budapest, Hungary
  - Faculty of Informatics, Eötvös Loránd University, Pázmány Péter sétány 1/C, H-1117 Budapest, Hungary
- Ágnes Révész
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, Magyar Tudósok krt 2, H-1117 Budapest, Hungary
- Veronika F. S. Pape
  - Department of Physiology, Faculty of Medicine, Semmelweis University, Tűzoltó utca 37-47, H-1094 Budapest, Hungary
- Eszter Soltész-Katona
  - Department of Physiology, Faculty of Medicine, Semmelweis University, Tűzoltó utca 37-47, H-1094 Budapest, Hungary
  - ELKH Supported Research Groups, Gellérthegy u. 30-32, H-1016 Budapest, Hungary
- Eszter Tóth
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, Magyar Tudósok krt 2, H-1117 Budapest, Hungary
  - Institute of Enzymology, Research Centre for Natural Sciences, Magyar Tudósok krt 2., H-1117 Budapest, Hungary
- Orsolya Kovács
  - Department of Physiology, Faculty of Medicine, Semmelweis University, Tűzoltó utca 37-47, H-1094 Budapest, Hungary
  - Department of Genetics, Cell- and Immunobiology, Semmelweis University, Nagyvárad tér 4, H-1089 Budapest, Hungary
- Tamás Langó
  - Institute of Enzymology, Research Centre for Natural Sciences, Magyar Tudósok krt 2., H-1117 Budapest, Hungary
- Károly Vékey
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, Magyar Tudósok krt 2, H-1117 Budapest, Hungary
- László Drahos
  - Institute of Organic Chemistry, Research Centre for Natural Sciences, Magyar Tudósok krt 2, H-1117 Budapest, Hungary
11
|
Kawai T. Recent Advances in Trace Bioanalysis by Capillary Electrophoresis. ANAL SCI 2021; 37:27-36. [PMID: 33041311 DOI: 10.2116/analsci.20sar12] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/03/2020] [Accepted: 09/29/2020] [Indexed: 07/25/2024]
Abstract
Recently, single cell analysis is becoming more and more important to elucidate cellular heterogeneity. Except for nucleic acid that can be amplified by PCR, the required technical level for single cell analysis is extremely high and the appropriate design of sample preparation and a sensitive analytical system is necessary. Capillary/microchip electrophoresis (CE/MCE) can separate biomolecules in nL-scale solution with high resolution, and it is highly compatible with trace samples like a single cell. Coupled with highly sensitive detectors such as laser-induced fluorescence and nano-electrospray ionization-mass spectrometry, zmol level analytes can be detected. For further enhancing sensitivity, online sample preconcentration techniques can be employed. By integrating these high-sensitive techniques, single cell analysis of metabolites, proteins, and lipids have been achieved. This review paper highlights successful research on CE/MCE-based trace bioanalysis in recent 10 years. Firstly, an overview of basic knowledge on CE/MCE including sensitivity enhancement techniques is provided. Applications to trace bioanalysis are then introduced with discussion on current issues and future prospects.
Affiliation(s)
- Takayuki Kawai
  - RIKEN Center for Biosystems Dynamics Research
  - Graduate School of Frontier Biosciences, Osaka University
12
|
Abstract
Glycoproteomics is unquestionably on the rise and its current development benefits from past experience in proteomics, in particular when attending to bioinformatics needs. An extensive range of software solutions is available, but the reproducibility of mass spectrometry data processing remains challenging. One of the key issues in running automated glycopeptide identification software is the selection of a reference glycan composition file. The default choices are often too broad, and a fastidious literature search to properly target this selection can be avoided. This chapter suggests the use of GlyConnect Compozitor to collect relevant information on glycosylation in a given tissue or cell line and shape an appropriate glycan composition set that can be input in the majority of search engines accommodating user-defined compositions.
13
|
Smolikova G, Gorbach D, Lukasheva E, Mavropolo-Stolyarenko G, Bilova T, Soboleva A, Tsarev A, Romanovskaya E, Podolskaya E, Zhukov V, Tikhonovich I, Medvedev S, Hoehenwarter W, Frolov A. Bringing New Methods to the Seed Proteomics Platform: Challenges and Perspectives. Int J Mol Sci 2020; 21:E9162. [PMID: 33271881 PMCID: PMC7729594 DOI: 10.3390/ijms21239162] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/10/2020] [Revised: 11/26/2020] [Accepted: 11/27/2020] [Indexed: 12/14/2022] Open
Abstract
For centuries, crop plants have represented the basis of the daily human diet. Among them, cereals and legumes, accumulating oils, proteins, and carbohydrates in their seeds, distinctly dominate modern agriculture, thus play an essential role in food industry and fuel production. Therefore, seeds of crop plants are intensively studied by food chemists, biologists, biochemists, and nutritional physiologists. Accordingly, seed development and germination as well as age- and stress-related alterations in seed vigor, longevity, nutritional value, and safety can be addressed by a broad panel of analytical, biochemical, and physiological methods. Currently, functional genomics is one of the most powerful tools, giving direct access to characteristic metabolic changes accompanying plant development, senescence, and response to biotic or abiotic stress. Among individual post-genomic methodological platforms, proteomics represents one of the most effective ones, giving access to cellular metabolism at the level of proteins. During the recent decades, multiple methodological advances were introduced in different branches of life science, although only some of them were established in seed proteomics so far. Therefore, here we discuss main methodological approaches already employed in seed proteomics, as well as those still waiting for implementation in this field of plant research, with a special emphasis on sample preparation, data acquisition, processing, and post-processing. Thereby, the overall goal of this review is to bring new methodologies emerging in different areas of proteomics research (clinical, food, ecological, microbial, and plant proteomics) to the broad society of seed biologists.
Affiliation(s)
- Galina Smolikova
  - Department of Plant Physiology and Biochemistry, St. Petersburg State University; 199034 St. Petersburg, Russia; (G.S.); (T.B.); (S.M.)
- Daria Gorbach
  - Department of Biochemistry, St. Petersburg State University; 199178 St. Petersburg, Russia; (D.G.); (E.L.); (G.M.-S.); (A.S.); (A.T.); (E.R.)
- Elena Lukasheva
  - Department of Biochemistry, St. Petersburg State University; 199178 St. Petersburg, Russia; (D.G.); (E.L.); (G.M.-S.); (A.S.); (A.T.); (E.R.)
- Gregory Mavropolo-Stolyarenko
  - Department of Biochemistry, St. Petersburg State University; 199178 St. Petersburg, Russia; (D.G.); (E.L.); (G.M.-S.); (A.S.); (A.T.); (E.R.)
- Tatiana Bilova
  - Department of Plant Physiology and Biochemistry, St. Petersburg State University; 199034 St. Petersburg, Russia; (G.S.); (T.B.); (S.M.)
  - Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry; 06120 Halle (Saale), Germany
- Alena Soboleva
  - Department of Biochemistry, St. Petersburg State University; 199178 St. Petersburg, Russia; (D.G.); (E.L.); (G.M.-S.); (A.S.); (A.T.); (E.R.)
  - Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry; 06120 Halle (Saale), Germany
- Alexander Tsarev
  - Department of Biochemistry, St. Petersburg State University; 199178 St. Petersburg, Russia; (D.G.); (E.L.); (G.M.-S.); (A.S.); (A.T.); (E.R.)
  - Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry; 06120 Halle (Saale), Germany
- Ekaterina Romanovskaya
  - Department of Biochemistry, St. Petersburg State University; 199178 St. Petersburg, Russia; (D.G.); (E.L.); (G.M.-S.); (A.S.); (A.T.); (E.R.)
- Ekaterina Podolskaya
  - Institute of Analytical Instrumentation, Russian Academy of Science; 190103 St. Petersburg, Russia
  - Institute of Toxicology, Russian Federal Medical Agency; 192019 St. Petersburg, Russia
- Vladimir Zhukov
  - All-Russia Research Institute for Agricultural Microbiology; 196608 St. Petersburg, Russia; (V.Z.); (I.T.)
- Igor Tikhonovich
  - All-Russia Research Institute for Agricultural Microbiology; 196608 St. Petersburg, Russia; (V.Z.); (I.T.)
  - Department of Genetics and Biotechnology, St. Petersburg State University; 199034 St. Petersburg, Russia
- Sergei Medvedev
  - Department of Plant Physiology and Biochemistry, St. Petersburg State University; 199034 St. Petersburg, Russia; (G.S.); (T.B.); (S.M.)
- Wolfgang Hoehenwarter
  - Proteome Analytics Research Group, Leibniz Institute of Plant Biochemistry, 06120 Halle (Saale), Germany
- Andrej Frolov
  - Department of Biochemistry, St. Petersburg State University; 199178 St. Petersburg, Russia; (D.G.); (E.L.); (G.M.-S.); (A.S.); (A.T.); (E.R.)
  - Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry; 06120 Halle (Saale), Germany
14
|
Wang H, Leeming MG, Cochran BJ, Hook JM, Ho J, Nguyen GTH, Zhong L, Supuran CT, Donald WA. Nontargeted Identification of Plasma Proteins O-, N-, and S-Transmethylated by O-Methyl Organophosphates. Anal Chem 2020; 92:15420-15428. [PMID: 33200920 DOI: 10.1021/acs.analchem.0c03077] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
Organophosphates (OPs) are used worldwide as pesticides. However, acute and chronic exposure to OPs can cause serious adverse health effects. The mechanism of delayed OP toxicity is thought to involve off-target inhibition of serine proteases, although the precise molecular details remain unclear owing to the lack of an analytical method for global detection of protein targets of OPs. Here, we report the development of a mass spectrometry method to identify OP-adducted proteins from complex mixtures in a nontargeted manner. Human plasma was incubated with the OP dichlorvos that was 50% isotopically labeled and 50% unlabeled. Proteins and protein adducts were extracted, digested, and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) to detect "twin ions" of peptides that were covalently modified by a chemical reaction with dichlorvos. The LC-MS/MS data were processed by a blended data analytics software (Xenophile) to detect the amino acid residue sites of proteins that were covalently modified by exposure to OPs. We discovered that OPs can transmethylate the N, S, and O side chains of His, Cys, Glu, Asp, and Lys residues. For model systems, such transmethylation reactions were confirmed by LC-MS, nuclear magnetic resonance (NMR), and rationalized using electronic structure calculations. Methylation of the ubiquitous antioxidant glutathione by dichlorvos can decrease the reducing/oxidizing equilibrium of glutathione in liver extracts, which has been implicated in diseases and pathological conditions associated with delayed OP toxicity.
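The "twin ions" idea in this abstract can be illustrated with a minimal sketch. It assumes a centroided peak list of (m/z, intensity) pairs and a known mass difference between the labeled and unlabeled reagent. The function name, the tolerances, and the 3.0 Da shift used in the example are hypothetical placeholders; they are not taken from the paper or from the Xenophile software.

```python
def find_twin_ions(peaks, mass_shift, charge=1, mz_tol=0.01, max_ratio=1.5):
    """Return (light_mz, heavy_mz) pairs that look like twin ions.

    peaks      : iterable of (m/z, intensity) tuples (assumed centroided)
    mass_shift : mass difference in Da between labeled and unlabeled adduct
                 (an input the caller must supply; not stated in the abstract)
    charge     : assumed charge state of both peaks
    mz_tol     : allowed deviation from the expected m/z spacing
    max_ratio  : largest accepted intensity ratio; a 50/50 reagent mix
                 should give a ratio close to 1
    """
    ordered = sorted(peaks)
    target = mass_shift / charge  # expected m/z spacing of a twin pair
    pairs = []
    for i, (mz_light, int_light) in enumerate(ordered):
        for mz_heavy, int_heavy in ordered[i + 1:]:
            delta = mz_heavy - mz_light
            if delta > target + mz_tol:
                break  # peaks are sorted, so later ones are even further away
            if abs(delta - target) > mz_tol:
                continue
            low, high = sorted((int_light, int_heavy))
            if low <= 0:
                continue  # skip empty peaks to avoid division by zero
            if high / low <= max_ratio:
                pairs.append((mz_light, mz_heavy))
    return pairs


# Toy spectrum: one pair spaced by the placeholder shift, one unrelated peak.
example_peaks = [(500.0, 1000.0), (503.0, 950.0), (600.0, 400.0)]
twins = find_twin_ions(example_peaks, mass_shift=3.0)
```

A real workflow would also have to check co-elution and charge state across scans; this sketch only shows the pairing rule within one spectrum.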
Affiliation(s)
- Huixin Wang
  - School of Chemistry, University of New South Wales, Sydney, New South Wales 2052, Australia
- Michael G Leeming
  - Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, Victoria 3052, Australia
- Blake J Cochran
  - School of Medical Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- James M Hook
  - School of Chemistry, University of New South Wales, Sydney, New South Wales 2052, Australia
- Junming Ho
  - School of Chemistry, University of New South Wales, Sydney, New South Wales 2052, Australia
- Giang T H Nguyen
  - School of Chemistry, University of New South Wales, Sydney, New South Wales 2052, Australia
- Ling Zhong
  - Mark Wainwright Analytical Centre, University of New South Wales, Sydney, New South Wales 2052, Australia
- Claudiu T Supuran
  - Department of Neuroscience, Psychology, Drug Research and Child's Health, Section of Pharmaceutical and Nutraceutical Sciences, University of Florence, Sesto Fiorentino 50019, Italy
- William A Donald
  - School of Chemistry, University of New South Wales, Sydney, New South Wales 2052, Australia
15
|
Whetton AD, Preston GW, Abubeker S, Geifman N. Proteomics and Informatics for Understanding Phases and Identifying Biomarkers in COVID-19 Disease. J Proteome Res 2020; 19:4219-4232. [PMID: 32657586 PMCID: PMC7384384 DOI: 10.1021/acs.jproteome.0c00326] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 05/14/2020] [Indexed: 02/07/2023]
Abstract
The emergence of novel coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 coronavirus, has necessitated the urgent development of new diagnostic and therapeutic strategies. Rapid research and development, on an international scale, has already generated assays for detecting SARS-CoV-2 RNA and host immunoglobulins. However, the complexities of COVID-19 are such that fuller definitions of patient status, trajectory, sequelae, and responses to therapy are now required. There is accumulating evidence, from studies of both COVID-19 and the related disease SARS, that protein biomarkers could help to provide this definition. Proteins associated with blood coagulation (D-dimer), cell damage (lactate dehydrogenase), and the inflammatory response (e.g., C-reactive protein) have already been identified as possible predictors of COVID-19 severity or mortality. Proteomics technologies, with their ability to detect many proteins per analysis, have begun to extend these early findings. To be effective, proteomics strategies must include not only methods for comprehensive data acquisition (e.g., using mass spectrometry) but also informatics approaches via which to derive actionable information from large data sets. Here we review applications of proteomics to COVID-19 and SARS and outline how pipelines involving technologies such as artificial intelligence could be of value for research on these diseases.
Affiliation(s)
- Anthony D. Whetton
  - Stoller Biomarker Discovery Centre, Faculty of Biology Medicine and Health (FBMH), University of Manchester, Manchester M20 4GJ, United Kingdom
  - Stem Cell and Leukaemia Proteomics Laboratory, Manchester Cancer Research Centre, University of Manchester, Manchester M13 9PL, United Kingdom
  - Manchester National Institute for Health Biomedical Research Centre, Manchester M13 9WL, United Kingdom
- George W. Preston
  - Stoller Biomarker Discovery Centre, Faculty of Biology Medicine and Health (FBMH), University of Manchester, Manchester M20 4GJ, United Kingdom
  - Stem Cell and Leukaemia Proteomics Laboratory, Manchester Cancer Research Centre, University of Manchester, Manchester M13 9PL, United Kingdom
- Semira Abubeker
  - Stoller Biomarker Discovery Centre, Faculty of Biology Medicine and Health (FBMH), University of Manchester, Manchester M20 4GJ, United Kingdom
  - Stem Cell and Leukaemia Proteomics Laboratory, Manchester Cancer Research Centre, University of Manchester, Manchester M13 9PL, United Kingdom
- Nophar Geifman
  - Centre for Health Informatics, FBMH, University of Manchester, Manchester M13 9PL, United Kingdom
16
|
Preston GW, Yang L, Phillips DH, Maier CS. Visualisation tools for dependent peptide searches to support the exploration of in vitro protein modifications. PLoS One 2020; 15:e0235263. [PMID: 32639981 PMCID: PMC7343161 DOI: 10.1371/journal.pone.0235263] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/23/2020] [Accepted: 06/11/2020] [Indexed: 01/16/2023] Open
Abstract
Dependent peptide searching is a method for discovering covalently-modified peptides (and therefore proteins) in mass-spectrometry-based proteomics experiments. Being more permissive than standard search methods, it has the potential to discover novel modifications (e.g., post-translational modifications occurring in vivo, or modifications introduced in vitro). However, few studies have explored dependent peptide search results in an untargeted way. In the present study, we sought to evaluate dependent peptide searching as a means of characterising proteins that have been modified in vitro. We generated a model data set by analysing N-ethylmaleimide-treated bovine serum albumin, and performed dependent peptide searches using the popular MaxQuant software. To facilitate interpretation of the search results (hundreds of dependent peptides), we developed a series of visualisation tools (R scripts). We used the tools to assess the diversity of putative modifications in the albumin, and to pinpoint hypothesised modifications. We went on to explore the tools' generality via analyses of public data from studies of rat and human proteomes. Of 19 expected sites of modification (one in rat cofilin-1 and 18 across six different human plasma proteins), eight were found and correctly localised. Apparently, some sites went undetected because chemical enrichment had depleted necessary analytes (potential 'base' peptides). Our results demonstrate (i) the ability of the tools to provide accurate and informative visualisations, and (ii) the usefulness of dependent peptide searching for characterising in vitro protein modifications. Our model data are available via PRIDE/ProteomeXchange (accession number PXD013040).
Affiliation(s)
- George W. Preston
  - Department of Analytical, MRC-PHE Centre for Environment & Health, Environmental & Forensic Sciences, School of Population Health & Environmental Sciences, Faculty of Life Sciences & Medicine, King’s College London, London, England, United Kingdom
  - Department of Chemistry, Oregon State University, Corvallis, OR, United States of America
- Liping Yang
  - Department of Chemistry, Oregon State University, Corvallis, OR, United States of America
- David H. Phillips
  - Department of Analytical, MRC-PHE Centre for Environment & Health, Environmental & Forensic Sciences, School of Population Health & Environmental Sciences, Faculty of Life Sciences & Medicine, King’s College London, London, England, United Kingdom
- Claudia S. Maier
  - Department of Chemistry, Oregon State University, Corvallis, OR, United States of America
17
|
Na S, Paek E. Computational methods in mass spectrometry-based structural proteomics for studying protein structure, dynamics, and interactions. Comput Struct Biotechnol J 2020; 18:1391-1402. [PMID: 32637038 PMCID: PMC7322682 DOI: 10.1016/j.csbj.2020.06.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/04/2020] [Revised: 06/01/2020] [Accepted: 06/01/2020] [Indexed: 12/28/2022] Open
Abstract
Mass spectrometry (MS) has made enormous contributions to comprehensive protein identification and quantification in proteomics. MS is also gaining momentum for structural biology in a variety of ways, complementing conventional structural biology techniques. Here, we will review how MS-based techniques, such as hydrogen/deuterium exchange, covalent labeling, and chemical cross-linking, enable the characterization of protein structure, dynamics, and interactions, especially from a perspective of their data analyses. Structural information encoded by chemical probes in intact proteins is decoded by interpreting MS data at a peptide level, i.e., revealing conformational and dynamic changes in local regions of proteins. The structural MS data are not amenable to data analyses in traditional proteomics workflow, requiring dedicated software for each type of data. We first provide basic principles of data interpretation, including isotopic distribution and peptide sequencing. We then focus particularly on computational methods for structural MS data analyses and discuss outstanding challenges in a proteome-wide large scale analysis.
Affiliation(s)
- Seungjin Na
  - Dept. of Computer Science, Hanyang University, Seoul 04763, Republic of Korea
- Eunok Paek
  - Dept. of Computer Science, Hanyang University, Seoul 04763, Republic of Korea
18
|
Chang HY, Kong AT, da Veiga Leprevost F, Avtonomov DM, Haynes SE, Nesvizhskii AI. Crystal-C: A Computational Tool for Refinement of Open Search Results. J Proteome Res 2020; 19:2511-2515. [PMID: 32338005 DOI: 10.1021/acs.jproteome.0c00119] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 01/16/2023]
Abstract
Shotgun proteomics using liquid chromatography coupled to mass spectrometry (LC-MS) is commonly used to identify peptides containing post-translational modifications. With the emergence of fast database search tools such as MSFragger, the approach of enlarging precursor mass tolerances during the search (termed "open search") has been increasingly used for comprehensive characterization of post-translational and chemical modifications of protein samples. However, not all mass shifts detected using the open search strategy represent true modifications, as artifacts exist from sources such as unaccounted missed cleavages or peptide co-fragmentation (chimeric MS/MS spectra). Here, we present Crystal-C, a computational tool that detects and removes such artifacts from open search results. Our analysis using Crystal-C shows that, in a typical shotgun proteomics data set, the number of such observations is relatively small. Nevertheless, removing these artifacts helps to simplify the interpretation of the mass shift histograms, which in turn should improve the ability of open search-based tools to detect potentially interesting mass shifts for follow-up investigation.
Affiliation(s)
- Hui-Yin Chang
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
- Andy T Kong
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
- Dmitry M Avtonomov
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
- Sarah E Haynes
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
- Alexey I Nesvizhskii
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
19
|
Kote S, Pirog A, Bedran G, Alfaro J, Dapic I. Mass Spectrometry-Based Identification of MHC-Associated Peptides. Cancers (Basel) 2020; 12:cancers12030535. [PMID: 32110973 PMCID: PMC7139412 DOI: 10.3390/cancers12030535] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 01/30/2020] [Revised: 02/15/2020] [Accepted: 02/20/2020] [Indexed: 02/06/2023] Open
Abstract
Neoantigen-based immunotherapies promise to improve patient outcomes over the current standard of care. However, detecting these cancer-specific antigens is one of the significant challenges in the field of mass spectrometry. Even though the first sequencing of the immunopeptides was done decades ago, today there is still a diversity of the protocols used for neoantigen isolation from the cell surface. This heterogeneity makes it difficult to compare results between the laboratories and the studies. Isolation of the neoantigens from the cell surface is usually done by mild acid elution (MAE) or immunoprecipitation (IP) protocol. However, limited amounts of the neoantigens present on the cell surface impose a challenge and require instrumentation with enough sensitivity and accuracy for their detection. Detecting these neopeptides from small amounts of available patient tissue limits the scope of most of the studies to cell cultures. Here, we summarize protocols for the extraction and identification of the major histocompatibility complex (MHC) class I and II peptides. We aimed to evaluate existing methods in terms of the appropriateness of the isolation procedure, as well as instrumental parameters used for neoantigen detection. We also focus on the amount of the material used in the protocols as the critical factor to consider when analyzing neoantigens. Beyond experimental aspects, there are numerous readily available proteomics suites/tools applicable for neoantigen discovery; however, experimental validation is still necessary for neoantigen characterization.
20
|
Bittremieux W, Laukens K, Noble WS. Extremely Fast and Accurate Open Modification Spectral Library Searching of High-Resolution Mass Spectra Using Feature Hashing and Graphics Processing Units. J Proteome Res 2019; 18:3792-3799. [PMID: 31448616 PMCID: PMC6886738 DOI: 10.1021/acs.jproteome.9b00291] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 12/20/2022]
Abstract
Open modification searching (OMS) is a powerful search strategy to identify peptides with any type of modification. OMS works by using a very wide precursor mass window to allow modified spectra to match against their unmodified variants, after which the modification types can be inferred from the corresponding precursor mass differences. A disadvantage of this strategy, however, is the large computational cost, because each query spectrum has to be compared against a multitude of candidate peptides. We have previously introduced the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. Here we demonstrate how this candidate selection procedure can be further optimized using graphics processing units. Additionally, we introduce a feature hashing scheme to convert high-resolution spectra to low-dimensional vectors. On the basis of these algorithmic advances, along with low-level code optimizations, the new version of ANN-SoLo is up to an order of magnitude faster than its initial version. This makes it possible to efficiently perform open searches on a large scale to gain a deeper understanding about the protein modification landscape. We demonstrate the computational efficiency and identification performance of ANN-SoLo based on a large data set of the draft human proteome. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo .
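Two ideas from this abstract, inferring a modification mass from the precursor mass difference and hashing a high-resolution spectrum into a low-dimensional vector, can be sketched as follows. This is an illustrative sketch only and not the ANN-SoLo implementation. The function names, the 0.05 m/z bin width, the 800-dimensional vector, and the use of MD5 as the hash function are all assumptions made for the example.

```python
import hashlib
import math


def precursor_mass_difference(query_mz, library_mz, charge):
    """Mass difference (Da) implied by two precursor m/z values at one charge."""
    return (query_mz - library_mz) * charge


def hash_spectrum(peaks, bin_width=0.05, dim=800):
    """Hash (m/z, intensity) peaks into a unit-length vector of length `dim`.

    Each peak is first assigned to a narrow m/z bin; the bin index is then
    hashed to one of `dim` slots, so the vector stays small even though the
    number of possible bins is very large. Colliding bins simply add up.
    """
    vector = [0.0] * dim
    for mz, intensity in peaks:
        bin_index = int(mz / bin_width)
        digest = hashlib.md5(str(bin_index).encode("ascii")).digest()
        slot = int.from_bytes(digest[:8], "big") % dim
        vector[slot] += intensity
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return vector  # empty spectrum: leave as all zeros
    return [v / norm for v in vector]


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy usage: a spectrum compared with itself scores (numerically) 1.
spectrum = [(100.02, 10.0), (250.50, 5.0), (731.41, 2.5)]
hashed = hash_spectrum(spectrum)
self_similarity = dot(hashed, hashed)
delta_mass = precursor_mass_difference(510.0, 500.0, charge=2)
```

The hashed vectors could then be passed to any nearest-neighbour index to shortlist library candidates before a full spectrum comparison, which is the role the abstract assigns to candidate selection.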
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
21
|
Na S, Kim J, Paek E. MODplus: Robust and Unrestrictive Identification of Post-Translational Modifications Using Mass Spectrometry. Anal Chem 2019; 91:11324-11333. [PMID: 31365238 DOI: 10.1021/acs.analchem.9b02445] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 01/02/2023]
Abstract
Post-translational modifications regulate various cellular processes and are of great biological interest. Unrestrictive searches of mass spectrometry data enable the detection of any type of modification. Here we propose MODplus, which makes practical unrestrictive searches possible by allowing (1) hundreds of modifications, (2) multiple modifications per peptide, (3) the whole proteome database, and (4) any tolerant values in search parameters. The utility of MODplus was demonstrated in large human data sets of HEK293 cells and TMT-labeled phosphorylation enrichment. Notably, MODplus supports identifying different modification types at multiple sites and reports real chemical and biological modifications, as it has been very labor intensive to link unrestrictive search results to real modifications. We also confirmed the presence of Missing Precursor (MP) spectra that were not identifiable using targeted precursor masses. The MP spectra mostly resulted in identifications of wrong modifications and negatively affected the overall performance, often by as much as 10%. MODplus can rapidly recognize MP spectra and correct their identifications, resulting in increased identification rate up to 70% in the HEK293 data set as well as improved reliability.
Affiliation(s)
- Seungjin Na
  - Department of Computer Science, Hanyang University, Seoul 04763, South Korea
- Jihyung Kim
  - Department of Computer Science, Hanyang University, Seoul 04763, South Korea
- Eunok Paek
  - Department of Computer Science, Hanyang University, Seoul 04763, South Korea
22
Nunes J, Charneira C, Nunes C, Gouveia-Fernandes S, Serpa J, Morello J, Antunes AMM. A Metabolomics-Inspired Strategy for the Identification of Protein Covalent Modifications. Front Chem 2019; 7:532. [PMID: 31417895 PMCID: PMC6684772 DOI: 10.3389/fchem.2019.00532] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/30/2019] [Accepted: 07/11/2019] [Indexed: 11/13/2022] Open
Abstract
Identification of protein covalent modifications (adducts) is a challenging task mainly due to the lack of data processing approaches for adductomics studies. Despite the huge technological advances in mass spectrometry (MS) instrumentation and bioinformatics tools for proteomics studies, these methodologies have very limited success in the identification of low-abundance protein adducts. Herein we report a novel strategy inspired by metabolomics workflows for the identification of covalently-modified peptides that consists of LC-MS data preprocessing followed by statistical analysis. The usefulness of this strategy was evaluated using experimental LC-MS data of histones isolated from HepG2 and THLE2 cells exposed to the chemical carcinogen glycidamide. LC-MS data were preprocessed using the open-source software MZmine and potential adducts were selected based on the m/z increments corresponding to glycidamide incorporation. Then, statistical analysis was applied to reveal the potential adducts as those ions that are differentially present in cells exposed and not exposed to glycidamide. The results were compared with those obtained with the standard proteomics methodology, which relies on producing comprehensive MS/MS data by data-dependent acquisition and analysis with proteomics data search engines. Our novel strategy was able to differentiate HepG2 and THLE2 and to identify adducts that were not detected by the standard methodology of adductomics. Thus, this metabolomics-driven approach in adductomics will not only open new opportunities for the identification of protein epigenetic modifications, but also adducts formed by endogenous and exogenous exposure to chemical agents.
Affiliation(s)
- João Nunes
  - Centro de Química Estrutural, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
- Catarina Charneira
  - Centro de Química Estrutural, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
- Carolina Nunes
  - CEDOC, Chronic Diseases Research Centre, Faculdade de Ciências Médicas, NOVA Medical School, Universidade NOVA de Lisboa, Lisbon, Portugal
  - Unidade de Investigação em Patobiologia Molecular do Instituto Português de Oncologia de Lisboa Francisco Gentil, Lisbon, Portugal
- Sofia Gouveia-Fernandes
  - CEDOC, Chronic Diseases Research Centre, Faculdade de Ciências Médicas, NOVA Medical School, Universidade NOVA de Lisboa, Lisbon, Portugal
  - Unidade de Investigação em Patobiologia Molecular do Instituto Português de Oncologia de Lisboa Francisco Gentil, Lisbon, Portugal
- Jacinta Serpa
  - CEDOC, Chronic Diseases Research Centre, Faculdade de Ciências Médicas, NOVA Medical School, Universidade NOVA de Lisboa, Lisbon, Portugal
  - Unidade de Investigação em Patobiologia Molecular do Instituto Português de Oncologia de Lisboa Francisco Gentil, Lisbon, Portugal
- Judit Morello
  - Centro de Química Estrutural, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
- Alexandra M M Antunes
  - Centro de Química Estrutural, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
23
Devabhaktuni A, Lin S, Zhang L, Swaminathan K, Gonzalez CG, Olsson N, Pearlman SM, Rawson K, Elias JE. TagGraph reveals vast protein modification landscapes from large tandem mass spectrometry datasets. Nat Biotechnol 2019; 37:469-479. [PMID: 30936560 PMCID: PMC6447449 DOI: 10.1038/s41587-019-0067-5] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Received: 06/13/2016] [Accepted: 02/12/2019] [Indexed: 02/06/2023]
Abstract
Although mass spectrometry is well suited to identifying thousands of potential protein post-translational modifications (PTMs), it has historically been biased towards just a few. To measure the entire set of PTMs across diverse proteomes, software must overcome the dual challenges of covering enormous search spaces and distinguishing correct from incorrect spectrum interpretations. Here, we describe TagGraph, a computational tool that overcomes both challenges with an unrestricted string-based search method that is as much as 350-fold faster than existing approaches, and a probabilistic validation model that we optimized for PTM assignments. We applied TagGraph to a published human proteomic dataset of 25 million mass spectra and tripled confident spectrum identifications compared to its original analysis. We identified thousands of modification types on almost 1 million sites in the proteome. We show alternative contexts for highly abundant yet understudied PTMs such as proline hydroxylation, and its unexpected association with cancer mutations. By enabling broad characterization of PTMs, TagGraph informs as to how their functions and regulation intersect.
Affiliation(s)
- Arun Devabhaktuni
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Sarah Lin
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Lichao Zhang
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Kavya Swaminathan
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Carlos G Gonzalez
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Niclas Olsson
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Samuel M Pearlman
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Keith Rawson
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
- Joshua E Elias
  - Department of Chemical and Systems Biology, Stanford School of Medicine, Stanford University, Stanford, CA, USA
24
An Z, Zhai L, Ying W, Qian X, Gong F, Tan M, Fu Y. PTMiner: Localization and Quality Control of Protein Modifications Detected in an Open Search and Its Application to Comprehensive Post-translational Modification Characterization in Human Proteome. Mol Cell Proteomics 2019; 18:391-405. [PMID: 30420486 PMCID: PMC6356076 DOI: 10.1074/mcp.ra118.000812] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 04/21/2018] [Revised: 11/02/2018] [Indexed: 12/27/2022] Open
Abstract
The open (mass tolerant) search of tandem mass spectra of peptides shows great potential in the comprehensive detection of post-translational modifications (PTMs) in shotgun proteomics. However, this search strategy has not been widely used by the community, and one bottleneck is the lack of appropriate algorithms for automated and reliable post-processing of the coarse and error-prone search results. Here we present PTMiner, a software tool for confident filtering and localization of modifications (mass shifts) detected in an open search. After mass-shift-grouped false discovery rate (FDR) control of peptide-spectrum matches (PSMs), PTMiner uses an empirical Bayesian method to localize modifications through iterative learning of the prior probabilities of each type of modification occurring on different amino acids. The performance of PTMiner was evaluated on three data sets, including simulated data, chemically synthesized peptide library data and modified-peptide spiked-in proteome data. The results showed that PTMiner can effectively control the PSM FDR and accurately localize the modification sites. At 1% real false localization rate (FLR), PTMiner localized 93%, 84%, and 83% of the modification sites in the three data sets, respectively, far higher than two open search engines we used and an extended version of the Ascore localization algorithm. We then used PTMiner to analyze a draft map of human proteome containing 25 million spectra from 30 tissues, and confidently identified over 1.7 million modified PSMs at 1% FDR and 1% FLR, which provided a system-wide view of both known and unknown PTMs in the human proteome.
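To make the phrase "mass-shift-grouped FDR control" more concrete, here is a minimal sketch of a generic target-decoy filter applied separately within each mass-shift bin. It is an assumption-laden illustration, not PTMiner's actual procedure: the PSM dictionary keys, the 0.01 Da bin width, and the simple decoys/targets estimate are all hypothetical choices.

```python
from collections import defaultdict


def grouped_fdr_filter(psms, fdr_threshold=0.01, bin_width=0.01):
    """Keep target PSMs that pass a target-decoy FDR cut within each mass-shift bin.

    psms: list of dicts with keys 'score' (higher is better),
    'mass_shift' (Da) and 'is_decoy' (bool). These keys are illustrative.
    """
    # Group PSMs by binned mass shift so that each modification type
    # gets its own score threshold.
    groups = defaultdict(list)
    for psm in psms:
        groups[round(psm["mass_shift"] / bin_width)].append(psm)

    accepted = []
    for group in groups.values():
        group.sort(key=lambda p: p["score"], reverse=True)
        targets = decoys = 0
        cutoff = 0  # number of top-ranked PSMs to keep in this group
        for rank, psm in enumerate(group, start=1):
            if psm["is_decoy"]:
                decoys += 1
            else:
                targets += 1
            # Estimated FDR at this rank is decoys / targets.
            if targets and decoys / targets <= fdr_threshold:
                cutoff = rank
        accepted.extend(p for p in group[:cutoff] if not p["is_decoy"])
    return accepted
```

Grouping matters because rare mass shifts tend to have a higher proportion of false matches than common ones, so a single global threshold would be too lenient for them.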
Affiliation(s)
- Zhiwu An
  - National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Linhui Zhai
  - State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Wantao Ying
  - State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, Beijing 102206, China
  - Beijing Institute of Lifeomics, Beijing 100850, China
- Xiaohong Qian
  - State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, Beijing 102206, China
  - Beijing Institute of Lifeomics, Beijing 100850, China
- Fuzhou Gong
  - National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Minjia Tan
  - State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Yan Fu
  - National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
25
Bittremieux W, Meysman P, Noble WS, Laukens K. Fast Open Modification Spectral Library Searching through Approximate Nearest Neighbor Indexing. J Proteome Res 2018; 17:3463-3474. [PMID: 30184435 PMCID: PMC6173621 DOI: 10.1021/acs.jproteome.8b00359] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 12/21/2022]
Abstract
Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification by allowing a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. A drawback of this strategy, however, is that it leads to a large increase in search time. Although performing an open search can be done using existing spectral library search engines by simply setting a wide precursor mass window, none of these tools have been optimized for OMS, leading to excessive runtimes and suboptimal identification results. We present the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. This approach is combined with a cascade search strategy to maximize the number of identified unmodified and modified spectra while strictly controlling the false discovery rate, as well as a shifted dot product score to sensitively match modified spectra to their unmodified counterparts. ANN-SoLo achieves state-of-the-art performance in terms of speed and the number of identifications. On a previously published human cell line data set, ANN-SoLo confidently identifies more spectra than SpectraST or MSFragger and achieves a speedup of an order of magnitude compared with SpectraST. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo.
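The shifted dot product mentioned above can be sketched in a few lines. The version below is a simplified illustration rather than ANN-SoLo's implementation: it assumes singly charged fragments, uses a hypothetical 0.02 m/z tolerance, and pairs peaks greedily. A query peak may match a library peak either at the same m/z or offset by the precursor mass difference.

```python
import math


def shifted_dot_product(query, library, precursor_delta, tol=0.02):
    """Greedy shifted dot product between two spectra.

    query, library: lists of (mz, intensity) tuples.
    precursor_delta: query precursor mass minus library precursor mass
    (fragments assumed singly charged, an illustrative simplification).
    """
    def normalize(peaks):
        # Scale intensities to unit Euclidean norm (guard against empty spectra).
        total = math.sqrt(sum(i * i for _, i in peaks)) or 1.0
        return [(mz, i / total) for mz, i in peaks]

    q, lib = normalize(query), normalize(library)

    # Collect every peak pair that matches directly or after the shift.
    candidates = []
    for qi, (qmz, qint) in enumerate(q):
        for li, (lmz, lint) in enumerate(lib):
            diff = qmz - lmz
            if abs(diff) <= tol or abs(diff - precursor_delta) <= tol:
                candidates.append((qint * lint, qi, li))

    # Greedily use the highest-scoring pairs, each peak at most once.
    candidates.sort(reverse=True)
    used_q, used_l = set(), set()
    score = 0.0
    for product, qi, li in candidates:
        if qi not in used_q and li not in used_l:
            used_q.add(qi)
            used_l.add(li)
            score += product
    return score
```

With a shift of zero this reduces to an ordinary dot product, so fragments that carry the modification contribute nothing. Allowing the shift recovers them.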
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Pieter Meysman
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
26
Ludwig C, Gillet L, Rosenberger G, Amon S, Collins BC, Aebersold R. Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial. Mol Syst Biol 2018; 14:e8126. [PMID: 30104418 PMCID: PMC6088389 DOI: 10.15252/msb.20178126] [Citation(s) in RCA: 685] [Impact Index Per Article: 97.9] [Received: 11/28/2017] [Revised: 05/11/2018] [Accepted: 05/15/2018] [Indexed: 01/16/2023] Open
Abstract
Many research questions in fields such as personalized medicine, drug screens or systems biology depend on obtaining consistent and quantitatively accurate proteomics data from many samples. SWATH-MS is a specific variant of data-independent acquisition (DIA) methods and is emerging as a technology that combines deep proteome coverage capabilities with quantitative consistency and accuracy. In a SWATH-MS measurement, all ionized peptides of a given sample that fall within a specified mass range are fragmented in a systematic and unbiased fashion using rather large precursor isolation windows. To analyse SWATH-MS data, a strategy based on peptide-centric scoring has been established, which typically requires prior knowledge about the chromatographic and mass spectrometric behaviour of peptides of interest in the form of spectral libraries and peptide query parameters. This tutorial provides guidelines on how to set up and plan a SWATH-MS experiment, how to perform the mass spectrometric measurement and how to analyse SWATH-MS data using peptide-centric scoring. Furthermore, concepts on how to improve SWATH-MS data acquisition, potential trade-offs of parameter settings and alternative data analysis strategies are discussed.
Affiliation(s)
- Christina Ludwig
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technical University of Munich (TUM), Freising, Germany
- Ludovic Gillet
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- George Rosenberger
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Sabine Amon
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ben C Collins
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
27
Van Bael S, Zels S, Boonen K, Beets I, Schoofs L, Temmerman L. A Caenorhabditis elegans Mass Spectrometric Resource for Neuropeptidomics. Journal of the American Society for Mass Spectrometry 2018; 29:879-889. [PMID: 29299835 DOI: 10.1007/s13361-017-1856-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 08/22/2017] [Revised: 11/13/2017] [Accepted: 11/19/2017] [Indexed: 06/07/2023]
Abstract
Neuropeptides are important signaling molecules used by nervous systems to mediate and fine-tune neuronal communication. They can function as neurotransmitters or neuromodulators in neural circuits, or they can be released as neurohormones to target distant cells and tissues. Neuropeptides are typically cleaved from larger precursor proteins by the action of proteases and can be the subject of post-translational modifications. The short, mature neuropeptide sequences often entail the only evolutionarily reasonably conserved regions in these precursor proteins. Therefore, it is particularly challenging to predict all putative bioactive peptides through in silico mining of neuropeptide precursor sequences. Peptidomics is an approach that allows de novo characterization of peptides extracted from body fluids, cells, tissues, organs, or whole-body preparations. Mass spectrometry, often combined with on-line liquid chromatography, is a hallmark technique used in peptidomics research. Here, we used an acidified methanol extraction procedure and a quadrupole-Orbitrap LC-MS/MS pipeline to analyze the neuropeptidome of Caenorhabditis elegans. We identified an unprecedented number of 203 mature neuropeptides from C. elegans whole-body extracts, including 35 peptides from known, hypothetical, as well as from completely novel neuropeptide precursor proteins that have not been predicted in silico. This set of biochemically verified peptide sequences provides the most elaborate C. elegans reference neuropeptidome so far. To exploit this resource to the fullest, we make our in-house database of known and predicted neuropeptides available to the community as a valuable resource. We are providing these collective data to help the community progress, amongst others, by supporting future differential and/or functional studies.
Affiliation(s)
- Sven Van Bael
  - Animal Physiology and Neurobiology, Department of Biology, KU Leuven (University of Leuven), Leuven, Belgium
- Sven Zels
  - Animal Physiology and Neurobiology, Department of Biology, KU Leuven (University of Leuven), Leuven, Belgium
- Kurt Boonen
  - Animal Physiology and Neurobiology, Department of Biology, KU Leuven (University of Leuven), Leuven, Belgium
- Isabel Beets
  - Animal Physiology and Neurobiology, Department of Biology, KU Leuven (University of Leuven), Leuven, Belgium
- Liliane Schoofs
  - Animal Physiology and Neurobiology, Department of Biology, KU Leuven (University of Leuven), Leuven, Belgium
- Liesbet Temmerman
  - Animal Physiology and Neurobiology, Department of Biology, KU Leuven (University of Leuven), Leuven, Belgium
28
Dorl S, Winkler S, Mechtler K, Dorfer V. PhoStar: Identifying Tandem Mass Spectra of Phosphorylated Peptides before Database Search. J Proteome Res 2017; 17:290-295. [PMID: 29057658 DOI: 10.1021/acs.jproteome.7b00563] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 01/17/2023]
Abstract
Standard proteomics workflows use tandem mass spectrometry followed by sequence database search to analyze complex biological samples. The identification of proteins carrying post-translational modifications, for example, phosphorylation, is typically addressed by allowing variable modifications in the searched sequences. Accounting for these variations exponentially increases the combinatorial space in the database, which leads to increased processing times and more false positive identifications. The here-presented tool PhoStar identifies spectra that originate from phosphorylated peptides before database search using a supervised machine learning approach. The model for the prediction of phosphorylation was trained and validated with an accuracy of 97.6% on a large set of high-confidence spectra collected from publicly available experimental data. Its power was further validated by predicting phosphorylation in the complete NIST human and mouse high collision-dissociation spectral libraries, achieving an accuracy of 98.2 and 97.9%, respectively. We demonstrate the application of PhoStar by using it for spectra filtering before database search. In database search of HeLa samples the peptide search space was reduced by 27-66% while finding at least 97% of total peptide identifications (at 1% FDR) compared with a standard workflow.
Affiliation(s)
- Sebastian Dorl
  - University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
- Stephan Winkler
  - University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
- Karl Mechtler
  - Research Institute of Molecular Pathology (IMP), Protein Chemistry, Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
  - Institute of Molecular Biotechnology (IMBA), Protein Chemistry, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Viktoria Dorfer
  - University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
29
David M, Fertin G, Rogniaux H, Tessier D. SpecOMS: A Full Open Modification Search Method Performing All-to-All Spectra Comparisons within Minutes. J Proteome Res 2017; 16:3030-3038. [PMID: 28660767 DOI: 10.1021/acs.jproteome.7b00308] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 12/30/2022]
Abstract
The analysis of discovery proteomics experiments relies on algorithms that identify peptides from their tandem mass spectra. The almost exhaustive interpretation of these spectra remains an unresolved issue. At present, an important number of missing interpretations is probably due to peptides displaying post-translational modifications and variants that yield spectra that are particularly difficult to interpret. However, the emergence of a new generation of mass spectrometers that provide high fragment ion accuracy has paved the way for more efficient algorithms. We present a new software, SpecOMS, that can handle the computational complexity of pairwise comparisons of spectra in the context of large volumes. SpecOMS can compare a whole set of experimental spectra generated by a discovery proteomics experiment to a whole set of theoretical spectra deduced from a protein database in a few minutes on a standard workstation. SpecOMS can ingeniously exploit those capabilities to improve the peptide identification process, allowing strong competition between all possible peptides for spectrum interpretation. Remarkably, this software resolves the drawbacks (i.e., efficiency problems and decreased sensitivity) that usually accompany open modification searches. We highlight this promising approach using results obtained from the analysis of a public human data set downloaded from the PRIDE (PRoteomics IDEntification) database.
Affiliation(s)
- Matthieu David
  - LS2N UMR CNRS 6004, Université de Nantes, F-44300 Nantes, France
  - INRA UR1268 Biopolymères Interactions Assemblages, F-44316 Nantes, France
- Guillaume Fertin
  - LS2N UMR CNRS 6004, Université de Nantes, F-44300 Nantes, France
- Hélène Rogniaux
  - INRA UR1268 Biopolymères Interactions Assemblages, F-44316 Nantes, France
- Dominique Tessier
  - INRA UR1268 Biopolymères Interactions Assemblages, F-44316 Nantes, France
30
Hu H, Khatri K, Zaia J. Algorithms and design strategies towards automated glycoproteomics analysis. Mass Spectrometry Reviews 2017; 36:475-498. [PMID: 26728195 PMCID: PMC4931994 DOI: 10.1002/mas.21487] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 07/10/2015] [Accepted: 11/30/2015] [Indexed: 05/09/2023]
Abstract
Glycoproteomics involves the study of glycosylation events on protein sequences ranging from purified proteins to whole proteome scales. Understanding these complex post-translational modification (PTM) events requires elucidation of the glycan moieties (monosaccharide sequences and glycosidic linkages between residues), protein sequences, as well as site-specific attachment of glycan moieties onto protein sequences, in a spatial and temporal manner in a variety of biological contexts. Compared with proteomics, bioinformatics for glycoproteomics is immature and many researchers still rely on tedious manual interpretation of glycoproteomics data. As sample preparation protocols and analysis techniques have matured, the number of publications on glycoproteomics and bioinformatics has increased substantially; however, the lack of consensus on tool development and code reuse limits the dissemination of bioinformatics tools because it requires significant effort to migrate a computational tool tailored for one method design to alternative methods. This review discusses algorithms and methods in glycoproteomics, and refers to the general proteomics field for potential solutions. It also introduces general strategies for tool integration and pipeline construction in order to better serve the glycoproteomics community. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:475-498, 2017.
Affiliation(s)
- Han Hu
  - Bioinformatics Program, Boston University, Boston, Massachusetts 02215, USA
  - Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
- Kshitij Khatri
  - Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
- Joseph Zaia
  - Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
31
Rosenberger G, Liu Y, Röst HL, Ludwig C, Buil A, Bensimon A, Soste M, Spector TD, Dermitzakis ET, Collins BC, Malmström L, Aebersold R. Inference and quantification of peptidoforms in large sample cohorts by SWATH-MS. Nat Biotechnol 2017; 35:781-788. [PMID: 28604659 PMCID: PMC5593115 DOI: 10.1038/nbt.3908] [Citation(s) in RCA: 88] [Impact Index Per Article: 11.0] [Received: 11/15/2016] [Accepted: 05/22/2017] [Indexed: 01/01/2023]
Abstract
Consistent detection and quantification of protein post-translational modifications (PTMs) across sample cohorts is a prerequisite for functional analysis of biological processes. Data-independent acquisition (DIA) is a bottom-up mass spectrometry approach that provides complete information on precursor and fragment ions. However, owing to the convoluted structure of DIA data sets, confident, systematic identification and quantification of peptidoforms has remained challenging. Here, we present inference of peptidoforms (IPF), a fully automated algorithm that uses spectral libraries to query, validate and quantify peptidoforms in DIA data sets. The method was developed on data acquired by the DIA method SWATH-MS and benchmarked using a synthetic phosphopeptide reference data set and phosphopeptide-enriched samples. IPF reduced false site-localization by more than sevenfold compared with previous approaches, while recovering 85.4% of the true signals. Using IPF, we quantified peptidoforms in DIA data acquired from >200 samples of blood plasma of a human twin cohort and assessed the contribution of heritable, environmental and longitudinal effects on their PTMs.
Affiliation(s)
- George Rosenberger
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, Switzerland
- Yansheng Liu
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Hannes L Röst
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Department of Genetics, Stanford University, Stanford, California, USA
- Christina Ludwig
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Bavarian Biomolecular Mass Spectrometry Center (BayBioMS), Technical University Munich, Freising, Germany
- Alfonso Buil
  - Research Institute of Biological Psychiatry, Mental Health Center Sct. Hans, Roskilde, Denmark
- Ariel Bensimon
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Martin Soste
  - Department of Biology, Institute of Biochemistry, ETH Zurich, Zurich, Switzerland
- Tim D Spector
  - Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas' Hospital Campus, London, UK
- Emmanouil T Dermitzakis
  - Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Ben C Collins
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Lars Malmström
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - S3IT, University of Zurich, Zurich, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
32
Kawai T. Recent Studies on Online Sample Preconcentration Methods in Capillary Electrophoresis Coupled with Mass Spectrometry. Chromatography 2017. [DOI: 10.15583/jpchrom.2017.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 11/20/2022]
Affiliation(s)
- Takayuki Kawai
  - Quantitative Biology Center, RIKEN
  - Japan Science and Technology Agency, PRESTO
  - Graduate School of Frontier Biosciences, Osaka University
33
Willems S, Dhaenens M, Govaert E, De Clerck L, Meert P, Van Neste C, Van Nieuwerburgh F, Deforce D. Flagging False Positives Following Untargeted LC–MS Characterization of Histone Post-Translational Modification Combinations. J Proteome Res 2016; 16:655-664. [DOI: 10.1021/acs.jproteome.6b00724] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 02/04/2023]
Affiliation(s)
- Sander Willems
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Maarten Dhaenens
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Elisabeth Govaert
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Laura De Clerck
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Paulien Meert
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Christophe Van Neste
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, 9052, Belgium
- Center for Medical Genetics Ghent, Ghent University, Ghent, 9000, Belgium
| | | | - Dieter Deforce
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| |
|
34
|
Zhang J, Yang MK, Zeng H, Ge F. GAPP: A Proteogenomic Software for Genome Annotation and Global Profiling of Post-translational Modifications in Prokaryotes. Mol Cell Proteomics 2016; 15:3529-3539. [PMID: 27630248 DOI: 10.1074/mcp.m116.060046] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/07/2016] [Indexed: 11/06/2022] Open
Abstract
Although the number of sequenced prokaryotic genomes is growing rapidly, experimentally verified annotation of prokaryotic genome remains patchy and challenging. To facilitate genome annotation efforts for prokaryotes, we developed an open source software called GAPP for genome annotation and global profiling of post-translational modifications (PTMs) in prokaryotes. With a single command, it provides a standard workflow to validate and refine predicted genetic models and discover diverse PTM events. We demonstrated the utility of GAPP using proteomic data from Helicobacter pylori, one of the major human pathogens that is responsible for many gastric diseases. Our results confirmed 84.9% of the existing predicted H. pylori proteins, identified 20 novel protein coding genes, and corrected four existing gene models with regard to translation initiation sites. In particular, GAPP revealed a large repertoire of PTMs using the same proteomic data and provided a rich resource that can be used to examine the functions of reversible modifications in this human pathogen. This software is a powerful tool for genome annotation and global discovery of PTMs and is applicable to any sequenced prokaryotic organism; we expect that it will become an integral part of ongoing genome annotation efforts for prokaryotes. GAPP is freely available at https://sourceforge.net/projects/gappproteogenomic/.
Affiliation(s)
- Jia Zhang
- From the ‡Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
| | - Ming-Kun Yang
- From the ‡Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
| | - Honghui Zeng
- §Wuhan Branch, Supercomputing Center, Chinese Academy of Sciences, China
| | - Feng Ge
- From the ‡Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China; §Wuhan Branch, Supercomputing Center, Chinese Academy of Sciences, China
| |
|
35
|
Na S, Payne SH, Bandeira N. Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks. Mol Cell Proteomics 2016; 15:3501-3512. [PMID: 27609420 PMCID: PMC5098046 DOI: 10.1074/mcp.o116.060913] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/10/2016] [Indexed: 11/25/2022] Open
Abstract
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
Affiliation(s)
- Seungjin Na
- From the ‡Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093; §Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
| | - Samuel H Payne
- ¶Pacific Northwest National Laboratory, Richland, Washington 99354
| | - Nuno Bandeira
- From the ‡Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093; §Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093; ‖Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, 92093
| |
|
36
|
Trenchevska O, Nelson RW, Nedelkov D. Mass Spectrometric Immunoassays in Characterization of Clinically Significant Proteoforms. Proteomes 2016; 4:proteomes4010013. [PMID: 28248223 PMCID: PMC5217360 DOI: 10.3390/proteomes4010013] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 01/05/2016] [Revised: 03/10/2016] [Accepted: 03/14/2016] [Indexed: 02/07/2023] Open
Abstract
Proteins can exist as multiple proteoforms in vivo, as a result of alternative splicing and single-nucleotide polymorphisms (SNPs), as well as posttranslational processing. To address their clinical significance in a context of diagnostic information, proteoforms require a more in-depth analysis. Mass spectrometric immunoassays (MSIA) have been devised for studying structural diversity in human proteins. MSIA enables protein profiling in a simple and high-throughput manner, by combining the selectivity of targeted immunoassays, with the specificity of mass spectrometric detection. MSIA has been used for qualitative and quantitative analysis of single and multiple proteoforms, distinguishing between normal fluctuations and changes related to clinical conditions. This mini review offers an overview of the development and application of mass spectrometric immunoassays for clinical and population proteomics studies. Provided are examples of some recent developments, and also discussed are the trends and challenges in mass spectrometry-based immunoassays for the next-phase of clinical applications.
Affiliation(s)
- Olgica Trenchevska
- The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA.
| | - Randall W Nelson
- The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA.
| | - Dobrin Nedelkov
- The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA.
| |
|
37
|
Abstract
Mass spectrometry-based proteomics provides a powerful tool for large-scale analysis of protein modifications. Statistical and computational analysis of mass spectrometry data is a key step in protein modification identification. This chapter presents common and advanced data analysis strategies for modification identification, including variable modification search, unrestrictive approaches for modification discovery, false discovery rate estimation and control methods, and tools for modification site localization.
Affiliation(s)
- Yan Fu
- National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Zhongguancun East Road 55, Beijing, 100190, China.
| |
|
38
|
Horlacher O, Lisacek F, Müller M. Mining Large Scale Tandem Mass Spectrometry Data for Protein Modifications Using Spectral Libraries. J Proteome Res 2015; 15:721-31. [PMID: 26653734 DOI: 10.1021/acs.jproteome.5b00877] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 11/28/2022]
Abstract
Experimental improvements in post-translational modification (PTM) detection by tandem mass spectrometry (MS/MS) have allowed the identification of vast numbers of PTMs. Open modification searches (OMSs) of MS/MS data, which do not require prior knowledge of the modifications present in the sample, have further increased the diversity of detected PTMs. Despite much effort, there is still a lack of functional annotation of PTMs. One possibility to narrow the annotation gap is to mine MS/MS data deposited in public repositories and to correlate the PTM presence with biological meta-information attached to the data. Since the data volume can be quite substantial and contain tens of millions of MS/MS spectra, the data mining tools must be able to cope with big data. Here, we present two tools, Liberator and MzMod, which are built using the MzJava class library and the Apache Spark large scale computing framework. Liberator builds large MS/MS spectrum libraries, and MzMod searches them in an OMS mode. We applied these tools to a recently published set of 25 million spectra from 30 human tissues and present tissue specific PTMs. We also compared the results to the ones obtained with the OMS tool MODa and the search engine X!Tandem.
Affiliation(s)
- Oliver Horlacher
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland; Centre Universitaire de Bioinformatique, University of Geneva, Geneva 1211, Switzerland
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland; Centre Universitaire de Bioinformatique, University of Geneva, Geneva 1211, Switzerland
| | - Markus Müller
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva 1211, Switzerland; Centre Universitaire de Bioinformatique, University of Geneva, Geneva 1211, Switzerland
| |
|
39
|
Computational and statistical methods for high-throughput analysis of post-translational modifications of proteins. J Proteomics 2015. [PMID: 26216596 DOI: 10.1016/j.jprot.2015.07.016] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 12/16/2022]
Abstract
The investigation of post-translational modifications (PTMs) represents one of the main research focuses for the study of protein function and cell signaling. Mass spectrometry instrumentation with increasing sensitivity, improved protocols for PTM enrichment, and recently established pipelines for high-throughput experiments allow large-scale identification and quantification of several PTM types. This review addresses the concurrently emerging challenges for the computational analysis of the resulting data and presents PTM-centered approaches for spectra identification, statistical analysis, multivariate analysis and data interpretation. We furthermore discuss the potential of future developments that will help to gain deep insight into the PTM-ome and its biological role in cells. This article is part of a Special Issue entitled: Computational Proteomics.
|
40
|
Newman RH, Zhang J, Zhu H. Toward a systems-level view of dynamic phosphorylation networks. Front Genet 2014; 5:263. [PMID: 25177341 PMCID: PMC4133750 DOI: 10.3389/fgene.2014.00263] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 05/01/2014] [Accepted: 07/16/2014] [Indexed: 11/13/2022] Open
Abstract
To better understand how cells sense and respond to their environment, it is important to understand the organization and regulation of the phosphorylation networks that underlie most cellular signal transduction pathways. These networks, which are composed of protein kinases, protein phosphatases and their respective cellular targets, are highly dynamic. Importantly, to achieve signaling specificity, phosphorylation networks must be regulated at several levels, including at the level of protein expression, substrate recognition, and spatiotemporal modulation of enzymatic activity. Here, we briefly summarize some of the traditional methods used to study the phosphorylation status of cellular proteins before focusing our attention on several recent technological advances, such as protein microarrays, quantitative mass spectrometry, and genetically-targetable fluorescent biosensors, that are offering new insights into the organization and regulation of cellular phosphorylation networks. Together, these approaches promise to lead to a systems-level view of dynamic phosphorylation networks.
Affiliation(s)
- Robert H Newman
- Department of Biology, North Carolina Agricultural and Technical State University, Greensboro, NC, USA
| | - Jin Zhang
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA; The Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Chemical and Biomolecular Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Heng Zhu
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA; High-Throughput Biology Center, Institute for Basic Biomedical Sciences, Johns Hopkins University, Baltimore, MD, USA
| |
|