1
Adams C, Gabriel W, Laukens K, Picciani M, Wilhelm M, Bittremieux W, Boonen K. Fragment ion intensity prediction improves the identification rate of non-tryptic peptides in timsTOF. Nat Commun 2024; 15:3956. [PMID: 38730277 PMCID: PMC11087512 DOI: 10.1038/s41467-024-48322-0] [Received: 07/17/2023] [Accepted: 04/29/2024] [Indexed: 05/12/2024] Open
Abstract
Immunopeptidomics is crucial for immunotherapy and vaccine development. Because the generation of immunopeptides from their parent proteins does not adhere to clear-cut rules, rather than being able to use known digestion patterns, every possible protein subsequence within human leukocyte antigen (HLA) class-specific length restrictions needs to be considered during sequence database searching. This leads to an inflation of the search space and results in lower spectrum annotation rates. Peptide-spectrum match (PSM) rescoring is a powerful enhancement of standard searching that boosts the spectrum annotation performance. We analyze 302,105 unique synthesized non-tryptic peptides from the ProteomeTools project on a timsTOF-Pro to generate a ground-truth dataset containing 93,227 MS/MS spectra of 74,847 unique peptides, that is used to fine-tune the deep learning-based fragment ion intensity prediction model Prosit. We demonstrate up to 3-fold improvement in the identification of immunopeptides, as well as increased detection of immunopeptides from low input samples.
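To illustrate the search-space inflation described in this abstract, the following minimal Python sketch compares unspecific enumeration of subsequences with a simple in-silico tryptic digest. The 8-12 residue length window, the simplified trypsin rule, the function names, and the toy sequence are assumptions made for illustration and are not taken from the paper.

```python
from __future__ import annotations

import re


def unspecific_peptides(protein: str, min_len: int = 8, max_len: int = 12) -> set[str]:
    """Every subsequence within the length window (no cleavage rule applied)."""
    return {
        protein[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(protein) - length + 1)
    }


def tryptic_peptides(protein: str, min_len: int = 8, max_len: int = 12) -> set[str]:
    """Fully tryptic peptides: cleave after K or R unless followed by P, no missed cleavages."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if min_len <= len(p) <= max_len}


# Hypothetical toy sequence, not a real protein.
toy = "ACDEFGHIKLMNPQR"
print(len(unspecific_peptides(toy)), "unspecific candidates")
print(len(tryptic_peptides(toy)), "tryptic candidates")
```

For this 15-residue toy sequence the unspecific enumeration yields 30 candidates against a single tryptic peptide; on a full proteome the same effect multiplies the number of candidates each spectrum must be scored against.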
Affiliation(s)
- Charlotte Adams
  - Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Wassim Gabriel
  - Computational Mass Spectrometry, Technical University of Munich, 85354, Freising, Germany
- Kris Laukens
  - Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Mario Picciani
  - Computational Mass Spectrometry, Technical University of Munich, 85354, Freising, Germany
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich, 85354, Freising, Germany
  - Munich Data Science Institute, Technical University of Munich, 85748, Garching, Germany
- Wout Bittremieux
  - Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Kurt Boonen
  - Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
  - Sustainable Health Department, Flemish Institute for Technological Research (VITO), Antwerp, Belgium
2
Peng H, Wang H, Kong W, Li J, Goh WWB. Optimizing differential expression analysis for proteomics data via high-performing rules and ensemble inference. Nat Commun 2024; 15:3922. [PMID: 38724498 PMCID: PMC11082229 DOI: 10.1038/s41467-024-47899-w] [Received: 06/19/2023] [Accepted: 04/16/2024] [Indexed: 05/12/2024] Open
Abstract
Identification of differentially expressed proteins in a proteomics workflow typically encompasses five key steps: raw data quantification, expression matrix construction, matrix normalization, missing value imputation (MVI), and differential expression analysis. The plethora of options in each step makes it challenging to identify optimal workflows that maximize the identification of differentially expressed proteins. To identify optimal workflows and their common properties, we conduct an extensive study involving 34,576 combinatoric experiments on 24 gold standard spike-in datasets. Applying frequent pattern mining techniques to top-ranked workflows, we uncover high-performing rules that demonstrate optimality has conserved properties. Via machine learning, we confirm optimal workflows are indeed predictable, with average cross-validation F1 scores and Matthews correlation coefficients surpassing 0.84. We introduce an ensemble inference to integrate results from individual top-performing workflows to expand differential proteome coverage and resolve inconsistencies. Ensemble inference provides gains in pAUC (up to 4.61%) and G-mean (up to 11.14%) and facilitates effective aggregation of information across varied quantification approaches such as topN, directLFQ, MaxLFQ intensities, and spectral counts. However, further development and evaluation are needed to establish acceptable frameworks for conducting ensemble inference on multiple proteomics workflows.
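Several of the scores named in this abstract can be computed from a single confusion matrix. The sketch below assumes the usual textbook definitions of F1, the Matthews correlation coefficient, and G-mean (geometric mean of sensitivity and specificity); whether the paper uses exactly these variants is an assumption, and the function name and counts are hypothetical.

```python
import math


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def workflow_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """F1, MCC, and G-mean for one differential-expression call set."""
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)   # recall, true positive rate
    specificity = _ratio(tn, tn + fp)   # true negative rate
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc = _ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    g_mean = math.sqrt(sensitivity * specificity)
    return {"f1": f1, "mcc": mcc, "g_mean": g_mean}


# Hypothetical spike-in result: 50 truly changed proteins among 1,000 quantified.
print(workflow_metrics(tp=40, fp=10, tn=940, fn=10))
```

Because truly changed proteins are rare in spike-in designs, MCC and G-mean are less flattered by the large number of true negatives than plain accuracy would be.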
Affiliation(s)
- Hui Peng
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- He Wang
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Weijia Kong
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Jinyan Li
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Wilson Wen Bin Goh
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
  - Center for Biomedical Informatics, Nanyang Technological University, Singapore, Singapore
  - Center of AI in Medicine, Nanyang Technological University, Singapore, Singapore
  - Division of Neurology, Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, UK
3
Strauss MT, Bludau I, Zeng WF, Voytik E, Ammar C, Schessner JP, Ilango R, Gill M, Meier F, Willems S, Mann M. AlphaPept: a modern and open framework for MS-based proteomics. Nat Commun 2024; 15:2168. [PMID: 38461149 PMCID: PMC10924963 DOI: 10.1038/s41467-024-46485-4] [Received: 12/20/2022] [Accepted: 02/20/2024] [Indexed: 03/11/2024] Open
Abstract
In common with other omics technologies, mass spectrometry (MS)-based proteomics produces ever-increasing amounts of raw data, making efficient analysis a principal challenge. A plethora of different computational tools can process the MS data to derive peptide and protein identification and quantification. However, during the last years there has been dramatic progress in computer science, including collaboration tools that have transformed research and industry. To leverage these advances, we develop AlphaPept, a Python-based open-source framework for efficient processing of large high-resolution MS data sets. Numba for just-in-time compilation on CPU and GPU achieves hundred-fold speed improvements. AlphaPept uses the Python scientific stack of highly optimized packages, reducing the code base to domain-specific tasks while accessing the latest advances. We provide an easy on-ramp for community contributions through the concept of literate programming, implemented in Jupyter Notebooks. Large datasets can rapidly be processed as shown by the analysis of hundreds of proteomes in minutes per file, many-fold faster than acquisition. AlphaPept can be used to build automated processing pipelines with web-serving functionality and compatibility with downstream analysis tools. It provides easy access via one-click installation, a modular Python library for advanced users, and via an open GitHub repository for developers.
Affiliation(s)
- Maximilian T Strauss
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
  - NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
- Isabell Bludau
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Wen-Feng Zeng
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Eugenia Voytik
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Constantin Ammar
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Julia P Schessner
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Florian Meier
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
  - Functional Proteomics, Jena University Hospital, Jena, Germany
- Sander Willems
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Matthias Mann
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
  - NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
4
Lapin J, Yan X, Dong Q. UniSpec: Deep Learning for Predicting the Full Range of Peptide Fragment Ion Series to Enhance the Proteomics Data Analysis Workflow. Anal Chem 2024. [PMID: 38329031 DOI: 10.1021/acs.analchem.3c02321] [Indexed: 02/09/2024]
Abstract
We present UniSpec, an attention-driven deep neural network designed to predict comprehensive collision-induced fragmentation spectra, thereby improving peptide identification in shotgun proteomics. Utilizing a training data set of 1.8 million unique high-quality tandem mass spectra (MS2) from 0.8 million unique peptide ions, UniSpec learned with a peptide fragmentation dictionary encompassing 7919 fragment peaks. Among these, 5712 are neutral loss peaks, with 2310 corresponding to modification-specific neutral losses. Remarkably, UniSpec can predict 73%-77% of fragment intensities based on our NIST reference library spectra, a significant leap from the 35%-45% coverage of only b and y ions. Comparative studies with Prosit elucidate that while both models are strong at predicting their respective fragment ion series, UniSpec particularly shines in generating more complex MS2 spectra with diverse ion annotations. The integration of UniSpec's predictions into shotgun proteomics data analysis boosts the identification rate of tryptic peptides by 48% at a 1% false discovery rate (FDR) and 60% at a more confident 0.1% FDR. Using UniSpec's predicted in-silico spectral library, the search results closely matched those from search engines and experimental spectral libraries used in peptide identification, highlighting its potential as a stand-alone identification tool. The source code and Python scripts are available on GitHub (https://github.com/usnistgov/UniSpec) and Zenodo (https://zenodo.org/records/10452792), and all data sets and analysis results generated in this work were deposited in Zenodo (https://zenodo.org/records/10052268).
Affiliation(s)
- Joel Lapin
  - Department of Physics, Georgetown University, Washington, D.C. 20057, United States
  - Associate, Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Xinjian Yan
  - Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Qian Dong
  - Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
5
Jumel T, Shevchenko A. Multispecies Benchmark Analysis for LC-MS/MS Validation and Performance Evaluation in Bottom-Up Proteomics. J Proteome Res 2024; 23:684-691. [PMID: 38243904 PMCID: PMC10845134 DOI: 10.1021/acs.jproteome.3c00531] [Received: 08/23/2023] [Revised: 12/04/2023] [Accepted: 01/04/2024] [Indexed: 01/22/2024]
Abstract
We present an instrument-independent benchmark procedure and software (LFQ_bout) for the validation and comparative evaluation of the performance of LC-MS/MS and data processing workflows in bottom-up proteomics. The procedure enables a back-to-back comparison of common and emerging workflows, e.g., diaPASEF or ScanningSWATH, and evaluates the impact of arbitrary and inadequately documented settings or black-box data processing algorithms. It enhances the overall performance and quantification accuracy by recognizing and reporting common quantification errors.
Affiliation(s)
- Tobias Jumel
  - Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG), Pfotenhauerstraße 108, 01307 Dresden, Germany
- Andrej Shevchenko
  - Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG), Pfotenhauerstraße 108, 01307 Dresden, Germany
6
Li Y, He Q, Guo H, Shuai SC, Cheng J, Liu L, Shuai J. AttnPep: A Self-Attention-Based Deep Learning Method for Peptide Identification in Shotgun Proteomics. J Proteome Res 2024; 23:834-843. [PMID: 38252705 DOI: 10.1021/acs.jproteome.3c00729] [Indexed: 01/24/2024]
Abstract
In shotgun proteomics, the proteome search engine analyzes mass spectra obtained by experiments, and then a peptide-spectrum match (PSM) is reported for each spectrum. However, most of the PSMs identified are incorrect, and therefore various postprocessing tools have been developed for reranking the peptide identifications. Yet these methods suffer from issues such as dependency on distribution, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSM scores that utilizes the Self-Attention module. This module helps the neural network focus on features relevant to the classification of PSMs and ignore irrelevant features. This allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM to be correct if it achieves a q-value <0.01 and compared AttnPep with the existing mainstream tools PeptideProphet, Percolator, and proteoTorch. The results indicated that AttnPep found an average increase in correct PSMs of 9.29% relative to the other methods. Additionally, AttnPep was able to better distinguish between correct and incorrect PSMs and found more synthetic peptides in the complex SWATH data set.
Affiliation(s)
- Yulin Li
  - Department of Physics, Xiamen University, Xiamen 361005, China
- Qingzu He
  - Department of Physics, Xiamen University, Xiamen 361005, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Huan Guo
  - Department of Physics, Xiamen University, Xiamen 361005, China
- Stella C Shuai
  - Biological Science, Northwestern University, Evanston, Illinois 60208, United States
- Jinyan Cheng
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Liyu Liu
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
- Jianwei Shuai
  - Department of Physics, Xiamen University, Xiamen 361005, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, China
7
Guzman UH, Martinez-Val A, Ye Z, Damoc E, Arrey TN, Pashkova A, Renuse S, Denisov E, Petzoldt J, Peterson AC, Harking F, Østergaard O, Rydbirk R, Aznar S, Stewart H, Xuan Y, Hermanson D, Horning S, Hock C, Makarov A, Zabrouskov V, Olsen JV. Ultra-fast label-free quantification and comprehensive proteome coverage with narrow-window data-independent acquisition. Nat Biotechnol 2024:10.1038/s41587-023-02099-7. [PMID: 38302753 DOI: 10.1038/s41587-023-02099-7] [Received: 06/14/2023] [Accepted: 12/13/2023] [Indexed: 02/03/2024]
Abstract
Mass spectrometry (MS)-based proteomics aims to characterize comprehensive proteomes in a fast and reproducible manner. Here we present the narrow-window data-independent acquisition (nDIA) strategy consisting of high-resolution MS1 scans with parallel tandem MS (MS/MS) scans of ~200 Hz using 2-Th isolation windows, dissolving the differences between data-dependent and -independent methods. This is achieved by pairing a quadrupole Orbitrap mass spectrometer with the asymmetric track lossless (Astral) analyzer which provides >200-Hz MS/MS scanning speed, high resolving power and sensitivity, and low-ppm mass accuracy. The nDIA strategy enables profiling of >100 full yeast proteomes per day, or 48 human proteomes per day at the depth of ~10,000 human protein groups in half-an-hour or ~7,000 proteins in 5 min, representing 3× higher coverage compared with current state-of-the-art MS. Multi-shot acquisition of offline fractionated samples provides comprehensive coverage of human proteomes in ~3 h. High quantitative precision and accuracy are demonstrated in a three-species proteome mixture, quantifying 14,000+ protein groups in a single half-an-hour run.
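The acquisition figures in this abstract lend themselves to a quick back-of-the-envelope check. The sketch below assumes a hypothetical precursor range of 380-980 m/z (not stated in the abstract), non-overlapping windows, and no per-run overhead; the function names are illustrative only.

```python
import math


def dia_cycle(mz_start: float, mz_end: float, window_th: float, scan_rate_hz: float):
    """Return (number of isolation windows, seconds per full MS/MS cycle).

    MS1 scans are assumed to run in parallel and are not counted here.
    """
    n_windows = math.ceil((mz_end - mz_start) / window_th)
    return n_windows, n_windows / scan_rate_hz


def runs_per_day(minutes_per_run: float) -> float:
    """Upper bound on runs per 24 h, ignoring loading and washing overhead."""
    return 24 * 60 / minutes_per_run


# Assumed m/z range; the 2-Th width and ~200 Hz rate come from the abstract.
n_windows, cycle_s = dia_cycle(mz_start=380, mz_end=980, window_th=2, scan_rate_hz=200)
print(n_windows, "windows,", cycle_s, "s per cycle")
print(runs_per_day(30), "half-hour runs per day")
```

Under these assumptions, 300 windows are covered in 1.5 s, and half-hour runs give the 48 proteomes per day quoted in the abstract.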
Affiliation(s)
- Ulises H Guzman
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Ana Martinez-Val
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Zilu Ye
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
  - State Key Laboratory of Common Mechanism Research for Major Diseases, Suzhou Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Suzhou, China
- Eugen Damoc
  - Thermo Fisher Scientific (Bremen) GmbH, Bremen, Germany
- Anna Pashkova
  - Thermo Fisher Scientific (Bremen) GmbH, Bremen, Germany
- Florian Harking
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Ole Østergaard
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Rasmus Rydbirk
  - Center for Functional Genomics and Tissue Plasticity (ATLAS), Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Susana Aznar
  - Centre for Neuroscience and Stereology, Copenhagen University Hospital, Copenhagen, Denmark
- Yue Xuan
  - Thermo Fisher Scientific (Bremen) GmbH, Bremen, Germany
- Jesper V Olsen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
8
Lou R, Shui W. Acquisition and Analysis of DIA-Based Proteomic Data: A Comprehensive Survey in 2023. Mol Cell Proteomics 2024; 23:100712. [PMID: 38182042 PMCID: PMC10847697 DOI: 10.1016/j.mcpro.2024.100712] [Received: 10/31/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/07/2024] Open
Abstract
Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.
Affiliation(s)
- Ronghui Lou
  - iHuman Institute, ShanghaiTech University, Shanghai, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Wenqing Shui
  - iHuman Institute, ShanghaiTech University, Shanghai, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai, China
9
Hay BN, Akinlaja MO, Baker TC, Houfani AA, Stacey RG, Foster LJ. Integration of data-independent acquisition (DIA) with co-fractionation mass spectrometry (CF-MS) to enhance interactome mapping capabilities. Proteomics 2023; 23:e2200278. [PMID: 37144656 DOI: 10.1002/pmic.202200278] [Received: 10/28/2022] [Revised: 04/03/2023] [Accepted: 04/14/2023] [Indexed: 05/06/2023]
Abstract
Proteomics technologies are continually advancing, providing opportunities to develop stronger and more robust protein interaction networks (PINs). In part, this is due to the ever-growing number of high-throughput proteomics methods that are available. This review discusses how data-independent acquisition (DIA) and co-fractionation mass spectrometry (CF-MS) can be integrated to enhance interactome mapping abilities. Furthermore, integrating these two techniques can improve data quality and network generation through extended protein coverage, less missing data, and reduced noise. CF-DIA-MS shows promise in expanding our knowledge of interactomes, notably for non-model organisms (NMOs). CF-MS is a valuable technique on its own, but upon the integration of DIA, the potential to develop robust PINs increases, offering a unique approach for researchers to gain an in-depth understanding into the dynamics of numerous biological processes.
Affiliation(s)
- Brenna N Hay
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Mopelola O Akinlaja
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Teesha C Baker
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Aicha Asma Houfani
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- R Greg Stacey
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
10
Postoenko VI, Garibova LA, Levitsky LI, Bubis JA, Gorshkov MV, Ivanov MV. IQMMA: Efficient MS1 Intensity Extraction Pipeline Using Multiple Feature Detection Algorithms for DDA Proteomics. J Proteome Res 2023; 22:2827-2835. [PMID: 37579078 DOI: 10.1021/acs.jproteome.3c00075] [Indexed: 08/16/2023]
Abstract
One of the key steps in data dependent acquisition (DDA) proteomics is detection of peptide isotopic clusters, also called "features", in MS1 spectra and matching them to MS/MS-based peptide identifications. A number of peptide feature detection tools became available in recent years, each relying on its own matching algorithm. Here, we provide an integrated solution, the intensity-based Quantitative Mix and Match Approach (IQMMA), which integrates a number of untargeted peptide feature detection algorithms and returns the most probable intensity values for the MS/MS-based identifications. IQMMA was tested using available proteomic data acquired for both well-characterized (ground truth) and real-world biological samples, including a mix of Yeast and E. coli digests spiked at different concentrations into the Human K562 digest used as a background, and a set of glioblastoma cell lines. Three open-source feature detection algorithms were integrated: Dinosaur, biosaur2, and OpenMS FeatureFinder. None of them was found optimal when applied individually to all the data sets employed in this work; however, their combined use in IQMMA improved efficiency of subsequent protein quantitation. The software implementing IQMMA is freely available at https://github.com/PostoenkoVI/IQMMA under Apache 2.0 license.
Affiliation(s)
- Valeriy I Postoenko
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology, National Research University, G. Dolgoprudny, Institutsky Lane 9, Dolgoprudny 141701, Russia
- Leyla A Garibova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology, National Research University, G. Dolgoprudny, Institutsky Lane 9, Dolgoprudny 141701, Russia
- Lev I Levitsky
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Julia A Bubis
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
11
Ismail NH, Mussa A, Al-Khreisat MJ, Mohamed Yusoff S, Husin A, Johan MF. Proteomic Alteration in the Progression of Multiple Myeloma: A Comprehensive Review. Diagnostics (Basel) 2023; 13:2328. [PMID: 37510072 PMCID: PMC10378430 DOI: 10.3390/diagnostics13142328] [Received: 05/16/2023] [Revised: 06/18/2023] [Accepted: 06/30/2023] [Indexed: 07/30/2023] Open
Abstract
Multiple myeloma (MM) is an incurable hematologic malignancy. Most MM patients are diagnosed at a late stage because the early symptoms of the disease can be uncertain and nonspecific, often resembling other, more common conditions. Additionally, MM patients are commonly associated with rapid relapse and an inevitable refractory phase. MM is characterized by the abnormal proliferation of monoclonal plasma cells in the bone marrow. During the progression of MM, massive genomic alterations occur that target multiple signaling pathways and are accompanied by a multistep process involving differentiation, proliferation, and invasion. Moreover, the transformation of healthy plasma cell biology into genetically heterogeneous MM clones is driven by a variety of post-translational protein modifications (PTMs), which has complicated the discovery of effective treatments. PTMs have been identified as the most promising candidates for biomarker detection, and further research has been recommended to develop promising surrogate markers. Proteomics research has begun in MM, and a comprehensive literature review is available. However, proteomics applications in MM have yet to make significant progress. Exploration of proteomic alterations in MM is worthwhile to improve understanding of the pathophysiology of MM and to search for new treatment targets. Proteomics studies using mass spectrometry (MS) in conjunction with robust bioinformatics tools are an excellent way to learn more about protein changes and modifications during MM disease progression. This article addresses in depth the proteomic changes associated with MM disease transformation.
Affiliation(s)
- Nor Hayati Ismail
  - Department of Haematology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
- Ali Mussa
  - Department of Haematology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
  - Department of Biology, Faculty of Education, Omdurman Islamic University, Omdurman P.O. Box 382, Sudan
- Mutaz Jamal Al-Khreisat
  - Department of Haematology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
- Shafini Mohamed Yusoff
  - Department of Haematology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
- Azlan Husin
  - Department of Internal Medicine, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
- Muhammad Farid Johan
  - Department of Haematology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
12
Kirkpatrick J, Stemmer PM, Searle BC, Herring LE, Martin L, Midha MK, Phinney BS, Shan B, Palmblad M, Wang Y, Jagtap PD, Neely BA. 2019 Association of Biomolecular Resource Facilities Multi-Laboratory Data-Independent Acquisition Proteomics Study. J Biomol Tech 2023; 34:3fc1f5fe.9b78d780. [PMID: 37435391 PMCID: PMC10332336 DOI: 10.7171/3fc1f5fe.9b78d780] [Indexed: 07/13/2023]
Abstract
Despite the advantages of fewer missing values by collecting fragment ion data on all analytes in the sample as well as the potential for deeper coverage, the adoption of data-independent acquisition (DIA) in proteomics core facility settings has been slow. The Association of Biomolecular Resource Facilities conducted a large interlaboratory study to evaluate DIA performance in proteomics laboratories with various instrumentation. Participants were supplied with generic methods and a uniform set of test samples. The resulting 49 DIA datasets act as benchmarks and have utility in education and tool development. The sample set consisted of a tryptic HeLa digest spiked with high or low levels of 4 exogenous proteins. Data are available in MassIVE MSV000086479. Additionally, we demonstrate how the data can be analyzed by focusing on 2 datasets using different library approaches and show the utility of select summary statistics. These data can be used by DIA newcomers, software developers, or DIA experts evaluating performance with different platforms, acquisition settings, and skill levels.
Affiliation(s)
- Joanna Kirkpatrick
- Leibniz Institute on Aging - Fritz Lipmann Institute, 07745 Jena, Germany
- The Francis Crick Institute, London NW1 1AT, United Kingdom
| | | | - Brian C. Searle
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, USA
- Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
| | - Laura E. Herring
- UNC Proteomics Core Facility, Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA
| | | | | | | | - Baozhen Shan
- Bioinformatics Solutions Inc., Waterloo, ON N2L 3K8, Canada
| | - Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
| | - Yan Wang
- National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Pratik D. Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, USA
| | - Benjamin A. Neely
- National Institute of Standards and Technology, Charleston, South Carolina 29412, USA
| |
|
13
|
Debrie E, Malfait M, Gabriels R, Declercq A, Sticker A, Martens L, Clement L. Quality Control for the Target Decoy Approach for Peptide Identification. J Proteome Res 2023; 22:350-358. [PMID: 36648107 DOI: 10.1021/acs.jproteome.2c00423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Reliable peptide identification is key in mass spectrometry (MS) based proteomics. To this end, the target decoy approach (TDA) has become the cornerstone for extracting a set of reliable peptide-to-spectrum matches (PSMs) that will be used in downstream analysis. Indeed, TDA is now the default method to estimate the false discovery rate (FDR) for a given set of PSMs, and users typically view it as a universal solution for assessing the FDR in the peptide identification step. However, the TDA also relies on a minimal set of assumptions, which are typically never verified in practice. We argue that a violation of these assumptions can lead to poor FDR control, which can be detrimental to any downstream data analysis. We here therefore first clearly spell out these TDA assumptions, and introduce TargetDecoy, a Bioconductor package with all the necessary functionality to control the TDA quality and its underlying assumptions for a given set of PSMs.
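The FDR estimate that the abstract refers to can be illustrated with a minimal Python sketch. It assumes the common decoy-over-target estimator (decoys above a score threshold divided by targets above the same threshold), which holds only if an incorrect target hit and a decoy hit are equally likely. The function name `tda_fdr` and the toy scores are hypothetical, and this is not the API of the TargetDecoy Bioconductor package, which is written in R.

```python
from typing import List, Tuple


def tda_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    Each PSM is a (score, is_decoy) pair. The estimate is the number of
    decoy PSMs passing the threshold divided by the number of target PSMs
    passing it, capped at 1.0.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted targets, so nothing to call false
    return min(1.0, decoys / targets)


# Hypothetical toy data: (search engine score, is_decoy)
psms = [
    (9.1, False), (8.7, False), (8.2, False), (7.9, True),
    (7.5, False), (6.8, True), (6.1, False), (5.4, True),
]

print(tda_fdr(psms, 8.0))  # 3 targets, 0 decoys
print(tda_fdr(psms, 7.0))  # 4 targets, 1 decoy
```

If decoys score systematically differently from incorrect targets, this ratio no longer tracks the true error rate, which is the kind of assumption violation the abstract warns about.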
Affiliation(s)
- Elke Debrie
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium
| | - Milan Malfait
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium; Statistics and Decision Sciences, Janssen Pharmaceutical Companies of Johnson and Johnson, 2340 Beerse, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Arthur Declercq
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Adriaan Sticker
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium; VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Lieven Clement
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium
| |
|
14
|
Rehfeldt T, Gabriels R, Bouwmeester R, Gessulat S, Neely BA, Palmblad M, Perez-Riverol Y, Schmidt T, Vizcaíno JA, Deutsch EW. ProteomicsML: An Online Platform for Community-Curated Data sets and Tutorials for Machine Learning in Proteomics. J Proteome Res 2023; 22:632-636. [PMID: 36693629 PMCID: PMC9903315 DOI: 10.1021/acs.jproteome.2c00629] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/26/2023]
Abstract
Data set acquisition and curation are often the most difficult and time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based liquid chromatography (LC) coupled to mass spectrometry (MS) data sets, due to the high levels of data reduction that occur between raw data and machine learning-ready data. Since predictive proteomics is an emerging field, when predicting peptide behavior in LC-MS setups, each lab often uses unique and complex data processing pipelines in order to maximize performance, at the cost of accessibility and reproducibility. For this reason we introduce ProteomicsML, an online resource for proteomics-based data sets and tutorials across most of the currently explored physicochemical peptide properties. This community-driven resource makes it simple to access data in easy-to-process formats, and contains easy-to-follow tutorials that allow new users to interact with even the most advanced algorithms in the field. ProteomicsML provides data sets that are useful for comparing state-of-the-art machine learning algorithms, as well as providing introductory material for teachers and newcomers to the field alike. The platform is freely available at https://www.proteomicsml.org/, and we welcome the entire proteomics community to contribute to the project at https://github.com/ProteomicsML/ProteomicsML.
Affiliation(s)
- Tobias G. Rehfeldt
- Institute for Mathematics and Computer Science, University of Southern Denmark, 5000 Odense, Denmark
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
| | | | - Benjamin A. Neely
- National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
| | - Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | | | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Phone: +44 (0) 1223 492686
| | - Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States; Phone: 206-732-1200, Fax: 206-732-1299
| |
|
15
|
Dorl S, Winkler S, Mechtler K, Dorfer V. MS Ana: Improving Sensitivity in Peptide Identification with Spectral Library Search. J Proteome Res 2023; 22:462-470. [PMID: 36688604 PMCID: PMC9903325 DOI: 10.1021/acs.jproteome.2c00658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
Spectral library search can enable more sensitive peptide identification in tandem mass spectrometry experiments. However, its drawbacks are the limited availability of high-quality libraries and the added difficulty of creating decoy spectra for result validation. We describe MS Ana, a new spectral library search engine that enables high sensitivity peptide identification using either curated or predicted spectral libraries as well as robust false discovery control through its own decoy library generation algorithm. MS Ana identifies on average 36% more spectrum matches and 4% more proteins than database search in a benchmark test on single-shot human cell-line data. Further, we demonstrate the quality of the result validation with tests on synthetic peptide pools and show the importance of library selection through a comparison of library search performance with different configurations of publicly available human spectral libraries.
Affiliation(s)
- Sebastian Dorl
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria; Department of Computer Science, Johannes Kepler University Linz, Altenbergerstraße 69, 4040 Linz, Austria; Phone: +43 (0) 50804 27145
| | - Stephan Winkler
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria; Department of Computer Science, Johannes Kepler University Linz, Altenbergerstraße 69, 4040 Linz, Austria
| | - Karl Mechtler
- Research Institute of Molecular Pathology (IMP), Protein Chemistry, Campus-Vienna-Biocenter 1, 1030 Vienna, Austria; Institute of Molecular Biotechnology (IMBA), Protein Chemistry, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria; Gregor Mendel Institute of Molecular Plant Biology of the Austrian Academy of Sciences (GMI), Dr. Bohr Gasse 3, 1030 Vienna, Austria
| | - Viktoria Dorfer
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria; Phone: +43 (0) 50804 22740
| |
|
16
|
Proteomic Comparison of Three Wild-Type Pseudorabies Virus Strains and the Attenuated Bartha Strain Reveals Reduced Incorporation of Several Tegument Proteins in Bartha Virions. J Virol 2022; 96:e0115822. [PMID: 36453884 PMCID: PMC9769387 DOI: 10.1128/jvi.01158-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/03/2022] Open
Abstract
Pseudorabies virus (PRV) is a member of the alphaherpesvirus subfamily and the causative agent of Aujeszky's disease in pigs. Driven by the large economic losses associated with PRV infection, several vaccines and vaccine programs have been developed. To this day, the attenuated Bartha strain, generated by serial passaging, represents the gold standard for PRV vaccination. However, a proteomic comparison of the Bartha virion to wild-type (WT) PRV virions is lacking. Here, we present a comprehensive mass spectrometry-based proteome comparison of the attenuated Bartha strain and three commonly used WT PRV strains: Becker, Kaplan, and NIA3. We report the detection of 40 structural and 14 presumed nonstructural proteins through a combination of data-dependent and data-independent acquisition. Interstrain comparisons revealed that packaging of the capsid and most envelope proteins is largely comparable between all four strains, except for the envelope protein pUL56, which is less abundant in Bartha virions. However, distinct differences were noted for several tegument proteins. Most strikingly, we noted a severely reduced incorporation of the tegument proteins IE180, VP11/12, pUS3, VP22, pUL41, pUS1, and pUL40 in Bartha virions. Moreover, and likely as a consequence, we also observed that Bartha virions are on average smaller and more icosahedral compared to WT virions. Finally, we detected at least 28 host proteins that were previously described in PRV virions and noticed considerable strain-specific differences with regard to host proteins, arguing that the potential role of packaged host proteins in PRV replication and spread should be further explored.
IMPORTANCE The pseudorabies virus (PRV) vaccine strain Bartha, an attenuated strain created by serial passaging, represents an exceptional success story in alphaherpesvirus vaccination. Here, we used mass spectrometry to analyze the Bartha virion composition in comparison to three established WT PRV strains. Many viral tegument proteins that are considered nonessential for viral morphogenesis were drastically less abundant in Bartha virions compared to WT virions. Interestingly, many of the proteins that are less incorporated in Bartha participate in immune evasion strategies of alphaherpesviruses. In addition, we observed a reduced size and more icosahedral morphology of the Bartha virions compared to WT PRV. Given that the Bartha vaccine strain elicits potent immune responses, our findings here suggest that differences in protein packaging may contribute to its immunogenicity. Further exploration of these observations could aid the development of efficacious vaccines against other alphaherpesviruses such as HSV-1/2 or EHV-1.
|
17
|
Koopmans F, Li KW, Klaassen RV, Smit AB. MS-DAP Platform for Downstream Data Analysis of Label-Free Proteomics Uncovers Optimal Workflows in Benchmark Data Sets and Increased Sensitivity in Analysis of Alzheimer's Biomarker Data. J Proteome Res 2022; 22:374-386. [PMID: 36541440 PMCID: PMC9903323 DOI: 10.1021/acs.jproteome.2c00513] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 12/23/2022]
Abstract
In the rapidly moving proteomics field, a diverse patchwork of data analysis pipelines and algorithms for data normalization and differential expression analysis is used by the community. We generated a mass spectrometry downstream analysis pipeline (MS-DAP) that integrates both popular and recently developed algorithms for normalization and statistical analyses. Additional algorithms can be easily added in the future as plugins. MS-DAP is open-source and facilitates transparent and reproducible proteome science by generating extensive data visualizations and quality reporting, provided as standardized PDF reports. Second, we performed a systematic evaluation of methods for normalization and statistical analysis on a large variety of data sets, including additional data generated in this study, which revealed key differences. Commonly used approaches for differential testing based on moderated t-statistics were consistently outperformed by more recent statistical models, all integrated in MS-DAP. Third, we introduced a novel normalization algorithm that rescues deficiencies observed in commonly used normalization methods. Finally, we used the MS-DAP platform to reanalyze a recently published large-scale proteomics data set of CSF from AD patients. This revealed increased sensitivity, resulting in additional significant target proteins, which improved overlap with results reported in related studies and includes a large set of new potential AD biomarkers in addition to those previously reported.
Affiliation(s)
- Frank Koopmans
- Department of Molecular and Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University, 1081 HV Amsterdam, The Netherlands
| | - Ka Wan Li
- Department of Molecular and Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University, 1081 HV Amsterdam, The Netherlands
| | - Remco V. Klaassen
- Department of Molecular and Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University, 1081 HV Amsterdam, The Netherlands
| | - August B. Smit
- Department of Molecular and Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University, 1081 HV Amsterdam, The Netherlands
| |
|
18
|
A Comprehensive Study of Gradient Conditions for Deep Proteome Discovery in a Complex Protein Matrix. Int J Mol Sci 2022; 23:ijms231911714. [PMID: 36233016 PMCID: PMC9569591 DOI: 10.3390/ijms231911714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022] [Indexed: 11/28/2022] Open
Abstract
Bottom–up mass-spectrometry-based proteomics is a well-developed technology based on complex peptide mixtures from proteolytic cleavage of proteins and is widely applied in protein identification, characterization, and quantitation. A tims-ToF mass spectrometer is an excellent platform for bottom–up proteomics studies due to its rapid acquisition with high sensitivity. It remains challenging for bottom–up proteomics approaches to achieve 100% proteome coverage. Liquid chromatography (LC) is commonly used prior to mass spectrometry (MS) analysis to fractionate peptide mixtures, and the LC gradient can affect the peptide fractionation and proteome coverage. We investigated the effects of gradient type and time duration to find optimal gradient conditions. Five gradient types (linear, logarithm-like, exponent-like, stepwise, and step-linear), three different gradient lengths (22 min, 44 min, and 66 min), two sample loading amounts (100 ng and 200 ng), and two loading conditions (the use of trap column and no trap column) were studied. The effect of these chromatography variables on protein groups, peptides, and spectral counts using HeLa cell digests was explored. The results indicate that (1) a step-linear gradient performs best among the five gradient types studied; (2) the optimal gradient duration depends on protein sample loading amount; (3) the use of a trap column helps to enhance protein identification, especially low-abundance proteins; (4) MSFragger and PEAKS Studio have high similarity in protein group identification; (5) MSFragger identified more protein groups among the different gradient conditions compared to PEAKS Studio; and (6) combining results from both database search engines can expand identified protein groups by 9–11%.
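To give a sense of the size of the design space described above, the following Python sketch enumerates the four chromatography variables. It assumes a fully crossed design, which the abstract does not state; the study may have run only a subset of these combinations. The variable names are illustrative.

```python
from itertools import product

# Levels taken from the abstract; the variable names are illustrative.
gradient_types = ["linear", "logarithm-like", "exponent-like", "stepwise", "step-linear"]
gradient_lengths_min = [22, 44, 66]
loading_amounts_ng = [100, 200]
trap_column_used = [True, False]

# Assumption: every level of each variable is combined with every other.
conditions = list(product(gradient_types, gradient_lengths_min,
                          loading_amounts_ng, trap_column_used))

print(len(conditions))  # 5 * 3 * 2 * 2 = 60
print(conditions[0])    # ('linear', 22, 100, True)
```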
|