1
Yu F, Deng Y, Nesvizhskii AI. MSFragger-DDA+ enhances peptide identification sensitivity with full isolation window search. Nat Commun 2025; 16:3329. [PMID: 40199897 PMCID: PMC11978857 DOI: 10.1038/s41467-025-58728-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2024] [Accepted: 03/27/2025] [Indexed: 04/10/2025]
Abstract
Liquid chromatography-mass spectrometry based proteomics, particularly in the bottom-up approach, relies on the digestion of proteins into peptides for subsequent separation and analysis. The most prevalent method for identifying peptides from data-dependent acquisition mass spectrometry data is database search. Traditional tools typically focus on identifying a single peptide per tandem mass spectrum, often neglecting the frequent occurrence of peptide co-fragmentations leading to chimeric spectra. Here, we introduce MSFragger-DDA+, a database search algorithm that enhances peptide identification by detecting co-fragmented peptides with high sensitivity and speed. Utilizing MSFragger's fragment ion indexing algorithm, MSFragger-DDA+ performs a comprehensive search within the full isolation window for each tandem mass spectrum, followed by robust feature detection, filtering, and rescoring procedures to refine search results. Evaluation against established tools across diverse datasets demonstrated that, integrated within the FragPipe computational platform, MSFragger-DDA+ significantly increases identification sensitivity while maintaining stringent false discovery rate control. It is also uniquely suited for wide-window acquisition data. MSFragger-DDA+ provides an efficient and accurate solution for peptide identification, enhancing the detection of low-abundance co-fragmented peptides. Coupled with the FragPipe platform, MSFragger-DDA+ enables more comprehensive and accurate analysis of proteomics data.
Affiliation(s)
- Fengchao Yu
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Yamei Deng
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Alexey I Nesvizhskii
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
  - Gilbert S. Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
2
Nagy K, Sándor P, Vékey K, Drahos L, Révész Á. The Enzyme Effect: Broadening the Horizon of MS Optimization to Nontryptic Digestion in Proteomics. J Am Soc Mass Spectrom 2025; 36:299-308. [PMID: 39803703 PMCID: PMC11808764 DOI: 10.1021/jasms.4c00396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2024] [Revised: 12/27/2024] [Accepted: 12/31/2024] [Indexed: 02/06/2025]
Abstract
In recent years, alternative enzymes with varied specificities have gained importance in MS-based bottom-up proteomics, offering orthogonal information about biological samples and advantages in certain applications. However, most mass spectrometric workflows are optimized for tryptic digests. This raises the questions of whether enzyme specificity impacts mass spectrometry and if current methods for nontryptic digests are suboptimal. The success of peptide and protein identifications relies on the information content of MS/MS spectra, influenced by collision energy in collision-induced dissociation. We investigated this by conducting LC-MS/MS measurements with different enzymes, including trypsin, Arg-C, Glu-C, Asp-N, and chymotrypsin, at varying collision energies. We analyzed peptide scores for thousands of peptides and determined optimal collision energy (CE) values. Our results showed a linear m/z dependence for all enzymes, with Glu-C, Asp-N, and chymotrypsin requiring significantly lower energies than trypsin and Arg-C. We proposed a tailored CE selection method for these alternative enzymes, applying ca. 20% lower energy compared to tryptic peptides. This would result in a 10-15 eV decrease on a Bruker QTof instrument and a 5-6 NCE% (normalized collision energy) difference on an Orbitrap. The optimized method improved bottom-up proteomics performance by 8-32%, as measured by peptide identification and sequence coverage. The different trends in fragmentation behavior were linked to the effects of C-terminal basic amino acids for Arg-C and trypsin, stabilizing y fragment ions. This optimized method boosts the performance and provides insight into the impact of enzyme specificity. Data sets are available in the MassIVE repository (MSV000095066).
Affiliation(s)
- Kinga Nagy
  - MS Proteomics Research Group, HUN-REN Research Centre for Natural Sciences, Magyar Tudósok körútja 2, H-1117 Budapest, Hungary
  - Hevesy György PhD School of Chemistry, ELTE Eötvös Loránd University, Faculty of Science, Institute of Chemistry, Pázmány Péter sétány 1/A, Budapest H-1117, Hungary
- Péter Sándor
  - MS Proteomics Research Group, HUN-REN Research Centre for Natural Sciences, Magyar Tudósok körútja 2, H-1117 Budapest, Hungary
- Károly Vékey
  - MS Proteomics Research Group, HUN-REN Research Centre for Natural Sciences, Magyar Tudósok körútja 2, H-1117 Budapest, Hungary
- László Drahos
  - MS Proteomics Research Group, HUN-REN Research Centre for Natural Sciences, Magyar Tudósok körútja 2, H-1117 Budapest, Hungary
- Ágnes Révész
  - MS Proteomics Research Group, HUN-REN Research Centre for Natural Sciences, Magyar Tudósok körútja 2, H-1117 Budapest, Hungary
3
Korchak J, Jeffery ED, Bandyopadhyay S, Jordan BT, Lehe MD, Watts EF, Fenix A, Wilhelm M, Sheynkman GM. IS-PRM-Based Peptide Targeting Informed by Long-Read Sequencing for Alternative Proteome Detection. J Am Soc Mass Spectrom 2024; 35:2614-2630. [PMID: 39012054 PMCID: PMC11544703 DOI: 10.1021/jasms.4c00119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 05/24/2024] [Accepted: 06/25/2024] [Indexed: 07/17/2024]
Abstract
Alternative splicing is a major contributor to transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. Recently, a family of advanced targeted MS strategies, termed internal standard parallel reaction monitoring (IS-PRM), has demonstrated multiplexed, sensitive detection of predefined peptides of interest. Such approaches have not yet been used to confirm existence of novel peptides. Here, we present a targeted proteogenomic approach that leverages sample-matched long-read RNA sequencing (lrRNA-seq) data to predict potential protein isoforms with prior transcript evidence. Predicted tryptic isoform-specific peptides, which are specific to individual gene product isoforms, serve as "triggers" and "targets" in the IS-PRM method, Tomahto. Using the model human stem cell line WTC11, lrRNA-seq data were generated and used to inform the generation of synthetic standards for 192 isoform-specific peptides (114 isoforms from 55 genes). These synthetic "trigger" peptides were labeled with super heavy tandem mass tags (TMT) and spiked into TMT-labeled WTC11 tryptic digest, predicted to contain corresponding endogenous "target" peptides. Compared to DDA mode, Tomahto increased detectability of isoforms by 3.6-fold, resulting in the identification of five previously unannotated isoforms. Our method detected protein isoform expression for 43 out of 55 genes corresponding to 54 resolved isoforms. This lrRNA-seq-informed Tomahto targeted approach is a new modality for generating protein-level evidence of alternative isoforms, a critical first step in designing functional studies and eventually clinical assays.
Affiliation(s)
- Jennifer A. Korchak
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Erin D. Jeffery
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Saikat Bandyopadhyay
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia 22903, United States
- Ben T. Jordan
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
- Micah D. Lehe
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Emily F. Watts
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Aidan Fenix
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington 98195, United States
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich (TUM), D-85354 Freising, Germany
- Gloria M. Sheynkman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia 22903, United States
  - UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, Virginia 22903, United States
4
Yu F, Deng Y, Nesvizhskii AI. MSFragger-DDA+ Enhances Peptide Identification Sensitivity with Full Isolation Window Search. bioRxiv 2024:2024.10.12.618041. [PMID: 39463976 PMCID: PMC11507693 DOI: 10.1101/2024.10.12.618041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
Liquid chromatography-mass spectrometry (LC-MS) based proteomics, particularly in the bottom-up approach, relies on the digestion of proteins into peptides for subsequent separation and analysis. The most prevalent method for identifying peptides from data-dependent acquisition (DDA) mass spectrometry data is database search. Traditional tools typically focus on identifying a single peptide per tandem mass spectrum (MS2), often neglecting the frequent occurrence of peptide co-fragmentations leading to chimeric spectra. Here, we introduce MSFragger-DDA+, a novel database search algorithm that enhances peptide identification by detecting co-fragmented peptides with high sensitivity and speed. Utilizing MSFragger's fragment ion indexing algorithm, MSFragger-DDA+ performs a comprehensive search within the full isolation window for each MS2, followed by robust feature detection, filtering, and rescoring procedures to refine search results. Evaluation against established tools across diverse datasets demonstrated that, integrated within the FragPipe computational platform, MSFragger-DDA+ significantly increases identification sensitivity while maintaining stringent false discovery rate (FDR) control. It is also uniquely suited for wide-window acquisition (WWA) data. MSFragger-DDA+ provides an efficient and accurate solution for peptide identification, enhancing the detection of low-abundance co-fragmented peptides. Coupled with the FragPipe platform, MSFragger-DDA+ enables more comprehensive and accurate analysis of proteomics data.
Affiliation(s)
- Fengchao Yu
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Yamei Deng
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Alexey I. Nesvizhskii
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
5
Peters-Clarke TM, Coon JJ, Riley NM. Instrumentation at the Leading Edge of Proteomics. Anal Chem 2024; 96:7976-8010. [PMID: 38738990 PMCID: PMC11996003 DOI: 10.1021/acs.analchem.3c04497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2024]
Affiliation(s)
- Trenton M. Peters-Clarke
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Joshua J. Coon
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
  - Morgridge Institute for Research, Madison, WI, USA
6
Korchak JA, Jeffery ED, Bandyopadhyay S, Jordan BT, Lehe M, Watts EF, Fenix A, Wilhelm M, Sheynkman GM. IS-PRM-based peptide targeting informed by long-read sequencing for alternative proteome detection. bioRxiv 2024:2024.04.01.587549. [PMID: 38617311 PMCID: PMC11014528 DOI: 10.1101/2024.04.01.587549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Alternative splicing is a major contributor of transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. Recently, a family of advanced targeted MS strategies, termed internal standard parallel reaction monitoring (IS-PRM), have demonstrated multiplexed, sensitive detection of pre-defined peptides of interest. Such approaches have not yet been used to confirm existence of novel peptides. Here, we present a targeted proteogenomic approach that leverages sample-matched long-read RNA sequencing (LR RNAseq) data to predict potential protein isoforms with prior transcript evidence. Predicted tryptic isoform-specific peptides, which are specific to individual gene product isoforms, serve as "triggers" and "targets" in the IS-PRM method, Tomahto. Using the model human stem cell line WTC11, LR RNAseq data were generated and used to inform the generation of synthetic standards for 192 isoform-specific peptides (114 isoforms from 55 genes). These synthetic "trigger" peptides were labeled with super heavy tandem mass tags (TMT) and spiked into TMT-labeled WTC11 tryptic digest, predicted to contain corresponding endogenous "target" peptides. Compared to DDA mode, Tomahto increased detectability of isoforms by 3.6-fold, resulting in the identification of five previously unannotated isoforms. Our method detected protein isoform expression for 43 out of 55 genes corresponding to 54 resolved isoforms. This LR RNA seq-informed Tomahto targeted approach, called LRP-IS-PRM, is a new modality for generating protein-level evidence of alternative isoforms - a critical first step in designing functional studies and eventually clinical assays.
Affiliation(s)
- Jennifer A. Korchak
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Erin D. Jeffery
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Saikat Bandyopadhyay
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Ben T. Jordan
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Micah Lehe
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Emily F. Watts
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Aidan Fenix
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA, USA
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich (TUM), D-85354 Freising, Germany
- Gloria M. Sheynkman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
  - UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, VA, USA
7
Sinitcyn P, Richards AL, Weatheritt RJ, Brademan DR, Marx H, Shishkova E, Meyer JG, Hebert AS, Westphall MS, Blencowe BJ, Cox J, Coon JJ. Global detection of human variants and isoforms by deep proteome sequencing. Nat Biotechnol 2023; 41:1776-1786. [PMID: 36959352 PMCID: PMC10713452 DOI: 10.1038/s41587-023-01714-x] [Citation(s) in RCA: 94] [Impact Index Per Article: 47.0] [Received: 08/26/2022] [Accepted: 02/15/2023] [Indexed: 03/25/2023]
Abstract
An average shotgun proteomics experiment detects approximately 10,000 human proteins from a single sample. However, individual proteins are typically identified by peptide sequences representing a small fraction of their total amino acids. Hence, an average shotgun experiment fails to distinguish different protein variants and isoforms. Deeper proteome sequencing is therefore required for the global discovery of protein isoforms. Using six different human cell lines, six proteases, deep fractionation and three tandem mass spectrometry fragmentation methods, we identify a million unique peptides from 17,717 protein groups, with a median sequence coverage of approximately 80%. Direct comparison with RNA expression data provides evidence for the translation of most nonsynonymous variants. We have also hypothesized that undetected variants likely arise from mutation-induced protein instability. We further observe comparable detection rates for exon-exon junction peptides representing constitutive and alternative splicing events. Our dataset represents a resource for proteoform discovery and provides direct evidence that most frame-preserving alternatively spliced isoforms are translated.
Affiliation(s)
- Pavel Sinitcyn
  - Computational Systems Biochemistry Research Group, Max Planck Institute of Biochemistry, Martinsried, Germany
  - Morgridge Institute for Research, Madison, WI, USA
- Alicia L Richards
  - National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Robert J Weatheritt
  - EMBL Australia and Garvan Institute of Medical Research, Sydney, New South Wales, Australia
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Dain R Brademan
  - Morgridge Institute for Research, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Harald Marx
  - National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria
- Evgenia Shishkova
  - National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Jesse G Meyer
  - National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Alexander S Hebert
  - National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
- Michael S Westphall
  - National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Benjamin J Blencowe
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Jürgen Cox
  - Computational Systems Biochemistry Research Group, Max Planck Institute of Biochemistry, Martinsried, Germany
- Joshua J Coon
  - Morgridge Institute for Research, Madison, WI, USA
  - National Center for Quantitative Biology of Complex Systems, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
8
Wilburn DB, Shannon AE, Spicer V, Richards AL, Yeung D, Swaney DL, Krokhin OV, Searle BC. Deep learning from harmonized peptide libraries enables retention time prediction of diverse post translational modifications. bioRxiv 2023:2023.05.30.542978. [PMID: 37398395 PMCID: PMC10312522 DOI: 10.1101/2023.05.30.542978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
In proteomics experiments, peptide retention time (RT) is an orthogonal property to fragmentation when assessing detection confidence. Advances in deep learning enable accurate RT prediction for any peptide from sequence alone, including those yet to be experimentally observed. Here we present Chronologer, an open-source software tool for rapid and accurate peptide RT prediction. Using new approaches to harmonize and false-discovery correct across independently collected datasets, Chronologer is built on a massive database with >2.2 million peptides including 10 common post-translational modification (PTM) types. By linking knowledge learned across diverse peptide chemistries, Chronologer predicts RTs with less than two-thirds the error of other deep learning tools. We show how RT for rare PTMs, such as OGlcNAc, can be learned with high accuracy using as few as 10-100 example peptides in newly harmonized datasets. This iteratively updatable workflow enables Chronologer to comprehensively predict RTs for PTM-marked peptides across entire proteomes.
9
Puliasis SS, Lewandowska D, Hemsley PA, Zhang R. ProtView: A Versatile Tool for In Silico Protease Evaluation and Selection in a Proteomic and Proteogenomic Context. J Proteome Res 2023. [PMID: 37248202 DOI: 10.1021/acs.jproteome.3c00135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Many tools have been created to generate in silico proteome digests with different protease enzymes and provide useful information for selecting optimal digest schemes for specific needs. This can save on time and resources and generate insights on the observable proteome. However, there remains a need for a tool that evaluates digest schemes beyond protein and amino acid coverages in the proteomic domain. Here, we present ProtView, a versatile in silico protease combination digest evaluation workflow that maps in silico-digested peptides to both protein and genome references, so that the potential observable portions of the proteome, transcriptome, and genome can be identified. The proteomic identification and quantification of evidence for transcriptional, co-transcriptional, post-transcriptional, translational, and post-translational regulation can all be examined in silico with ProtView prior to an experiment. Benchmarking against biological data comparing multiple proteases shows that ProtView can correctly estimate performances among the digest schemes. ProtView provides this information in a way that is easy to interpret, allowing for digest schemes to be evaluated before carrying out an experiment, in context that can optimize both proteomic and proteogenomic experiments. ProtView is available at https://github.com/SSPuliasis/ProtView.
Affiliation(s)
- Sophia S Puliasis
  - Division of Plant Sciences, School of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, Scotland, UK
  - Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Dominika Lewandowska
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Piers A Hemsley
  - Division of Plant Sciences, School of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, Scotland, UK
  - Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
- Runxuan Zhang
  - Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
10
Abstract
Spectrum library searching is a powerful alternative to database searching for data dependent acquisition experiments, but has been historically limited to identifying previously observed peptides in libraries. Here we present Scribe, a new library search engine designed to leverage deep learning fragmentation prediction software such as Prosit. Rather than relying on highly curated DDA libraries, this approach predicts fragmentation and retention times for every peptide in a FASTA database. Scribe embeds Percolator for false discovery rate correction and an interference tolerant, label-free quantification integrator for an end-to-end proteomics workflow. By leveraging expected relative fragmentation and retention time values, we find that library searching with Scribe can outperform traditional database searching tools both in terms of sensitivity and quantitative precision. Scribe and its graphical interface are easy to use, freely accessible, and fully open source.
Affiliation(s)
- Brian C Searle
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
  - Proteome Software Inc., Portland, Oregon 97219, United States
- Ariana E Shannon
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
- Damien Beau Wilburn
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
11
Abstract
Proteins are the key biological actors within cells, driving many biological processes integral to both healthy and diseased states. Understanding the depth of complexity represented within the proteome is crucial to our scientific understanding of cellular biology and to provide disease specific insights for clinical applications. Mass spectrometry-based proteomics is the premier method for proteome analysis, with the ability to both identify and quantify proteins. Although proteomics continues to grow as a robust field of bioanalytical chemistry, advances are still necessary to enable a more comprehensive view of the proteome. In this review, we provide a broad overview of mass spectrometry-based proteomics in general, and highlight four developing areas of bottom-up proteomics: (1) protein inference, (2) alternative proteases, (3) sample-specific databases and (4) post-translational modification discovery.
Affiliation(s)
- Rachel M Miller
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
12
Danko K, Lukasheva E, Zhukov VA, Zgoda V, Frolov A. Detergent-Assisted Protein Digestion-On the Way to Avoid the Key Bottleneck of Shotgun Bottom-Up Proteomics. Int J Mol Sci 2022; 23:13903. [PMID: 36430380 PMCID: PMC9695859 DOI: 10.3390/ijms232213903] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/30/2022] [Revised: 11/02/2022] [Accepted: 11/05/2022] [Indexed: 11/16/2022]
Abstract
Gel-free bottom-up shotgun proteomics is the principal methodological platform for the state-of-the-art proteome research. This methodology assumes quantitative isolation of the total protein fraction from a complex biological sample, its limited proteolysis with site-specific proteases, analysis of the resulted peptides with nanoscaled reversed-phase high-performance liquid chromatography-(tandem) mass spectrometry (nanoRP-HPLC-MS and MS/MS), protein identification by sequence database search and peptide-based quantitative analysis. The most critical steps of this workflow are protein reconstitution and digestion; therefore, detergents and chaotropic agents are strongly mandatory to ensure complete solubilization of complex protein isolates and to achieve accessibility of all protease cleavage sites. However, detergents are incompatible with both RP separation and electrospray ionization (ESI). Therefore, to make LC-MS analysis possible, several strategies were implemented in the shotgun proteomics workflow. These techniques rely either on enzymatic digestion in centrifugal filters with subsequent evacuation of the detergent, or employment of MS-compatible surfactants, which can be degraded upon the digestion. In this review we comprehensively address all currently available strategies for the detergent-assisted proteolysis in respect of their relative efficiency when applied to different biological matrices. We critically discuss the current progress and the further perspectives of these technologies in the context of its advances and gaps.
Affiliation(s)
- Katerina Danko
  - Department of Biochemistry, St. Petersburg State University, 199034 St. Petersburg, Russia
- Elena Lukasheva
  - Department of Biochemistry, St. Petersburg State University, 199034 St. Petersburg, Russia
- Vladimir A. Zhukov
  - All-Russia Research Institute for Agricultural Microbiology, Podbelsky Chaussee 3, Pushkin, 196608 St. Petersburg, Russia
- Viktor Zgoda
  - Institute of Biomedical Chemistry, 119121 Moscow, Russia
- Andrej Frolov
  - K.A. Timiryazev Institute of Plant Physiology RAS, 127276 Moscow, Russia
13
Woessmann J, Kotol D, Hober A, Uhlén M, Edfors F. Addressing the Protease Bias in Quantitative Proteomics. J Proteome Res 2022; 21:2526-2534. [PMID: 36044728 PMCID: PMC9552229 DOI: 10.1021/acs.jproteome.2c00491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Protein quantification strategies using multiple proteases have been shown to deliver poor interprotease accuracy in label-free mass spectrometry experiments. By utilizing six different proteases with different cleavage sites, this study explores the protease bias and its effect on accuracy and precision by using recombinant protein standards. We established 557 SRM assays, using a recombinant protein standard resource, toward 10 proteins in human plasma and determined their concentration with multiple proteases. The quantified peptides of these plasma proteins spanned 3 orders of magnitude (0.02–70 μM). In total, 60 peptides were used for absolute quantification and the majority of the peptides showed high robustness. The retained reproducibility was achieved by quantifying plasma proteins using spiked stable isotope standard recombinant proteins in a targeted proteomics workflow.
Affiliation(s)
- Jakob Woessmann
  - Science for Life Laboratory, KTH Royal Institute of Technology, SE-171 65 Solna, Sweden
  - Department of Protein Science, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
- David Kotol
  - Science for Life Laboratory, KTH Royal Institute of Technology, SE-171 65 Solna, Sweden
  - Department of Protein Science, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
- Andreas Hober
  - Science for Life Laboratory, KTH Royal Institute of Technology, SE-171 65 Solna, Sweden
  - Department of Protein Science, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
- Mathias Uhlén
  - Science for Life Laboratory, KTH Royal Institute of Technology, SE-171 65 Solna, Sweden
  - Department of Protein Science, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
- Fredrik Edfors
  - Science for Life Laboratory, KTH Royal Institute of Technology, SE-171 65 Solna, Sweden
  - Department of Protein Science, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden