1
Krishnamurthy S, Gunasegaran B, Paul-Heng M, Mohamedali A, Klare WP, Pang CNI, Gluch L, Shin JS, Chan C, Baker MS, Ahn SB, Heng B. Recombinant Protein Spectral Library (rPSL) DIA-MS method improves identification and quantification of low-abundance cancer-associated and kynurenine pathway proteins. Commun Chem 2025; 8:141. [PMID: 40348885 PMCID: PMC12065878 DOI: 10.1038/s42004-025-01531-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Accepted: 04/22/2025] [Indexed: 05/14/2025]
Abstract
Data-independent acquisition mass spectrometry (DIA-MS) is a powerful tool for quantitative proteomics, but a well-constructed reference spectral library is crucial to optimize DIA analysis, particularly for low-abundance proteins. In this study, we evaluate the efficacy of a recombinant protein spectral library (rPSL), generated from tryptic digestion of 42 human recombinant proteins, in enhancing the detection and quantification of lower-abundance cancer-associated proteins. Additionally, we generated a combined sample-specific biological-rPSL by integrating the rPSL with a spectral library derived from pooled biological samples. We compared the performance of these libraries for DIA data extraction with standard methods, including a sample-specific biological spectral library and library-free DIA methods. Our specific focus was on quantifying cancer-associated proteins, including key enzymes involved in the kynurenine pathway, across patient-derived tissues and cell lines. Both rPSL and biological-rPSL-DIA approaches provided significantly improved coverage of lower-abundance proteins, with enhanced sensitivity and more consistent protein quantification across matched tumour and adjacent noncancerous tissues from breast and colorectal cancer patients and in cancer cell lines. Overall, our study demonstrates that rPSL and biological-rPSL coupled with DIA-MS workflows can address the limitations of both biological library-based and library-free DIA methods, offering a robust approach for quantifying low-abundance cancer-associated proteins in complex biological samples.
Affiliation(s)
- Shivani Krishnamurthy
  - Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Bavani Gunasegaran
  - Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Moumita Paul-Heng
  - Transplantation Immunobiology Research Group, Charles Perkins Centre, The University of Sydney, Sydney, Australia
- Abidali Mohamedali
  - Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
  - Faculty of Science and Engineering, School of Natural Sciences, Macquarie University, Sydney, Australia
- William P Klare
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, Australia
- C N Ignatius Pang
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, Australia
- Laurence Gluch
  - Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
  - The Strathfield Breast and Thyroid Centre, Strathfield, Sydney, Australia
- Joo-Shik Shin
  - Department of Tissue Pathology and Diagnostic Oncology, Royal Prince Alfred Hospital, Camperdown, Sydney, Australia
  - Central Clinical School, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Charles Chan
  - Department of Anatomical Pathology, NSW Health Pathology, Concord Hospital, Sydney, NSW, Australia
  - Concord Institute of Academic Surgery, Concord Clinical School, Faculty of Medicine and Health, Concord Hospital, The University of Sydney, Sydney, Australia
- Mark S Baker
  - Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Seong Beom Ahn
  - Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Benjamin Heng
  - Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
2
Lai X, Qi G. Using long columns to quantify over 9200 unique protein groups from brain tissue in a single injection on an Orbitrap Exploris 480 mass spectrometer. J Proteomics 2024; 308:105285. [PMID: 39159862 DOI: 10.1016/j.jprot.2024.105285] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2024] [Revised: 08/14/2024] [Accepted: 08/16/2024] [Indexed: 08/21/2024]
Abstract
The most exciting advancement in LC-MS/MS-based bottom-up proteomics has centered around enhancing mass spectrometers. Among these, the latest and most advanced mass spectrometer for bottom-up proteomics is the Orbitrap Astral, which has the highest scan rate to accelerate throughput and the highest sensitivity to handle very small amounts of peptide sample and to achieve deeper proteomics. However, its affordability remains a challenge for most laboratories. While significant strides have been made in improving mass spectrometry, efforts to advance liquid chromatography (LC) for deeper proteomics have had no comparable success since the introduction of Multidimensional Protein Identification Technology (MudPIT) in 2001. To achieve deeper proteomics with a less labor-intensive and more reproducible approach while using a more cost-effective mass spectrometer, such as the Orbitrap Exploris 480, we evaluated trap columns as long as 40 cm and analytical columns as long as 600 cm, in addition to sample loading amount, gradient time, and analytical column particle size, to enable a fractionation-free, single-injection method for deeper proteomics. The length of the trap and analytical columns is the key factor. Using a 30 cm trap column and a 250 cm analytical column with other optimized LC conditions, we quantified over 9200 unique protein groups from brain tissue in a single injection using a 24-h gradient on an Orbitrap Exploris 480 mass spectrometer.
Affiliation(s)
- Xianyin Lai
  - Biotechnology Discovery Research, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, USA
- Guihong Qi
  - Biotechnology Discovery Research, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, USA
3
Kong F, Keshet U, Shen T, Rodriguez E, Fiehn O. LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD-, UVPD-, and HCD-High Resolution Mass Spectrometers. Anal Chem 2023; 95:16810-16818. [PMID: 37939222 PMCID: PMC11492814 DOI: 10.1021/acs.analchem.3c02263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023]
Abstract
Compound annotation using spectral-matching algorithms is vital for MS/MS-based metabolomics research, but is hindered by the lack of high-quality reference MS/MS library spectra. Finding and removing errors from libraries, including noise ions, is mostly done manually. This process is both error-prone and time-consuming. To address these challenges, we have developed an automated library curation pipeline, LibGen, to universally build novel spectral libraries. This pipeline corrects mass errors, denoises spectra by subformula assignments, and performs quality control of the reference spectra by calculating explained intensity and spectral entropy. We employed LibGen to generate three high-quality libraries with chemical standards of 2241 natural products. To this end, we used an IQ-X orbital ion trap mass spectrometer to generate 1947 classic high-energy collision dissociation spectra (HCD) as well as 1093 ultraviolet-photodissociation (UVPD) mass spectra. The third library was generated by an electron-activated collision dissociation (EAD) 7600 ZenoTOF mass spectrometer yielding 3244 MS/MS spectra. The natural compounds covered 140 chemical classes from prenol lipids to benzopyrans with >97% of the compounds showing <0.2 Tanimoto-similarity, demonstrating a very high structural variance. Mass spectra showed much higher information content for both UVPD- and EAD-mass spectra compared to classic HCD spectra when using spectral entropy calculations. We validated the denoising algorithm by acquiring MS/MS spectra of chemical standards at high concentration and at 13-fold dilution. At low concentrations, a higher proportion of spectra showed apparent fragment ions that could not be explained by subformula losses of the parent molecule. When more than 10% of the total intensity of MS/MS fragments was regarded as noise ions, spectra were considered as low quality and were not included in the libraries.
As the overall process is fully automated, LibGen can be utilized by all researchers who create or curate mass spectral libraries. The libraries we created here are publicly available at MassBank.us.
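The quality-control step described in this abstract (explained intensity, spectral entropy, and the 10% noise cutoff) can be illustrated with a minimal Python sketch. This is not the LibGen implementation: the function names, the m/z tolerance, and the `explained_mzs` input (standing in for the output of the subformula-assignment step, which is not reproduced here) are assumptions made for illustration only.

```python
import math


def spectral_entropy(intensities):
    """Shannon entropy (natural log) of the intensity-normalized spectrum."""
    total = sum(intensities)
    if total <= 0:
        return 0.0
    probs = [i / total for i in intensities if i > 0]
    return -sum(p * math.log(p) for p in probs)


def explained_intensity(peaks, explained_mzs, tol=0.01):
    """Fraction of total intensity carried by peaks matching an explained m/z.

    peaks: list of (mz, intensity) tuples.
    explained_mzs: hypothetical list of m/z values that a subformula
    assignment step was able to explain (assumed input).
    tol: assumed absolute m/z tolerance in Da.
    """
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        return 0.0
    explained = sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - e) <= tol for e in explained_mzs)
    )
    return explained / total


def passes_quality(peaks, explained_mzs, max_noise_fraction=0.10, tol=0.01):
    """Reject a spectrum when unexplained (noise) intensity exceeds the cutoff."""
    noise_fraction = 1.0 - explained_intensity(peaks, explained_mzs, tol)
    return noise_fraction <= max_noise_fraction


# Toy spectrum: one dominant fragment and two minor peaks.
peaks = [(100.0, 90.0), (150.0, 5.0), (200.0, 5.0)]
print(passes_quality(peaks, explained_mzs=[100.0, 150.0]))  # 5% noise
print(passes_quality(peaks, explained_mzs=[150.0]))         # 95% noise
```

The 10% default mirrors the cutoff stated in the abstract; everything else, including how peaks are matched to explained fragments, would need to follow the published method in a real pipeline.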
Affiliation(s)
- Fanzhou Kong
  - Chemistry Department, One Shields Avenue, University of California-Davis, Davis, California 95616, United States
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Uri Keshet
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Tong Shen
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Elys Rodriguez
  - Chemistry Department, One Shields Avenue, University of California-Davis, Davis, California 95616, United States
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
- Oliver Fiehn
  - West Coast Metabolomics Center, University of California-Davis, Davis, California 95616, United States
4
Moorthy A, Kearsley A, Mallard W, Wallace W, Stein S. Inferring the Nominal Molecular Mass of an Analyte from Its Electron Ionization Mass Spectrum. Anal Chem 2023; 95:13132-13139. [PMID: 37610141 PMCID: PMC10560098 DOI: 10.1021/acs.analchem.3c01815] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 08/24/2023]
Abstract
The performance of three algorithms for predicting nominal molecular mass from an analyte's electron ionization mass spectrum is presented. The Peak Interpretation Method (PIM) attempts to quantify the likelihood that a molecular ion peak is contained in the mass spectrum, whereas the Simple Search Hitlist Method (SS-HM) and iterative Hybrid Search Hitlist Method (iHS-HM) leverage results from mass spectral library searching. These predictions can be employed in combination (recommended) or independently. The methods were tested on two sets of query mass spectra searched against libraries that did not contain the reference mass spectra of the same compounds: 19,074 spectra of various organic molecules searched against the NIST17 mass spectral library and 162 spectra of small molecule drugs searched against SWGDRUG version 3.3. Individually, each molecular mass prediction method had computed precisions (the fraction of positive predictions that were correct) of 91, 89, and 74%, respectively. The methods become more valuable when predictions are taken together. When all three predictions were identical, which occurred in 33% of the test cases, the predicted molecular mass was almost always correct (>99%).
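To make the two quantities in this abstract concrete, the sketch below computes precision as defined there (the fraction of positive predictions that were correct) and a simple consensus rule that accepts a mass only when all three methods agree. The function names and the convention that `None` means "no prediction made" are assumptions for illustration; the PIM, SS-HM, and iHS-HM algorithms themselves are not implemented here.

```python
def precision(predictions, truths):
    """Fraction of positive predictions that are correct.

    A prediction of None is treated as "no prediction" and is excluded
    (assumed convention). Returns None when no predictions were made.
    """
    made = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    if not made:
        return None
    correct = sum(1 for p, t in made if p == t)
    return correct / len(made)


def consensus(pim, ss_hm, ihs_hm):
    """Return the nominal mass only when all three methods agree, else None."""
    if pim is not None and pim == ss_hm == ihs_hm:
        return pim
    return None


# Toy example with made-up nominal masses (not data from the paper).
truths = [100, 121, 80, 90]
method_a = [100, 120, None, 90]
print(precision(method_a, truths))  # 2 correct out of 3 predictions made
print(consensus(100, 100, 100))     # all agree
print(consensus(100, 100, 101))     # disagreement
```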
Affiliation(s)
- A.S. Moorthy
  - Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- A.J. Kearsley
  - Mathematical Analysis and Modeling Group, Applied and Computational Mathematics Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- W.G. Mallard
  - Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- W.E. Wallace
  - Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- S.E. Stein
  - Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
5
Lee SY, Lee ST, Suh S, Ko BJ, Oh HB. Revealing Unknown Controlled Substances and New Psychoactive Substances Using High-Resolution LC-MS/MS Machine Learning Models and the Hybrid Similarity Search Algorithm. J Anal Toxicol 2021; 46:732-742. [PMID: 34498039 DOI: 10.1093/jat/bkab098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2021] [Revised: 08/11/2021] [Accepted: 09/08/2021] [Indexed: 11/12/2022]
Abstract
High-resolution LC-MS/MS tandem mass spectra-based machine learning models are constructed to address the analytical challenge of identifying unknown controlled substances and new psychoactive substances (NPS's). Using a training set comprising 770 LC-MS/MS barcode spectra (with binary entries 0 or 1) obtained generally by high-resolution mass spectrometers, three classification machine learning models were generated and evaluated. The three models are artificial neural network (ANN), support vector machine (SVM), and k-nearest neighbor (k-NN) models. In these models, controlled substances and NPS's were classified into 13 subgroups (benzylpiperazine, opiate, benzodiazepine, amphetamine, cocaine, methcathinone, classical cannabinoid, fentanyl, 2C series, indazole carbonyl compound, indole carbonyl compound, phencyclidine, and others). Using 193 LC-MS/MS barcode spectra as an external test set, the accuracies of the ANN, SVM, and k-NN models were evaluated as 72.5%, 90.0%, and 94.3%, respectively. Also, the hybrid similarity search (HSS) algorithm was evaluated to examine whether this algorithm can successfully identify unknown controlled substances and NPS's whose data are unavailable in the database. When only 24 representative LC-MS/MS spectra of controlled substances and NPS's were selectively included in the database, it was found that HSS can successfully identify compounds with high reliability. The machine learning models and HSS algorithms are incorporated into our home-coded AI-SNPS (artificial intelligence screener for narcotic drugs and psychotropic substances) standalone software that is equipped with a graphical user interface. The use of this software allows unknown controlled substances and NPS's to be identified in a convenient manner.
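The idea of a binary "barcode" spectrum fed to a k-NN classifier can be sketched as follows. The abstract states only that barcode entries are 0 or 1 and that k-NN was one of the models; the binning range and width, the Hamming distance, the value of k, and the toy labels below are all assumptions for illustration, not the authors' settings.

```python
from collections import Counter


def to_barcode(mzs, mz_min=50.0, mz_max=550.0, bin_width=1.0):
    """Encode fragment m/z values as a 0/1 vector: 1 if any peak falls in a bin."""
    n_bins = int((mz_max - mz_min) / bin_width)
    bits = [0] * n_bins
    for mz in mzs:
        if mz_min <= mz < mz_max:
            idx = min(int((mz - mz_min) / bin_width), n_bins - 1)
            bits[idx] = 1
    return bits


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(query, training, k=3):
    """Majority label among the k training barcodes closest to the query.

    training: list of (barcode, label) pairs.
    """
    nearest = sorted(training, key=lambda item: hamming(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 5-bin example with hypothetical class labels "A" and "B".
training = [
    ([1, 0, 1, 0, 0], "A"),
    ([1, 0, 1, 1, 0], "A"),
    ([0, 1, 0, 0, 1], "B"),
    ([0, 1, 0, 1, 1], "B"),
]
query = to_barcode([50.2, 52.9], mz_min=50.0, mz_max=55.0)
print(query)
print(knn_predict(query, training, k=3))
```

Hamming distance is a natural choice for binary vectors, but a real classifier could equally use another metric; the abstract does not say which one was used.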
Affiliation(s)
- So Yeon Lee
  - Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea
- Sang Tak Lee
  - Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea
- Sungill Suh
  - Forensic genetics & chemistry division, Supreme prosecutors' office, Seoul 06590, Republic of Korea
- Bum Jun Ko
  - Forensic genetics & chemistry division, Supreme prosecutors' office, Seoul 06590, Republic of Korea
- Han Bin Oh
  - Department of Chemistry, Sogang University, Seoul 04107, Republic of Korea
6
Guan S, Bythell BJ. Size Dependent Fragmentation Chemistry of Short Doubly Protonated Tryptic Peptides. J Am Soc Mass Spectrom 2021; 32:1020-1032. [PMID: 33779179 DOI: 10.1021/jasms.1c00009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Tandem mass spectrometry of electrospray ionized multiply charged peptide ions is commonly used to identify the sequence of peptide(s) and infer the identity of source protein(s). Doubly protonated peptide ions are consistently the most efficiently sequenced ions following collision-induced dissociation of peptides generated by tryptic digestion. While the broad characteristics of longer (N ≥ 8 residue) doubly protonated peptides have been investigated, there is comparatively little data on shorter systems where charge repulsion should exhibit the greatest influence on the dissociation chemistry. To address this gap and further understand the chemistry underlying collisional-dissociation of doubly charged tryptic peptides, two series of analytes ([GxR+2H]2+ and [AxR+2H]2+, x = 2-5) were investigated experimentally and with theory. We find distinct differences in the preference of bond cleavage sites for these peptides as a function of size and to a lesser extent composition. Density functional calculations at two levels of theory predict that the threshold relative energies required for bond cleavages at the same site for peptides of different size are quite similar (for example, b2-yN-2). In isolation, this finding is inconsistent with experiment. However, the predicted extent of entropy change of these reactions is size dependent. Subsequent RRKM rate constant calculations provide a far clearer picture of the kinetics of the competing bond cleavage reactions enabling rationalization of experimental findings. The M06-2X data were substantially more consistent with experiment than were the B3LYP data.
Affiliation(s)
- Shanshan Guan
  - Department of Chemistry and Biochemistry, Ohio University, 307 Chemistry Building, Athens, Ohio 45701, United States
  - Department of Chemistry and Biochemistry, University of Missouri-St. Louis, 1 University Boulevard, St. Louis, Missouri 63121, United States
- Benjamin J Bythell
  - Department of Chemistry and Biochemistry, Ohio University, 307 Chemistry Building, Athens, Ohio 45701, United States
  - Department of Chemistry and Biochemistry, University of Missouri-St. Louis, 1 University Boulevard, St. Louis, Missouri 63121, United States
7
8
Abstract
This manuscript outlines a straightforward procedure for generating a map of similarity between the spectra in a set. When applied to a reference set of spectra for Type I fentanyl analogs (molecules differing from fentanyl by a single modification), the map illuminates clustering that is applicable to automated structure assignment of unidentified molecules. An open-source software implementation that generates mass spectral similarity mappings of unknowns against a library of Type I fentanyl analog spectra is available at http://github.com/asm3-nist/FentanylClassifier.
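A similarity map of this kind can be sketched as a table of pairwise scores between every pair of spectra in a set. The abstract does not state which similarity measure the procedure uses, so the cosine similarity below, the representation of a spectrum as a dict of nominal m/z to intensity, and the spectrum names are assumptions for illustration and are not taken from the FentanylClassifier code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two spectra given as {nominal_mz: intensity}."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_map(spectra):
    """Pairwise similarity for every ordered pair of named spectra."""
    names = list(spectra)
    return {(p, q): cosine(spectra[p], spectra[q]) for p in names for q in names}


# Toy spectra with hypothetical names and made-up intensities.
spectra = {
    "ref_1": {105: 40.0, 146: 100.0, 189: 60.0},
    "ref_2": {105: 35.0, 146: 100.0, 203: 55.0},
    "unrelated": {77: 100.0, 91: 80.0},
}
sim = similarity_map(spectra)
for (p, q), score in sorted(sim.items()):
    print(f"{p:10s} {q:10s} {score:.3f}")
```

Rows of such a map that score highly against the same group of references are what a clustering step would pick up; how the published procedure draws those clusters is not specified in the abstract.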