1
Arab I, Egghe K, Laukens K, Chen K, Barakat K, Bittremieux W. Benchmarking of Small Molecule Feature Representations for hERG, Nav1.5, and Cav1.2 Cardiotoxicity Prediction. J Chem Inf Model 2024; 64:2515-2527. [PMID: 37870574 DOI: 10.1021/acs.jcim.3c01301] [Indexed: 10/24/2023]
Abstract
In the field of drug discovery, there is a substantial challenge in seeking out chemical structures that possess desirable pharmacological, toxicological, and pharmacokinetic properties. Complications arise when drugs interfere with the functioning of cardiac ion channels, leading to serious cardiovascular consequences. The discontinuation and removal of numerous approved drugs from the market or at late development stages in the pipeline due to such inhibitory effects further highlight the urgency of addressing this issue. Consequently, the early prediction of potential blockers targeting cardiac ion channels during the drug discovery process is of paramount importance. This study introduces a deep learning framework that computationally determines the cardiotoxicity associated with the voltage-gated potassium channel (hERG), the voltage-gated calcium channel (Cav1.2), and the voltage-gated sodium channel (Nav1.5) for drug candidates. The predictive capabilities of three feature representations (molecular fingerprints, descriptors, and graph-based numerical representations) are rigorously benchmarked. Additionally, a novel training and evaluation data set framework is presented, enabling predictive model training of drug off-target cardiotoxicity using a comprehensive and large curated data set covering these three cardiac ion channels. To facilitate these predictions, a robust and comprehensive small molecule cardiotoxicity prediction tool named CToxPred has been developed. It is made available as open source under the permissive MIT license at https://github.com/issararab/CToxPred.
Affiliation(s)
- Issar Arab
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (Biomina), 2020 Antwerp, Belgium
- Kristof Egghe
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Kris Laukens
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (Biomina), 2020 Antwerp, Belgium
- Ke Chen
- Chair for Theoretical Chemistry, Catalysis Research Center, Technische Universität München, Lichtenbergstraße 4, D-85747 Garching, Germany
- Khaled Barakat
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, Alberta 8613, Canada
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (Biomina), 2020 Antwerp, Belgium
2
Bittremieux W. From data to discovery: The essential role of computational tools in proteomics. Proteomics 2024; 24:e2300081. [PMID: 38629976 DOI: 10.1002/pmic.202300081] [Received: 02/19/2024] [Accepted: 02/20/2024] [Indexed: 04/19/2024]
Affiliation(s)
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, Antwerpen, Belgium
3
Adams C, Laukens K, Bittremieux W, Boonen K. Machine learning-based peptide-spectrum match rescoring opens up the immunopeptidome. Proteomics 2024; 24:e2300336. [PMID: 38009585 DOI: 10.1002/pmic.202300336] [Received: 09/06/2023] [Revised: 10/18/2023] [Accepted: 10/23/2023] [Indexed: 11/29/2023]
Abstract
Immunopeptidomics is a key technology in the discovery of targets for immunotherapy and vaccine development. However, identifying immunopeptides remains challenging due to their non-tryptic nature, which results in distinct spectral characteristics. Moreover, the absence of strict digestion rules leads to extensive search spaces, further amplified by the incorporation of somatic mutations, pathogen genomes, unannotated open reading frames, and post-translational modifications. This inflation in search space leads to an increase in random high-scoring matches, resulting in fewer identifications at a given false discovery rate. Peptide-spectrum match rescoring has emerged as a machine learning-based solution to address challenges in mass spectrometry-based immunopeptidomics data analysis. It involves post-processing unfiltered spectrum annotations to better distinguish between correct and incorrect peptide-spectrum matches. Recently, features based on predicted peptidoform properties, including fragment ion intensities, retention time, and collisional cross section, have been used to improve the accuracy and sensitivity of immunopeptide identification. In this review, we describe the diverse bioinformatics pipelines that are currently available for peptide-spectrum match rescoring and discuss how they can be used for the analysis of immunopeptidomics data. Finally, we provide insights into current and future machine learning solutions to boost immunopeptide identification.
Affiliation(s)
- Charlotte Adams
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Laboratory of Protein Science, Proteomics and Epigenetic Signaling (PPES), Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Kris Laukens
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Wout Bittremieux
- Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Kurt Boonen
- Laboratory of Protein Science, Proteomics and Epigenetic Signaling (PPES), Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- ImmuneSpec BV, Niel, Belgium
4
Mohanty I, Mannochio-Russo H, Schweer JV, El Abiead Y, Bittremieux W, Xing S, Schmid R, Zuffa S, Vasquez F, Muti VB, Zemlin J, Tovar-Herrera OE, Moraïs S, Desai D, Amin S, Koo I, Turck CW, Mizrahi I, Kris-Etherton PM, Petersen KS, Fleming JA, Huan T, Patterson AD, Siegel D, Hagey LR, Wang M, Aron AT, Dorrestein PC. The underappreciated diversity of bile acid modifications. Cell 2024; 187:1801-1818.e20. [PMID: 38471500 DOI: 10.1016/j.cell.2024.02.019] [Received: 04/26/2023] [Revised: 11/30/2023] [Accepted: 02/15/2024] [Indexed: 03/14/2024]
Abstract
The repertoire of modifications to bile acids and related steroidal lipids by host and microbial metabolism remains incompletely characterized. To address this knowledge gap, we created a reusable resource of tandem mass spectrometry (MS/MS) spectra by filtering 1.2 billion publicly available MS/MS spectra for bile-acid-selective ion patterns. Thousands of modifications are distributed throughout animal and human bodies as well as microbial cultures. We employed this MS/MS library to identify polyamine bile amidates, prevalent in carnivores. They are present in humans, and their levels alter with a diet change from a Mediterranean to a typical American diet. This work highlights the existence of many more bile acid modifications than previously recognized and the value of leveraging public large-scale untargeted metabolomics data to discover metabolites. The availability of a modification-centric bile acid MS/MS library will inform future studies investigating bile acid roles in health and disease.
Affiliation(s)
- Ipsita Mohanty
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Helena Mannochio-Russo
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Joshua V Schweer
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Department of Chemistry and Biochemistry, University of California, San Diego, San Diego, CA, USA
- Yasin El Abiead
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerpen, Belgium
- Shipei Xing
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver Campus, Vancouver, BC, Canada
- Robin Schmid
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Simone Zuffa
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Felipe Vasquez
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Valentina B Muti
- Department of Computer Science and Engineering, University of California, Riverside, Riverside, CA, USA; Department of Chemistry and Biochemistry, University of Denver, Denver, CO 80210, USA
- Jasmine Zemlin
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA 92093, USA
- Omar E Tovar-Herrera
- Department of Life Sciences, Ben-Gurion University of the Negev, Be'er Sheva, Israel; Goldman Sonnenfeldt School of Sustainability and Climate Change, Ben-Gurion University of the Negev, Be'er Sheva 84105, Israel
- Sarah Moraïs
- Department of Life Sciences, Ben-Gurion University of the Negev, Be'er Sheva, Israel; Goldman Sonnenfeldt School of Sustainability and Climate Change, Ben-Gurion University of the Negev, Be'er Sheva 84105, Israel
- Dhimant Desai
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA, USA
- Shantu Amin
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA, USA
- Imhoi Koo
- Center for Molecular Toxicology and Carcinogenesis, Department of Veterinary and Biomedical Sciences, Pennsylvania State University, University Park, PA, USA
- Christoph W Turck
- Max Planck Institute of Psychiatry, Proteomics and Biomarkers, Kraepelinstrasse 2-10, Munich 80804, Germany; Key Laboratory of Animal Models and Human Disease Mechanisms of Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
- Itzhak Mizrahi
- Department of Life Sciences, Ben-Gurion University of the Negev, Be'er Sheva, Israel; Goldman Sonnenfeldt School of Sustainability and Climate Change, Ben-Gurion University of the Negev, Be'er Sheva 84105, Israel
- Penny M Kris-Etherton
- Department of Nutritional Sciences, The Pennsylvania State University, University Park, PA, USA
- Kristina S Petersen
- Department of Nutritional Sciences, The Pennsylvania State University, University Park, PA, USA
- Jennifer A Fleming
- Department of Nutritional Sciences, The Pennsylvania State University, University Park, PA, USA
- Tao Huan
- Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver Campus, Vancouver, BC, Canada
- Andrew D Patterson
- Center for Molecular Toxicology and Carcinogenesis, Department of Veterinary and Biomedical Sciences, Pennsylvania State University, University Park, PA, USA
- Dionicio Siegel
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Lee R Hagey
- Department of Medicine, University of California, San Diego, San Diego, CA, USA
- Mingxun Wang
- Department of Computer Science and Engineering, University of California, Riverside, Riverside, CA, USA
- Allegra T Aron
- Department of Chemistry and Biochemistry, University of Denver, Denver, CO 80210, USA
- Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA; Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA 92093, USA.
5
Martin MR, Bittremieux W, Hassoun S. Molecular structure discovery for untargeted metabolomics using biotransformation rules and global molecular networking. bioRxiv 2024:2024.02.04.578795. [PMID: 38370723 PMCID: PMC10871291 DOI: 10.1101/2024.02.04.578795] [Indexed: 02/20/2024]
Abstract
Although untargeted mass spectrometry-based metabolomics is crucial for understanding life's molecular underpinnings, its effectiveness is hampered by low annotation rates of the generated tandem mass spectra. To address this issue, we introduce a novel data-driven approach, Biotransformation-based Annotation Method (BAM), that leverages molecular structural similarities inherent in biochemical reactions. BAM operates by applying biotransformation rules to known 'anchor' molecules, which exhibit high spectral similarity to unknown spectra, thereby hypothesizing and ranking potential structures for the corresponding 'suspect' molecule. BAM's effectiveness is demonstrated by its success in annotating suspect spectra in a global molecular network comprising hundreds of millions of spectra. BAM was able to assign correct molecular structures to 24.2 % of examined anchor-suspect cases, thereby demonstrating remarkable advancement in metabolite annotation.
6
Bittremieux W, Avalon NE, Thomas SP, Kakhkhorov SA, Aksenov AA, Gomes PWP, Aceves CM, Caraballo-Rodríguez AM, Gauglitz JM, Gerwick WH, Huan T, Jarmusch AK, Kaddurah-Daouk RF, Kang KB, Kim HW, Kondić T, Mannochio-Russo H, Meehan MJ, Melnik AV, Nothias LF, O'Donovan C, Panitchpakdi M, Petras D, Schmid R, Schymanski EL, van der Hooft JJJ, Weldon KC, Yang H, Xing S, Zemlin J, Wang M, Dorrestein PC. Open access repository-scale propagated nearest neighbor suspect spectral library for untargeted metabolomics. Nat Commun 2023; 14:8488. [PMID: 38123557 PMCID: PMC10733301 DOI: 10.1038/s41467-023-44035-y] [Received: 09/05/2023] [Accepted: 11/28/2023] [Indexed: 12/23/2023]
Abstract
Despite the increasing availability of tandem mass spectrometry (MS/MS) community spectral libraries for untargeted metabolomics over the past decade, the majority of acquired MS/MS spectra remain uninterpreted. To further aid in interpreting unannotated spectra, we created a nearest neighbor suspect spectral library, consisting of 87,916 annotated MS/MS spectra derived from hundreds of millions of MS/MS spectra originating from published untargeted metabolomics experiments. Entries in this library, or "suspects," were derived from unannotated spectra that could be linked in a molecular network to an annotated spectrum. Annotations were propagated to unknowns based on structural relationships to reference molecules using MS/MS-based spectrum alignment. We demonstrate the broad relevance of the nearest neighbor suspect spectral library through representative examples of propagation-based annotation of acylcarnitines, bacterial and plant natural products, and drug metabolism. Our results also highlight how the library can help to better understand an Alzheimer's brain phenotype. The nearest neighbor suspect spectral library is openly available for download or for data analysis through the GNPS platform to help investigators hypothesize candidate structures for unknown MS/MS spectra in untargeted metabolomics data.
Affiliation(s)
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020, Antwerpen, Belgium.
- Nicole E Avalon
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, 92093, USA
- Sydney P Thomas
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Sarvar A Kakhkhorov
- Laboratory of Physical and Chemical Methods of Research, Center for Advanced Technologies, Tashkent, 100174, Uzbekistan
- Department of Food Science, Faculty of Science, University of Copenhagen, Rolighedsvej 26, 1958, Frederiksberg C, Denmark
- Alexander A Aksenov
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Chemistry, University of Connecticut, Storrs, CT, 06269, USA
- Arome Science inc., Farmington, CT, 06032, USA
- Paulo Wender P Gomes
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Christine M Aceves
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Andrés Mauricio Caraballo-Rodríguez
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Julia M Gauglitz
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- William H Gerwick
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Tao Huan
- Department of Chemistry, University of British Columbia, Vancouver, BC, V6T 1Z1, Canada
- Alan K Jarmusch
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Immunity, Inflammation, and Disease Laboratory, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, 27709, USA
- Rima F Kaddurah-Daouk
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, 27701, USA
- Department of Medicine, Duke University, Durham, NC, 27710, USA
- Duke Institute of Brain Sciences, Duke University, Durham, NC, 27710, USA
- Kyo Bin Kang
- College of Pharmacy and Research Institute of Pharmaceutical Sciences, Sookmyung Women's University, Seoul, 04310, Korea
- Hyun Woo Kim
- College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University, Goyang, 10326, Korea
- Todor Kondić
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4367, Belvaux, Luxembourg
- Helena Mannochio-Russo
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Biochemistry and Organic Chemistry, Institute of Chemistry, São Paulo State University, Araraquara, 14800-901, Brazil
- Michael J Meehan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Alexey V Melnik
- Department of Chemistry, University of Connecticut, Storrs, CT, 06269, USA
- Arome Science inc., Farmington, CT, 06032, USA
- Louis-Felix Nothias
- Université Côte d'Azur, CNRS, ICN, Nice, France
- Interdisciplinary Institute for Artificial Intelligence (3iA) Côte d'Azur, Nice, France
- Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Morgan Panitchpakdi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Daniel Petras
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, 72076, Tuebingen, Germany
- Department of Biochemistry, University of California Riverside, Riverside, CA, 92507, USA
- Robin Schmid
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4367, Belvaux, Luxembourg
- Justin J J van der Hooft
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
- Kelly C Weldon
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Heejung Yang
- Laboratory of Natural Products Chemistry, College of Pharmacy, Kangwon National University, Chuncheon, 24341, Korea
- Shipei Xing
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Chemistry, University of British Columbia, Vancouver, BC, V6T 1Z1, Canada
- Jasmine Zemlin
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Mingxun Wang
- Department of Computer Science and Engineering, University of California Riverside, Riverside, CA, 92507, USA
- Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA.
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA.
7
Gauglitz JM, West KA, Bittremieux W, Williams CL, Weldon KC, Panitchpakdi M, Di Ottavio F, Aceves CM, Brown E, Sikora NC, Jarmusch AK, Martino C, Tripathi A, Meehan MJ, Dorrestein K, Shaffer JP, Coras R, Vargas F, Goldasich LD, Schwartz T, Bryant M, Humphrey G, Johnson AJ, Spengler K, Belda-Ferre P, Diaz E, McDonald D, Zhu Q, Elijah EO, Wang M, Marotz C, Sprecher KE, Vargas-Robles D, Withrow D, Ackermann G, Herrera L, Bradford BJ, Marques LMM, Amaral JG, Silva RM, Veras FP, Cunha TM, Oliveira RDR, Louzada-Junior P, Mills RH, Piotrowski PK, Servetas SL, Da Silva SM, Jones CM, Lin NJ, Lippa KA, Jackson SA, Daouk RK, Galasko D, Dulai PS, Kalashnikova TI, Wittenberg C, Terkeltaub R, Doty MM, Kim JH, Rhee KE, Beauchamp-Walters J, Wright KP, Dominguez-Bello MG, Manary M, Oliveira MF, Boland BS, Lopes NP, Guma M, Swafford AD, Dutton RJ, Knight R, Dorrestein PC. Author Correction: Enhancing untargeted metabolomics using metadata-based source annotation. Nat Biotechnol 2023; 41:1656. [PMID: 37853256 DOI: 10.1038/s41587-023-02025-x] [Indexed: 10/20/2023]
Affiliation(s)
- Julia M Gauglitz
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Kiana A West
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Wout Bittremieux
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Candace L Williams
- Beckman Center for Conservation Research, San Diego Zoo Wildlife Alliance, Escondido, CA, USA
- Kelly C Weldon
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Morgan Panitchpakdi
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Francesca Di Ottavio
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Christine M Aceves
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Elizabeth Brown
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Nicole C Sikora
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Alan K Jarmusch
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Cameron Martino
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- Anupriya Tripathi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Michael J Meehan
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Kathleen Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Justin P Shaffer
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Roxana Coras
- Division of Rheumatology, Allergy & Immunology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Fernando Vargas
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Tara Schwartz
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- MacKenzie Bryant
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Gregory Humphrey
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Abigail J Johnson
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Katharina Spengler
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Pedro Belda-Ferre
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Edgar Diaz
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Daniel McDonald
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Qiyun Zhu
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Emmanuel O Elijah
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Mingxun Wang
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Clarisse Marotz
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Kate E Sprecher
- Department of Integrative Physiology, University of Colorado Boulder, Boulder, CO, USA
- Department of Population Health Sciences, University of Wisconsin-Madison, Madison, WI, USA
| | - Daniela Vargas-Robles
- Servicio Autónomo Centro Amazónico de Investigación y Control de Enfermedades Tropicales Simón Bolívar, Puerto Ayacucho, Amazonas, Venezuela
| | - Dana Withrow
- Department of Integrative Physiology, University of Colorado Boulder, Boulder, CO, USA
| | - Gail Ackermann
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Lourdes Herrera
- Department of Pediatrics, Billings Clinic, Billings, MT, USA
| | - Barry J Bradford
- Department of Animal Science, Michigan State University, East Lansing, MI, USA
| | - Lucas Maciel Mauriz Marques
- Department of Pharmacology, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Juliano Geraldo Amaral
- Multidisciplinary Health Institute, Federal University of Bahia, Vitória da Conquista, Bahia, Brazil
| | - Rodrigo Moreira Silva
- NPPNS, Department of Biomolecular Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Flavio Protasio Veras
- Department of Pharmacology, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Thiago Mattar Cunha
- Department of Pharmacology, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Rene Donizeti Ribeiro Oliveira
- Department of Internal Medicine, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Paulo Louzada-Junior
- Department of Internal Medicine, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Robert H Mills
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Pharmacology, University of California San Diego, La Jolla, CA, USA
| | - Paulina K Piotrowski
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Stephanie L Servetas
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Sandra M Da Silva
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Christina M Jones
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Nancy J Lin
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Katrice A Lippa
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Scott A Jackson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Rima Kaddurah Daouk
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
- Department of Medicine, Duke University, Durham, NC, USA
- Duke Institute of Brain Sciences, Duke University, Durham, NC, USA
| | - Douglas Galasko
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
| | - Parambir S Dulai
- Division of Gastroenterology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | | | - Curt Wittenberg
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA
| | - Robert Terkeltaub
- Division of Rheumatology, Allergy & Immunology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- San Diego VA Healthcare System, San Diego, CA, USA
| | - Megan M Doty
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Division of Neonatology, Department of Pediatrics, Kapi'olani Medical Center for Women and Children, John A. Burns School of Medicine, Honolulu, Hawaii, USA
| | - Jae H Kim
- Division of Neonatology, Perinatal Institute, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
| | - Kyung E Rhee
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Julia Beauchamp-Walters
- Division of Pediatric Hospital Medicine, Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
| | - Kenneth P Wright
- Department of Integrative Physiology, University of Colorado Boulder, Boulder, CO, USA
| | - Maria Gloria Dominguez-Bello
- Department of Biochemistry and Microbiology, School of Environmental and Biological Sciences; Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
| | - Mark Manary
- Department of Pediatrics, Washington University, St. Louis, MO, USA
| | - Michelli F Oliveira
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Brigid S Boland
- Division of Gastroenterology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Norberto Peporine Lopes
- NPPNS, Department of Biomolecular Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Monica Guma
- Division of Rheumatology, Allergy & Immunology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Austin D Swafford
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
| | - Rachel J Dutton
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Rob Knight
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Pharmacology, University of California San Diego, La Jolla, CA, USA
| |
|
8
|
Kang J, Xu W, Bittremieux W, Moshiri N, Rosing T. Accelerating Open Modification Spectral Library Searching on Tensor Core in High-dimensional Space. Bioinformatics 2023:btad404. [PMID: 37369033 DOI: 10.1093/bioinformatics/btad404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 05/09/2023] [Accepted: 06/26/2023] [Indexed: 06/29/2023]
Abstract
MOTIVATION: Driven by technological advances, the throughput and cost of mass spectrometry proteomics experiments have improved by orders of magnitude in recent decades. Spectral library searching is a common approach to annotating experimental mass spectra by matching them against large libraries of reference spectra corresponding to known peptides. An important disadvantage, however, is that only peptides included in the spectral library can be found, whereas novel peptides, such as those with unexpected post-translational modifications, will remain unknown. Open modification searching is an increasingly popular approach to annotate modified peptides based on partial matches against their unmodified counterparts. Unfortunately, this leads to very large search spaces and excessive runtimes, which is especially problematic considering the continuously increasing sizes of mass spectrometry proteomics datasets.
RESULTS: We propose an open modification searching algorithm, called HOMS-TC, that fully exploits parallelism in the entire pipeline of spectral library searching. We designed a new highly parallel encoding method based on the principle of hyperdimensional computing to encode mass spectral data to hypervectors while minimizing information loss. This process can be easily parallelized since each dimension is calculated independently. HOMS-TC processes two stages of existing cascade search in parallel and selects the most similar spectra while considering post-translational modifications. We accelerate HOMS-TC on NVIDIA's Tensor Core Units, which are emerging and readily available in recent graphics processing units (GPUs). Our evaluation shows that HOMS-TC is 31× faster on average than alternative search engines and provides accuracy comparable to that of competing search tools.
AVAILABILITY: HOMS-TC is freely available under the Apache 2.0 license as an open-source software project at https://github.com/tycheyoung/homs-tc.
Affiliation(s)
- Jaeyoung Kang
- Department of Electrical and Computer Engineering, University of California San Diego, CA 92093, USA
| | - Weihong Xu
- Department of Computer Science and Engineering, University of California San Diego, CA 92093, USA
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, Antwerpen, 2020, Belgium
| | - Niema Moshiri
- Department of Computer Science and Engineering, University of California San Diego, CA 92093, USA
| | - Tajana Rosing
- Department of Computer Science and Engineering, University of California San Diego, CA 92093, USA
| |
|
9
|
Xu W, Kang J, Bittremieux W, Moshiri N, Rosing T. HyperSpec: Ultrafast Mass Spectra Clustering in Hyperdimensional Space. J Proteome Res 2023. [PMID: 37166120 DOI: 10.1021/acs.jproteome.2c00612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
As current shotgun proteomics experiments can produce gigabytes of mass spectrometry data per hour, processing these massive data volumes has become progressively more challenging. Spectral clustering is an effective approach to speed up downstream data processing by merging highly similar spectra to minimize data redundancy. However, because state-of-the-art spectral clustering tools fail to achieve optimal runtimes, this simply moves the processing bottleneck. In this work, we present a fast spectral clustering tool, HyperSpec, based on hyperdimensional computing (HDC). HDC shows promising clustering capability while only requiring lightweight binary operations with high parallelism that can be optimized using low-level hardware architectures, making it possible to run HyperSpec on graphics processing units to achieve extremely efficient spectral clustering performance. Additionally, HyperSpec includes optimized data preprocessing modules to reduce the spectrum preprocessing time, which is a critical bottleneck during spectral clustering. Based on experiments using various mass spectrometry data sets, HyperSpec produces results with clustering quality comparable to that of state-of-the-art spectral clustering tools while achieving speedups by orders of magnitude, shortening the clustering runtime of over 21 million spectra from 4 h to only 24 min.
Affiliation(s)
- Weihong Xu
- Department of Computer Science Engineering, University of California, San Diego, La Jolla, California 92093, United States
| | - Jaeyoung Kang
- Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, California 92093, United States
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerpen, Belgium
| | - Niema Moshiri
- Department of Computer Science Engineering, University of California, San Diego, La Jolla, California 92093, United States
| | - Tajana Rosing
- Department of Computer Science Engineering, University of California, San Diego, La Jolla, California 92093, United States
| |
|
10
|
Bittremieux W, Levitsky L, Pilz M, Sachsenberg T, Huber F, Wang M, Dorrestein PC. Unified and Standardized Mass Spectrometry Data Processing in Python Using spectrum_utils. J Proteome Res 2023; 22:625-631. [PMID: 36688502 DOI: 10.1021/acs.jproteome.2c00632] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 01/24/2023]
Abstract
spectrum_utils is a Python package for mass spectrometry data processing and visualization. Since its introduction, spectrum_utils has grown into a fundamental software solution that powers various applications in proteomics and metabolomics, ranging from spectrum preprocessing prior to spectrum identification and machine learning applications to spectrum plotting from online data repositories and assisting data analysis tasks for dozens of other projects. Here, we present updates to spectrum_utils, which include new functionality to integrate mass spectrometry community data standards, enhanced mass spectral data processing, and unified mass spectral data visualization in Python. spectrum_utils is freely available as open source at https://github.com/bittremieux/spectrum_utils.
Affiliation(s)
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| | - Lev Levitsky
- V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, 119334 Moscow, Russia
| | - Matteo Pilz
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
| | - Timo Sachsenberg
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
| | - Florian Huber
- Centre for Digitalisation and Digitality, University of Applied Sciences Düsseldorf, 40476 Düsseldorf, Germany
| | - Mingxun Wang
- Department of Computer Science, University of California, Riverside, Riverside, California 92507, United States
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, California 92093, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
| |
|
11
|
Deutsch EW, Vizcaíno JA, Jones AR, Binz PA, Lam H, Klein J, Bittremieux W, Perez-Riverol Y, Tabb DL, Walzer M, Ricard-Blum S, Hermjakob H, Neumann S, Mak TD, Kawano S, Mendoza L, Van Den Bossche T, Gabriels R, Bandeira N, Carver J, Pullman B, Sun Z, Hoffmann N, Shofstahl J, Zhu Y, Licata L, Quaglia F, Tosatto SCE, Orchard SE. Proteomics Standards Initiative at Twenty Years: Current Activities and Future Work. J Proteome Res 2023; 22:287-301. [PMID: 36626722 PMCID: PMC9903322 DOI: 10.1021/acs.jproteome.2c00637] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/07/2022] [Indexed: 01/11/2023]
Abstract
The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has been successfully developing guidelines, data formats, and controlled vocabularies (CVs) for the proteomics community and other fields supported by mass spectrometry since its inception 20 years ago. Here we describe the general operation of the PSI, including its leadership, working groups, yearly workshops, and the document process by which proposals are thoroughly and publicly reviewed in order to be ratified as PSI standards. We briefly describe the current state of the many existing PSI standards, some of which remain the same as when originally developed, some of which have undergone subsequent revisions, and some of which have become obsolete. Then the set of proposals currently being developed is described, with an open call to the community for participation in the forging of the next generation of standards. Finally, we describe some synergies and collaborations with other organizations and look to the future in how the PSI will continue to promote the open sharing of data and thus accelerate the progress of the field of proteomics.
Affiliation(s)
- Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Andrew R. Jones
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Pierre-Alain Binz
- Clinical Chemistry Service, Lausanne University Hospital, 1011 976 Lausanne, Switzerland
| | - Henry Lam
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, P. R. China
| | - Joshua Klein
- Program for Bioinformatics, Boston University, Boston, Massachusetts 02215, United States
| | - Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
- Department of Computer Science, University of Antwerp, 2020 Antwerpen, Belgium
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - David L. Tabb
- SA MRC Centre for TB Research, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7602, South Africa
| | - Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Sylvie Ricard-Blum
- Univ. Lyon, Université Lyon 1, ICBMS, UMR 5246, 69622 Villeurbanne, France
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Steffen Neumann
- Bioinformatics and Scientific Data, Leibniz Institute of Plant Biochemistry, 06120 Halle, Germany
- German Centre for Integrative Biodiversity Research (iDiv), 04103 Halle-Jena-Leipzig, Germany
| | - Tytus D. Mak
- Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
| | - Shin Kawano
- Database Center for Life Science, Joint Support Center for Data Science Research, Research Organization of Information and Systems, Chiba 277-0871, Japan
- Faculty of Contemporary Society, Toyama University of International Studies, Toyama 930-1292, Japan
- School of Frontier Engineering, Kitasato University, Sagamihara 252-0373, Japan
| | - Luis Mendoza
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9052 Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9052 Ghent, Belgium
| | - Nuno Bandeira
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
- Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
| | - Jeremy Carver
- Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
| | - Benjamin Pullman
- Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
| | - Zhi Sun
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Nils Hoffmann
- Institute for Bio- and Geosciences (IBG-5), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
| | - Jim Shofstahl
- Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
| | - Yunping Zhu
- National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, #38, Life Science Park, Changping District, Beijing 102206, China
| | - Luana Licata
- Fondazione Human Technopole, 20157 Milan, Italy
- Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
| | - Federica Quaglia
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), 70126 Bari, Italy
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
| | | | - Sandra E. Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| |
|
12
|
Arab I, Fondrie WE, Laukens K, Bittremieux W. Semisupervised Machine Learning for Sensitive Open Modification Spectral Library Searching. J Proteome Res 2023; 22:585-593. [PMID: 36688569 DOI: 10.1021/acs.jproteome.2c00616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
A key analysis task in mass spectrometry proteomics is matching the acquired tandem mass spectra to their originating peptides by sequence database searching or spectral library searching. Machine learning is an increasingly popular postprocessing approach to maximize the number of confident spectrum identifications that can be obtained at a given false discovery rate threshold. Here, we have integrated semisupervised machine learning in the ANN-SoLo tool, an efficient spectral library search engine that is optimized for open modification searching to identify peptides with any type of post-translational modification. We show that machine learning rescoring boosts the number of spectra that can be identified for both standard searching and open searching, and we provide insights into relevant spectrum characteristics harnessed by the machine learning model. The semisupervised machine learning functionality has now been fully integrated into ANN-SoLo, which is available as open source under the permissive Apache 2.0 license on GitHub at https://github.com/bittremieux/ANN-SoLo.
Affiliation(s)
- Issar Arab
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| | | | - Kris Laukens
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
| |
|
13
|
Gauglitz JM, West KA, Bittremieux W, Williams CL, Weldon KC, Panitchpakdi M, Di Ottavio F, Aceves CM, Brown E, Sikora NC, Jarmusch AK, Martino C, Tripathi A, Meehan MJ, Dorrestein K, Shaffer JP, Coras R, Vargas F, Goldasich LD, Schwartz T, Bryant M, Humphrey G, Johnson AJ, Spengler K, Belda-Ferre P, Diaz E, McDonald D, Zhu Q, Elijah EO, Wang M, Marotz C, Sprecher KE, Vargas-Robles D, Withrow D, Ackermann G, Herrera L, Bradford BJ, Marques LMM, Amaral JG, Silva RM, Veras FP, Cunha TM, Oliveira RDR, Louzada-Junior P, Mills RH, Piotrowski PK, Servetas SL, Da Silva SM, Jones CM, Lin NJ, Lippa KA, Jackson SA, Daouk RK, Galasko D, Dulai PS, Kalashnikova TI, Wittenberg C, Terkeltaub R, Doty MM, Kim JH, Rhee KE, Beauchamp-Walters J, Wright KP, Dominguez-Bello MG, Manary M, Oliveira MF, Boland BS, Lopes NP, Guma M, Swafford AD, Dutton RJ, Knight R, Dorrestein PC. Enhancing untargeted metabolomics using metadata-based source annotation. Nat Biotechnol 2022; 40:1774-1779. [PMID: 35798960 PMCID: PMC10277029 DOI: 10.1038/s41587-022-01368-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/24/2021] [Accepted: 05/20/2022] [Indexed: 01/30/2023]
Abstract
Human untargeted metabolomics studies annotate only ~10% of molecular features. We introduce reference-data-driven analysis to match metabolomics tandem mass spectrometry (MS/MS) data against metadata-annotated source data as a pseudo-MS/MS reference library. Applying this approach to food source data, we show that it increases MS/MS spectral usage 5.1-fold over conventional structural MS/MS library matches and allows empirical assessment of dietary patterns from untargeted data.
Affiliation(s)
- Julia M Gauglitz
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Kiana A West
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Wout Bittremieux
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Candace L Williams
- Beckman Center for Conservation Research, San Diego Zoo Wildlife Alliance, Escondido, CA, USA
| | - Kelly C Weldon
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
| | - Morgan Panitchpakdi
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Francesca Di Ottavio
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
| | - Christine M Aceves
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Elizabeth Brown
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Nicole C Sikora
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Alan K Jarmusch
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Cameron Martino
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
| | - Anupriya Tripathi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Michael J Meehan
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Kathleen Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Justin P Shaffer
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Roxana Coras
- Division of Rheumatology, Allergy & Immunology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Fernando Vargas
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | | | - Tara Schwartz
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - MacKenzie Bryant
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Gregory Humphrey
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Abigail J Johnson
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Katharina Spengler
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
| | - Pedro Belda-Ferre
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Edgar Diaz
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Daniel McDonald
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Qiyun Zhu
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Emmanuel O Elijah
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Mingxun Wang
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Clarisse Marotz
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Kate E Sprecher
- Department of Integrative Physiology, University of Colorado Boulder, Boulder, CO, USA
- Department of Population Health Sciences, University of Wisconsin-Madison, Madison, WI, USA
| | - Daniela Vargas-Robles
- Servicio Autónomo Centro Amazónico de Investigación y Control de Enfermedades Tropicales Simón Bolívar, Puerto Ayacucho, Amazonas, Venezuela
| | - Dana Withrow
- Department of Integrative Physiology, University of Colorado Boulder, Boulder, CO, USA
| | - Gail Ackermann
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Lourdes Herrera
- Department of Pediatrics, Billings Clinic, Billings, MT, USA
| | - Barry J Bradford
- Department of Animal Science, Michigan State University, East Lansing, MI, USA
| | - Lucas Maciel Mauriz Marques
- Department of Pharmacology, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Juliano Geraldo Amaral
- Multidisciplinary Health Institute, Federal University of Bahia, Vitória da Conquista, Bahia, Brazil
| | - Rodrigo Moreira Silva
- NPPNS, Department of Biomolecular Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Flavio Protasio Veras
- Department of Pharmacology, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Thiago Mattar Cunha
- Department of Pharmacology, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Rene Donizeti Ribeiro Oliveira
- Department of Internal Medicine, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Paulo Louzada-Junior
- Department of Internal Medicine, Ribeirão Preto Medical School, Center of Research in Inflammatory Diseases, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Robert H Mills
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Pharmacology, University of California San Diego, La Jolla, CA, USA
| | - Paulina K Piotrowski
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Stephanie L Servetas
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Sandra M Da Silva
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Christina M Jones
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Nancy J Lin
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Katrice A Lippa
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Scott A Jackson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Rima Kaddurah Daouk
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA
- Department of Medicine, Duke University, Durham, NC, USA
- Duke Institute of Brain Sciences, Duke University, Durham, NC, USA
| | - Douglas Galasko
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
| | - Parambir S Dulai
- Division of Gastroenterology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | | | - Curt Wittenberg
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, USA
| | - Robert Terkeltaub
- Division of Rheumatology, Allergy & Immunology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- San Diego VA Healthcare System, San Diego, CA, USA
| | - Megan M Doty
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Division of Neonatology, Department of Pediatrics, Kapi'olani Medical Center for Women and Children, John A. Burns School of Medicine, Honolulu, Hawaii, USA
| | - Jae H Kim
- Division of Neonatology, Perinatal Institute, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
| | - Kyung E Rhee
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Julia Beauchamp-Walters
- Division of Pediatric Hospital Medicine, Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
| | - Kenneth P Wright
- Department of Integrative Physiology, University of Colorado Boulder, Boulder, CO, USA
| | - Maria Gloria Dominguez-Bello
- Department of Biochemistry and Microbiology, School of Environmental and Biological Sciences; Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
| | - Mark Manary
- Department of Pediatrics, Washington University, St. Louis, MO, USA
| | - Michelli F Oliveira
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Brigid S Boland
- Division of Gastroenterology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Norberto Peporine Lopes
- NPPNS, Department of Biomolecular Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, São Paulo, Brazil
| | - Monica Guma
- Division of Rheumatology, Allergy & Immunology, Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Austin D Swafford
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
| | - Rachel J Dutton
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Rob Knight
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, Joan and Irwin Jacobs School of Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, CA, USA
- Department of Pharmacology, University of California San Diego, La Jolla, CA, USA
|
14
|
Bittremieux W, Wang M, Dorrestein PC. The critical role that spectral libraries play in capturing the metabolomics community knowledge. Metabolomics 2022; 18:94. [PMID: 36409434 DOI: 10.1007/s11306-022-01947-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/01/2022] [Accepted: 10/19/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND Spectral library searching is currently the most common approach for compound annotation in untargeted metabolomics. Spectral libraries applicable to liquid chromatography mass spectrometry have grown in size over the past decade to include hundreds of thousands to millions of mass spectra and tens of thousands of compounds, forming an essential knowledge base for the interpretation of metabolomics experiments. AIM OF REVIEW We describe existing spectral library resources, highlight different strategies for compiling spectral libraries, and discuss quality considerations that should be taken into account when interpreting spectral library searching results. Finally, we describe how spectral libraries are empowering the next generation of machine learning tools in computational metabolomics, and discuss several opportunities for using increasingly accessible large spectral libraries. KEY SCIENTIFIC CONCEPTS OF REVIEW This review focuses on the current state of spectral libraries for untargeted LC-MS/MS based metabolomics. We show how the number of entries in publicly accessible spectral libraries has increased more than 60-fold in the past eight years to aid molecular interpretation and we discuss how the role of spectral libraries in untargeted metabolomics will evolve in the near future.
Affiliation(s)
- Wout Bittremieux
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
| | - Mingxun Wang
- Department of Computer Science, University of California Riverside, Riverside, CA, 92507, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
|
15
|
Adams C, Boonen K, Laukens K, Bittremieux W. Open Modification Searching of SARS-CoV-2-Human Protein Interaction Data Reveals Novel Viral Modification Sites. Mol Cell Proteomics 2022; 21:100425. [PMID: 36241021 PMCID: PMC9554009 DOI: 10.1016/j.mcpro.2022.100425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Revised: 09/18/2022] [Accepted: 10/09/2022] [Indexed: 01/18/2023] [Open Access]
Abstract
The outbreak of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019, has led to an ongoing global pandemic since 2019. Mass spectrometry can be used to understand the molecular mechanisms of viral infection by SARS-CoV-2, for example, by determining virus-host protein-protein interactions through which SARS-CoV-2 hijacks its human hosts during infection, and to study the role of post-translational modifications. We have reanalyzed public affinity purification-mass spectrometry data using open modification searching to investigate the presence of post-translational modifications in the context of the SARS-CoV-2 virus-host protein-protein interaction network. Based on an over twofold increase in identified spectra, our detected protein interactions show a high overlap with independent mass spectrometry-based SARS-CoV-2 studies and virus-host interactions for alternative viruses, as well as previously unknown protein interactions. In addition, we identified several novel modification sites on SARS-CoV-2 proteins that we investigated in relation to their interactions with host proteins. A detailed analysis of relevant modifications, including phosphorylation, ubiquitination, and S-nitrosylation, provides important hypotheses about the functional role of these modifications during viral infection by SARS-CoV-2.
Affiliation(s)
- Charlotte Adams
- Department of Computer Science, University of Antwerp, Antwerp, Belgium; Centre for Proteomics (CFP), University of Antwerp, Antwerp, Belgium
| | - Kurt Boonen
- Centre for Proteomics (CFP), University of Antwerp, Antwerp, Belgium; Sustainable Health Department, Flemish Institute for Technological Research (VITO), Antwerp, Belgium
| | - Kris Laukens
- Department of Computer Science, University of Antwerp, Antwerp, Belgium
| | - Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA (for correspondence: Wout Bittremieux)
|
16
|
Bittremieux W, Schmid R, Huber F, van der Hooft JJJ, Wang M, Dorrestein PC. Comparison of Cosine, Modified Cosine, and Neutral Loss Based Spectrum Alignment For Discovery of Structurally Related Molecules. J Am Soc Mass Spectrom 2022; 33:1733-1744. [PMID: 35960544 DOI: 10.1021/jasms.2c00153] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 06/15/2023]
Abstract
Spectrum alignment of tandem mass spectrometry (MS/MS) data using the modified cosine similarity and subsequent visualization as molecular networks have been demonstrated to be a useful strategy to discover analogs of molecules from untargeted MS/MS-based metabolomics experiments. Recently, a neutral loss matching approach has been introduced as an alternative to MS/MS-based molecular networking with an implied performance advantage in finding analogs that cannot be discovered using existing MS/MS spectrum alignment strategies. To comprehensively evaluate the scoring properties of neutral loss matching, the cosine similarity, and the modified cosine similarity, similarity measures of 955 228 peptide MS/MS spectrum pairs and 10 million small molecule MS/MS spectrum pairs were compared. This comparative analysis revealed that the modified cosine similarity outperformed neutral loss matching and the cosine similarity in all cases. The data further indicated that the performance of MS/MS spectrum alignment depends on the location and type of the modification, as well as the chemical compound class of fragmented molecules.
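To make the three similarity measures named in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes spectra are lists of `(m/z, intensity)` pairs, uses a simple greedy one-to-one peak assignment (an assumption; other assignment strategies exist), and all function names and the `tol` default are hypothetical.

```python
import math


def _matched_score(peaks_a, peaks_b, tol, shift):
    """Greedy one-to-one peak matching followed by a normalised dot product.

    A pair of peaks may match directly (same m/z within tol) or, when
    shift is non-zero, after offsetting by the precursor mass difference.
    """
    candidates = []
    for i, (mz_a, int_a) in enumerate(peaks_a):
        for j, (mz_b, int_b) in enumerate(peaks_b):
            diff = mz_b - mz_a
            if abs(diff) <= tol or abs(diff - shift) <= tol:
                candidates.append((int_a * int_b, i, j))
    # Assign the highest-scoring pairs first; each peak is used at most once.
    candidates.sort(reverse=True)
    used_a, used_b, total = set(), set(), 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            total += product
    norm_a = math.sqrt(sum(x * x for _, x in peaks_a))
    norm_b = math.sqrt(sum(x * x for _, x in peaks_b))
    return total / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine(peaks_a, peaks_b, tol=0.02):
    """Cosine similarity: only fragments with the same m/z can match."""
    return _matched_score(peaks_a, peaks_b, tol, 0.0)


def modified_cosine(peaks_a, prec_a, peaks_b, prec_b, tol=0.02):
    """Modified cosine: fragments may also match when offset by the
    precursor m/z difference between the two spectra."""
    return _matched_score(peaks_a, peaks_b, tol, prec_b - prec_a)


def neutral_loss(peaks_a, prec_a, peaks_b, prec_b, tol=0.02):
    """Neutral loss matching: compare precursor-minus-fragment differences."""
    losses_a = [(prec_a - mz, inten) for mz, inten in peaks_a]
    losses_b = [(prec_b - mz, inten) for mz, inten in peaks_b]
    return _matched_score(losses_a, losses_b, tol, 0.0)


# Toy analog pair: spectrum b carries a +14 modification on two fragments.
a = [(100.0, 1.0), (150.0, 0.5), (200.0, 1.0)]
b = [(100.0, 1.0), (164.0, 0.5), (214.0, 1.0)]
print(cosine(a, b))
print(modified_cosine(a, 250.0, b, 264.0))
print(neutral_loss(a, 250.0, b, 264.0))
```

In this toy pair the plain cosine only credits the unmodified fragment, whereas the modified cosine can also pair the two shifted fragments, which illustrates why the measures can rank analogs differently.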
Affiliation(s)
- Wout Bittremieux
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
| | - Robin Schmid
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
| | - Florian Huber
- Centre for Digitalization and Digitality, University of Applied Sciences, 40476 Düsseldorf, Germany
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University, 6708PB Wageningen, The Netherlands
- Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
| | - Mingxun Wang
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
|
17
|
Reher R, Aron AT, Fajtová P, Stincone P, Wagner B, Pérez-Lorente AI, Liu C, Shalom IYB, Bittremieux W, Wang M, Jeong K, Matos-Hernandez ML, Alexander KL, Caro-Diaz EJ, Naman CB, Scanlan JHW, Hochban PMM, Diederich WE, Molina-Santiago C, Romero D, Selim KA, Sass P, Brötz-Oesterhelt H, Hughes CC, Dorrestein PC, O'Donoghue AJ, Gerwick WH, Petras D. Native metabolomics identifies the rivulariapeptolide family of protease inhibitors. Nat Commun 2022; 13:4619. [PMID: 35941113 PMCID: PMC9358669 DOI: 10.1038/s41467-022-32016-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 07/12/2022] [Indexed: 11/15/2022] [Open Access]
Abstract
The identity and biological activity of most metabolites still remain unknown. A bottleneck in the exploration of metabolite structures and pharmaceutical activities is the compound purification needed for bioactivity assignments and downstream structure elucidation. To enable bioactivity-focused compound identification from complex mixtures, we develop a scalable native metabolomics approach that integrates non-targeted liquid chromatography tandem mass spectrometry and detection of protein binding via native mass spectrometry. A native metabolomics screen for protease inhibitors from an environmental cyanobacteria community reveals 30 chymotrypsin-binding cyclodepsipeptides. Guided by the native metabolomics results, we select and purify five of these compounds for full structure elucidation via tandem mass spectrometry, chemical derivatization, and nuclear magnetic resonance spectroscopy as well as evaluation of their biological activities. These results identify rivulariapeptolides as a family of serine protease inhibitors with nanomolar potency, highlighting native metabolomics as a promising approach for drug discovery, chemical ecology, and chemical biology studies. Bioactivity-guided isolation of specialized metabolites is an iterative process. Here, the authors demonstrate a native metabolomics approach that allows for fast screening of complex metabolite extracts against a protein of interest and simultaneous structure annotation.
Affiliation(s)
- Raphael Reher
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA; Institute of Pharmacy, Martin-Luther-University Halle-Wittenberg, Halle, Germany; Institute of Pharmaceutical Biology and Biotechnology, University of Marburg, Marburg, Germany
| | - Allegra T Aron
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Pavla Fajtová
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Paolo Stincone
- Cluster of Excellence "Controlling Microbes to Fight Infections" (CMFI), University of Tuebingen, Tuebingen, Germany
| | - Berenike Wagner
- Cluster of Excellence "Controlling Microbes to Fight Infections" (CMFI), University of Tuebingen, Tuebingen, Germany; Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Tuebingen, Germany
| | - Alicia I Pérez-Lorente
- Instituto de Hortofruticultura Subtropical y Mediterránea "La Mayora," Consejo Superior de Investigaciones Científicas, Departamento de Microbiología, Universidad de Málaga, Málaga, Spain
| | - Chenxi Liu
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Ido Y Ben Shalom
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Mingxun Wang
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Kyowon Jeong
- Applied Bioinformatics, Computer Science Department, University of Tuebingen, Tuebingen, Germany
| | - Marie L Matos-Hernandez
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Puerto Rico - Medical Sciences Campus, San Juan, Puerto Rico
| | - Kelsey L Alexander
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA; Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, CA, USA
| | - Eduardo J Caro-Diaz
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Puerto Rico - Medical Sciences Campus, San Juan, Puerto Rico
| | - C Benjamin Naman
- Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Department of Marine Pharmacy, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, China
| | - J H William Scanlan
- Department of Pharmaceutical Chemistry and Center for Tumor Biology and Immunology (ZTI), University of Marburg, Marburg, Germany
| | - Phil M M Hochban
- Department of Pharmaceutical Chemistry and Center for Tumor Biology and Immunology (ZTI), University of Marburg, Marburg, Germany
| | - Wibke E Diederich
- Department of Pharmaceutical Chemistry and Center for Tumor Biology and Immunology (ZTI), University of Marburg, Marburg, Germany
| | - Carlos Molina-Santiago
- Instituto de Hortofruticultura Subtropical y Mediterránea "La Mayora," Consejo Superior de Investigaciones Científicas, Departamento de Microbiología, Universidad de Málaga, Málaga, Spain
| | - Diego Romero
- Instituto de Hortofruticultura Subtropical y Mediterránea "La Mayora," Consejo Superior de Investigaciones Científicas, Departamento de Microbiología, Universidad de Málaga, Málaga, Spain
| | - Khaled A Selim
- Cluster of Excellence "Controlling Microbes to Fight Infections" (CMFI), University of Tuebingen, Tuebingen, Germany; Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Tuebingen, Germany
| | - Peter Sass
- Cluster of Excellence "Controlling Microbes to Fight Infections" (CMFI), University of Tuebingen, Tuebingen, Germany; Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Tuebingen, Germany
| | - Heike Brötz-Oesterhelt
- Cluster of Excellence "Controlling Microbes to Fight Infections" (CMFI), University of Tuebingen, Tuebingen, Germany; Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Tuebingen, Germany; German Center for Infection Research, Partner Site Tuebingen, Tuebingen, Germany
| | - Chambers C Hughes
- Cluster of Excellence "Controlling Microbes to Fight Infections" (CMFI), University of Tuebingen, Tuebingen, Germany; Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Tuebingen, Germany; German Center for Infection Research, Partner Site Tuebingen, Tuebingen, Germany
| | - Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Anthony J O'Donoghue
- Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - William H Gerwick
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA
| | - Daniel Petras
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Science, University of California San Diego, La Jolla, CA, USA; Cluster of Excellence "Controlling Microbes to Fight Infections" (CMFI), University of Tuebingen, Tuebingen, Germany; Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Tuebingen, Germany
|
18
|
Bittremieux W, May DH, Bilmes J, Noble WS. A learned embedding for efficient joint analysis of millions of mass spectra. Nat Methods 2022; 19:675-678. [DOI: 10.1038/s41592-022-01496-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/29/2018] [Accepted: 04/14/2022] [Indexed: 11/09/2022]
|
19
|
Luo X, Bittremieux W, Griss J, Deutsch EW, Sachsenberg T, Levitsky LI, Ivanov MV, Bubis JA, Gabriels R, Webel H, Sanchez A, Bai M, Käll L, Perez-Riverol Y. A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics. J Proteome Res 2022; 21:1566-1574. [PMID: 35549218 PMCID: PMC9171829 DOI: 10.1021/acs.jproteome.2c00069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found to be the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTM) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark.
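As an illustration of what a binning-based consensus spectrum can look like, here is a minimal sketch. It is not the code from the linked repository: the function name, the bin width, the minimum-fraction filter, and the choice to average intensities over all cluster members are assumptions made for this example.

```python
import math
from collections import defaultdict


def bin_consensus(spectra, bin_width=0.05, min_fraction=0.5):
    """Build a consensus spectrum from a cluster of similar spectra.

    spectra: list of spectra, each a list of (m/z, intensity) pairs.
    Peaks are grouped into fixed-width m/z bins. A bin is kept only if
    peaks from at least min_fraction of the spectra fall into it.
    """
    n_spectra = len(spectra)
    bins = defaultdict(list)  # bin index -> [(m/z, intensity, spectrum index)]
    for spec_idx, peaks in enumerate(spectra):
        for mz, intensity in peaks:
            bins[math.floor(mz / bin_width)].append((mz, intensity, spec_idx))

    consensus = []
    for members in bins.values():
        n_contributing = len({spec_idx for _, _, spec_idx in members})
        if n_contributing < min_fraction * n_spectra:
            continue  # peak is not reproducible across the cluster
        mean_mz = sum(mz for mz, _, _ in members) / len(members)
        # Average over all spectra so that missing peaks count as zero.
        mean_intensity = sum(i for _, i, _ in members) / n_spectra
        consensus.append((mean_mz, mean_intensity))
    return sorted(consensus)


cluster = [
    [(100.01, 10.0), (200.02, 5.0)],
    [(100.02, 12.0), (200.03, 7.0), (300.0, 1.0)],
    [(100.015, 14.0), (200.025, 6.0)],
]
for mz, intensity in bin_consensus(cluster):
    print(f"{mz:.3f}\t{intensity:.1f}")
```

The peak near m/z 300 appears in only one of the three spectra, so the minimum-fraction filter drops it from the consensus. Fixed-width bins can split a peak that straddles a bin edge, which is one reason a real implementation may choose its binning more carefully.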
Affiliation(s)
- Xiyang Luo
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, 400065 Chongqing, China
| | - Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
| | - Johannes Griss
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, U.K.; Department of Dermatology, Medical University of Vienna, 1090 Vienna, Austria
| | - Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
| | - Timo Sachsenberg
- Applied Bioinformatics, Department for Computer Science, University of Tuebingen, Sand 14, 72076 Tuebingen, Germany
| | - Lev I Levitsky
- V.L. Talrose Institute for Energy Problems of Chemical Physics, N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow 142432, Russia
| | - Mark V Ivanov
- V.L. Talrose Institute for Energy Problems of Chemical Physics, N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow 142432, Russia
| | - Julia A Bubis
- V.L. Talrose Institute for Energy Problems of Chemical Physics, N.N. Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Moscow 142432, Russia
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, B-9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, B-9000 Ghent, Belgium
| | - Henry Webel
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen DK-2200, Denmark
| | - Aniel Sanchez
- Section for Clinical Chemistry, Department of Translational Medicine, Lund University, Skåne University Hospital Malmö, 20502 Malmö, Sweden
| | - Mingze Bai
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, 400065 Chongqing, China
| | - Lukas Käll
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Royal Institute of Technology - KTH, Box 1031, 17121 Solna, Sweden
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, U.K.
|
20
|
LeDuc RD, Deutsch EW, Binz PA, Fellers RT, Cesnik AJ, Klein JA, Van Den Bossche T, Gabriels R, Yalavarthi A, Perez-Riverol Y, Carver J, Bittremieux W, Kawano S, Pullman B, Bandeira N, Kelleher NL, Thomas PM, Vizcaíno JA. Proteomics Standards Initiative's ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms. J Proteome Res 2022; 21:1189-1195. [PMID: 35290070 PMCID: PMC7612572 DOI: 10.1021/acs.jproteome.1c00771] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI), in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP), has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical), and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.
Affiliation(s)
- Richard D LeDuc
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Pierre-Alain Binz
- Clinical Chemistry Service, Lausanne University Hospital, 1011 Lausanne, Switzerland
| | - Ryan T Fellers
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Anthony J Cesnik
- Department of Genetics, Stanford University, Stanford, California 94305, United States.,Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, California 94158, United States.,SciLifeLab, School of Engineering Sciences in Chemistry Biotechnology and Health, KTH-Royal Institute of Technology, SE-171 21 Solna, Stockholm, Sweden 113 51
| | - Joshua A Klein
- Program for Bioinformatics, Boston University, Boston, Massachusetts 02215, United States
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark 75-FSVM II, 9052 Ghent, Belgium.,Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark 75-FSVM II, 9052 Ghent, Belgium.,Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| | - Arshika Yalavarthi
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
| | | | | | - Shin Kawano
- Toyama University of International Studies, Toyama, 930-1292 Toyama, Higashikuromaki, 6 5-1, Japan.,Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Chiba 277-0871, Japan
| | | | | | - Neil L Kelleher
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Paul M Thomas
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
| |
Collapse
|
21
|
Bittremieux W, Advani RS, Jarmusch AK, Aguirre S, Lu A, Dorrestein PC, Tsunoda SM. Physicochemical properties determining drug detection in skin. Clin Transl Sci 2021; 15:761-770. [PMID: 34793633 PMCID: PMC8932847 DOI: 10.1111/cts.13198] [Received: 08/18/2021] [Revised: 10/20/2021] [Accepted: 11/09/2021]
Abstract
Chemicals, including some systemically administered xenobiotics and their biotransformations, can be detected noninvasively using skin swabs and untargeted metabolomics analysis. We sought to understand the principal drivers that determine whether a drug taken orally or systemically is likely to be observed on the epidermis by using a random forest classifier to predict which drugs would be detected on the skin. A variety of molecular descriptors describing calculated properties of drugs, such as measures of volume, electronegativity, bond energy, and electrotopology, were used to train the classifier. The mean area under the receiver operating characteristic curve was 0.71 for predicting drug detection on the epidermis, and the SHapley Additive exPlanations (SHAP) model interpretation technique was used to determine the most relevant molecular descriptors. Based on the analysis of 2561 US Food and Drug Administration (FDA)‐approved drugs, we predict that therapeutic drug classes, such as nervous system drugs, are more likely to be detected on the skin. Detecting drugs and other chemicals noninvasively on the skin using untargeted metabolomics could be a useful clinical advancement in therapeutic drug monitoring, adherence, and health status.
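To make the reported metric concrete, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are made up for illustration and are not data from the study, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = detected on skin, 0 = not detected (invented values).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.4, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a mean of 0.71 indicates a moderately informative classifier. The pairwise loop is quadratic in the number of examples; a sort-based version would be preferable for large data sets.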
Affiliation(s)
- Wout Bittremieux
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California, USA
  - Department of Computer Science, University of Antwerp, Antwerpen, Belgium
- Rohit S Advani
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California, USA
  - Department of Chemistry, University of Massachusetts-Amherst, Amherst, Massachusetts, USA
- Alan K Jarmusch
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California, USA
  - Immunity, Inflammation, and Disease Laboratory, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina, USA
- Shaden Aguirre
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California, USA
- Aileen Lu
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California, USA
- Pieter C Dorrestein
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California, USA
- Shirley M Tsunoda
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
22
Abstract
The volume of proteomics and mass spectrometry data available in public repositories continues to grow at a rapid pace as more researchers embrace open science practices. Open access to the data behind scientific discoveries has become critical to validate published findings and develop new computational tools. Here, we present ppx, a Python package that provides easy, programmatic access to the data stored in ProteomeXchange repositories, such as PRIDE and MassIVE. The ppx package can be used as either a command line tool or a Python package to retrieve the files and metadata associated with a project when provided its identifier. To demonstrate how ppx enhances reproducible research, we used ppx within a Snakemake workflow to reanalyze a published data set with the open modification search tool ANN-SoLo and compared our reanalysis to the original results. We show that ppx readily integrates into workflows, and our reanalysis produced results consistent with the original analysis. We envision that ppx will be a valuable tool for creating reproducible analyses, providing tool developers easy access to data for development, testing, and benchmarking, and enabling the use of mass spectrometry data in data-intensive analyses. The ppx package is freely available and open source under the MIT license at https://github.com/wfondrie/ppx.
Affiliation(s)
- William E Fondrie
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Wout Bittremieux
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
  - Department of Computer Science, University of Antwerp, Antwerp, Belgium
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
23
Affiliation(s)
- Samantha L Wilson
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Gregory P Way
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Wout Bittremieux
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
  - Department of Computer Science, University of Antwerp, Antwerpen, Belgium
- Jean-Paul Armache
  - Department of Biochemistry & Molecular Biology, The Huck Institutes of Life Sciences, Pennsylvania State University, University Park, PA, USA
- Michael M Hoffman
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Medical Biophysics, Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
24
Moris P, De Pauw J, Postovskaya A, Gielis S, De Neuter N, Bittremieux W, Ogunjimi B, Laukens K, Meysman P. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification. Brief Bioinform 2021; 22:bbaa318. [PMID: 33346826 PMCID: PMC8294552 DOI: 10.1093/bib/bbaa318]
Abstract
The prediction of epitope recognition by T-cell receptors (TCRs) has seen many advancements in recent years, with several methods now available that can predict recognition for a specific set of epitopes. However, the generic case of evaluating all possible TCR-epitope pairs remains challenging, mainly due to the high diversity of the interacting sequences and the limited amount of currently available training data. In this work, we provide an overview of the current state of this unsolved problem. First, we examine appropriate validation strategies to accurately assess the generalization performance of generic TCR-epitope recognition models when applied to both seen and unseen epitopes. In addition, we present a novel feature representation approach, which we call ImRex (interaction map recognition). This approach is based on the pairwise combination of physicochemical properties of the individual amino acids in the CDR3 and epitope sequences, which provides a convolutional neural network with the combined representation of both sequences. Lastly, we highlight various challenges that are specific to TCR-epitope data and that can adversely affect model performance. These include the issue of selecting negative data, the imbalanced epitope distribution of curated TCR-epitope datasets and the potential exchangeability of TCR alpha and beta chains. Our results indicate that while extrapolation to unseen epitopes remains a difficult challenge, ImRex makes this feasible for a subset of epitopes that are not too dissimilar from the training data. We show that appropriate feature engineering methods and rigorous benchmark standards are required to create and validate TCR-epitope predictive models.
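The interaction-map idea can be sketched as follows: for one physicochemical property, every CDR3 residue is combined with every epitope residue, giving a two-dimensional array that a convolutional network could take as one input channel. The abstract does not state which properties or which combination operator ImRex uses, so the Kyte-Doolittle hydropathy scale, the absolute-difference operator, the function name, and the example sequences below are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy index per amino acid (one possible property).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def interaction_map(cdr3, epitope, scale=KYTE_DOOLITTLE):
    """Return a len(cdr3) x len(epitope) matrix of pairwise property differences."""
    return [[abs(scale[a] - scale[b]) for b in epitope] for a in cdr3]


# Example sequences chosen only to show the shape of the output.
channel = interaction_map("CASSIRSSYEQYF", "GILGFVFTL")
print(len(channel), "x", len(channel[0]))
```

Stacking several such matrices, one per property, would give a multi-channel image-like input; sequences of different lengths would additionally need padding to a fixed size before being batched.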
MESH Headings
- Animals
- Complementarity Determining Regions/genetics
- Complementarity Determining Regions/immunology
- Epitopes, T-Lymphocyte/genetics
- Epitopes, T-Lymphocyte/immunology
- Humans
- Macaca mulatta
- Mice
- Models, Genetic
- Models, Immunological
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Receptors, Antigen, T-Cell, alpha-beta/immunology
Affiliation(s)
- Pieter Meysman
  - Corresponding author: Pieter Meysman, Adrem Data Lab, Department of Computer Science, University of Antwerp, Antwerp, 2020, Belgium
25
Deutsch EW, Perez-Riverol Y, Carver J, Kawano S, Mendoza L, Van Den Bossche T, Gabriels R, Binz PA, Pullman B, Sun Z, Shofstahl J, Bittremieux W, Mak TD, Klein J, Zhu Y, Lam H, Vizcaíno JA, Bandeira N. Universal Spectrum Identifier for mass spectra. Nat Methods 2021; 18:768-770. [PMID: 34183830 PMCID: PMC8405201 DOI: 10.1038/s41592-021-01184-6] [Received: 12/11/2020] [Accepted: 05/10/2021]
Abstract
Mass spectra provide the ultimate evidence to support the findings of mass spectrometry proteomics studies in publications, and it is therefore crucial to be able to trace the conclusions back to the spectra. The Universal Spectrum Identifier (USI) provides a standardized mechanism for encoding a virtual path to any mass spectrum contained in datasets deposited to public proteomics repositories. USI enables greater transparency of spectral evidence, with more than 1 billion USI identifications from over 3 billion spectra already available through ProteomeXchange repositories.
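The "virtual path" can be pictured as a colon-delimited string that names the data set, the run file, and the spectrum within it, optionally followed by a peptide interpretation. The sketch below splits such a string into those parts. The field names, the set of accepted index types, and the assumption that the run name contains no colon are simplifications made for this example; the full specification should be consulted for edge cases.

```python
from typing import NamedTuple, Optional


class USI(NamedTuple):
    collection: str            # data set identifier, e.g. a ProteomeXchange accession
    ms_run: str                # run (file) name without extension
    index_type: str            # how the spectrum is addressed within the run
    index: str                 # kept as text because native IDs need not be numeric
    interpretation: Optional[str]


def parse_usi(usi):
    """Split a USI string into its parts (simplified sketch)."""
    # Split at most five times so colons inside the interpretation survive.
    parts = usi.split(":", 5)
    if len(parts) < 5 or parts[0] != "mzspec":
        raise ValueError("expected mzspec:collection:run:indexType:index[:interpretation]")
    if parts[3] not in {"scan", "index", "nativeId"}:
        raise ValueError(f"unsupported index type: {parts[3]}")
    interpretation = parts[5] if len(parts) == 6 else None
    return USI(parts[1], parts[2], parts[3], parts[4], interpretation)


example = "mzspec:PXD000561:Adult_Frontalcortex_bRP_Elite_85_f09:scan:17555:VLHPLEGAVVIIFK/2"
print(parse_usi(example))
```

Because the identifier carries everything needed to locate the spectrum, it can be pasted into a manuscript or passed to a repository resolver without shipping the raw file alongside it.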
Affiliation(s)
- Eric W. Deutsch
  - Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
  - Address correspondence to: Phone: 206-732-1200, Fax: 206-732-1299; Phone: 858-534-8666
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Jeremy Carver
  - Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 92093-0404, USA
- Shin Kawano
  - Toyama University of International Studies, 930-1292 Toyama, Japan
- Luis Mendoza
  - Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
- Tim Van Den Bossche
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Ralf Gabriels
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Pierre-Alain Binz
  - Clinical Chemistry Service, Lausanne University Hospital, 1011 Lausanne, Switzerland
- Benjamin Pullman
  - Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 92093-0404, USA
- Zhi Sun
  - Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
- Jim Shofstahl
  - Thermo Fisher Scientific, 355 River Oaks Parkway San Jose, CA 95134, USA
- Wout Bittremieux
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Tytus D. Mak
  - Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, USA
- Joshua Klein
  - Program for Bioinformatics, Boston University, Boston, MA 02215, USA
- Yunping Zhu
  - National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, #38, Life Science Park, Changping District, Beijing 102206, China
- Henry Lam
  - Department of Chemical and Biological Engineering, the Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Nuno Bandeira
  - Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 92093-0404, USA
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Address correspondence to: Phone: 206-732-1200, Fax: 206-732-1299; Phone: 858-534-8666
26
Bittremieux W, Laukens K, Noble WS, Dorrestein PC. Large-scale tandem mass spectrum clustering using fast nearest neighbor searching. Rapid Commun Mass Spectrom 2021:e9153. [PMID: 34169593 PMCID: PMC8709870 DOI: 10.1002/rcm.9153] [Received: 02/05/2021] [Revised: 06/21/2021] [Accepted: 06/21/2021]
Abstract
RATIONALE Advanced algorithmic solutions are necessary to process the ever-increasing amounts of mass spectrometry data that are being generated. In this study, we describe the falcon spectrum clustering tool for efficient clustering of millions of MS/MS spectra.
METHODS falcon succeeds in efficiently clustering large amounts of mass spectral data using advanced techniques for fast spectrum similarity searching. First, high-resolution spectra are binned and converted to low-dimensional vectors using feature hashing. Next, the spectrum vectors are used to construct nearest neighbor indexes for fast similarity searching. The nearest neighbor indexes are used to efficiently compute a sparse pairwise distance matrix without having to exhaustively perform all pairwise spectrum comparisons within the relevant precursor mass tolerance. Finally, density-based clustering is performed to group similar spectra into clusters.
RESULTS Several state-of-the-art spectrum clustering tools were evaluated using a large draft human proteome data set consisting of 25 million spectra, indicating that alternative tools produce clustering results with different characteristics. Notably, falcon generates larger highly pure clusters than alternative tools, leading to a larger reduction in data volume without the loss of relevant information for more efficient downstream processing.
CONCLUSIONS falcon is a highly efficient spectrum clustering tool, which is publicly available as open source under the permissive BSD license at https://github.com/bittremieux/falcon.
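The first METHODS step, binning followed by feature hashing, can be sketched in a few lines. Peaks are assigned to narrow m/z bins, and each bin index is hashed into one of a fixed number of slots, so that a very sparse high-resolution spectrum becomes a short dense vector. The bin width, vector length, and the use of `zlib.crc32` as the hash function are assumptions chosen for this example and are not taken from falcon itself; the nearest neighbor index and the density-based clustering steps are not shown.

```python
import math
import zlib


def hash_spectrum(mz, intensity, bin_width=0.05, dim=800):
    """Bin peaks by m/z, hash bin indexes into `dim` slots, and L2-normalize."""
    vector = [0.0] * dim
    for m, i in zip(mz, intensity):
        bin_index = int(m / bin_width)
        # Deterministic hash of the bin index; colliding bins share a slot.
        slot = zlib.crc32(str(bin_index).encode("ascii")) % dim
        vector[slot] += i
    norm = math.sqrt(sum(v * v for v in vector))
    if norm > 0:
        vector = [v / norm for v in vector]
    return vector


def cosine_distance(u, v):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(a * b for a, b in zip(u, v))


spectrum = hash_spectrum([100.02, 250.11, 500.30], [10.0, 50.0, 20.0])
print(len(spectrum), round(cosine_distance(spectrum, spectrum), 6))
```

Hash collisions merge unrelated bins and therefore distort similarities slightly, which is the price paid for a fixed, small dimensionality that nearest neighbor indexes can handle efficiently.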
Affiliation(s)
- Wout Bittremieux
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, United States
  - Department of Computer Science, University of Antwerp, Antwerp, Belgium
- Kris Laukens
  - Department of Computer Science, University of Antwerp, Antwerp, Belgium
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States
- Pieter C Dorrestein
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, United States
27
Liu Y, De Vijlder T, Bittremieux W, Laukens K, Heyndrickx W. Current and future deep learning algorithms for tandem mass spectrometry (MS/MS)-based small molecule structure elucidation. Rapid Commun Mass Spectrom 2021:e9120. [PMID: 33955607 DOI: 10.1002/rcm.9120] [Received: 01/13/2021] [Revised: 04/13/2021] [Accepted: 04/29/2021]
Abstract
RATIONALE Structure elucidation of small molecules has been one of the cornerstone applications of mass spectrometry for decades. Despite the increasing availability of software tools, structure elucidation from tandem mass spectrometry (MS/MS) data remains a challenging task, leaving many spectra unidentified. However, as an increasing number of reference MS/MS spectra are being curated at a repository scale and shared on public servers, there is an exciting opportunity to develop powerful new deep learning (DL) models for automated structure elucidation.
ARCHITECTURES Recent early-stage DL frameworks mostly follow a "two-step approach" that translates MS/MS spectra to database structures after first predicting molecular descriptors. The related architectures could suffer from: (1) computational complexity because of the separate training of descriptor-specific classifiers, (2) the high-dimensional nature of mass spectral data and information loss due to data preprocessing, (3) low substructure coverage and the class imbalance problem of predefined molecular fingerprints. Inspired by successful DL frameworks employed in drug discovery fields, we have conceptualized and designed hypothetical DL architectures to tackle the above issues. For (1), we recommend multitask learning to achieve better performance with fewer classifiers by grouping structurally related descriptors. For (2) and (3), we introduce feature engineering to extract condensed and higher-order information from spectra and structure data. For instance, encoding spectra with subtrees and pre-calculated spectral patterns adds peak interactions to the model input. Encoding structures with graph convolutional networks incorporates connectivity within a molecule. The joint embedding of spectra and structures can enable simultaneous spectral library and molecular database search.
CONCLUSIONS In principle, given enough training data, adapted DL architectures, optimal hyperparameters and computing power, DL frameworks can predict small molecule structures, completely or at least partially, from MS/MS spectra. However, their performance and general applicability should be fairly evaluated against classical machine learning frameworks.
Affiliation(s)
- Wout Bittremieux
  - University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), University of Antwerp, Antwerp, Belgium
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, San Diego, CA, USA
- Kris Laukens
  - University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), University of Antwerp, Antwerp, Belgium
28
Bittremieux W, Bouyssié D, Dorfer V, Locard-Paulet M, Perez-Riverol Y, Schwämmle V, Uszkoreit J, Van Den Bossche T. The European Bioinformatics Community for Mass Spectrometry (EuBIC-MS): an open community for bioinformatics training and research. Rapid Commun Mass Spectrom 2021:e9087. [PMID: 33861485 DOI: 10.1002/rcm.9087] [Received: 12/18/2020] [Revised: 02/13/2021] [Accepted: 03/18/2021]
Abstract
The European Bioinformatics Community for Mass Spectrometry (EuBIC-MS; eubic-ms.org) was founded in 2014 to unite European computational mass spectrometry researchers and proteomics bioinformaticians working in academia and industry. EuBIC-MS maintains educational resources (proteomics-academy.org) and organises workshops at national and international conferences on proteomics and mass spectrometry. Furthermore, EuBIC-MS is actively involved in several community initiatives such as the Human Proteome Organization's Proteomics Standards Initiative (HUPO-PSI). Apart from these collaborations, EuBIC-MS has organised two Winter Schools and two Developers' Meetings that have contributed to the strengthening of the European mass spectrometry network and fostered international collaboration in this field, even beyond Europe. Moreover, EuBIC-MS is currently actively developing a community-driven standard dedicated to mass spectrometry data annotation (SDRF-Proteomics) that will facilitate data reuse and collaboration. This manuscript highlights what EuBIC-MS is, what it does, and what it already has achieved. A warm invitation is extended to new researchers at all career stages to join the EuBIC-MS community on its Slack channel (eubic.slack.com).
Affiliation(s)
- Wout Bittremieux
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - University of California San Diego, La Jolla, CA, USA
  - University of Antwerp, Antwerp, Belgium
- David Bouyssié
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - IPBS, University of Toulouse, CNRS, UPS, Toulouse, France
- Viktoria Dorfer
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Hagenberg, Austria
- Marie Locard-Paulet
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
- Yasset Perez-Riverol
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Veit Schwämmle
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Julian Uszkoreit
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - Center for Protein Diagnostics (PRODI), Medical Proteome Analysis, Ruhr University Bochum, Bochum, Germany
  - Medical Faculty, Medizinisches Proteom-Center, Ruhr University Bochum, Bochum, Germany
- Tim Van Den Bossche
  - European Bioinformatics Community for Mass Spectrometry, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
29
Bittremieux W, Adams C, Laukens K, Dorrestein PC, Bandeira N. Open Science Resources for the Mass Spectrometry-Based Analysis of SARS-CoV-2. J Proteome Res 2021; 20:1464-1475. [PMID: 33605735 DOI: 10.1021/acs.jproteome.0c00929]
Abstract
The SARS-CoV-2 virus is the causative agent of the 2020 pandemic leading to the COVID-19 respiratory disease. With many scientific and humanitarian efforts ongoing to develop diagnostic tests, vaccines, and treatments for COVID-19, and to prevent the spread of SARS-CoV-2, mass spectrometry research, including proteomics, is playing a role in determining the biology of this viral infection. Proteomics studies are starting to lead to an understanding of the roles of viral and host proteins during SARS-CoV-2 infection, their protein-protein interactions, and post-translational modifications. This is beginning to provide insights into potential therapeutic targets or diagnostic strategies that can be used to reduce the long-term burden of the pandemic. However, the extraordinary situation caused by the global pandemic is also highlighting the need to improve mass spectrometry data and workflow sharing. We therefore describe freely available data and computational resources that can facilitate and assist the mass spectrometry-based analysis of SARS-CoV-2. We exemplify this by reanalyzing a virus-host interactome data set to detect protein-protein interactions and identify host proteins that could potentially be used as targets for drug repurposing.
Affiliation(s)
- Wout Bittremieux
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla 92093, California, United States
  - Department of Computer Science, University of Antwerp, Antwerp 2020, Belgium
- Charlotte Adams
  - Department of Computer Science, University of Antwerp, Antwerp 2020, Belgium
- Kris Laukens
  - Department of Computer Science, University of Antwerp, Antwerp 2020, Belgium
- Pieter C Dorrestein
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla 92093, California, United States
- Nuno Bandeira
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla 92093, California, United States
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla 92093, California, United States
30
Gielis S, Moris P, Bittremieux W, De Neuter N, Ogunjimi B, Laukens K, Meysman P. Identification of Epitope-Specific T Cells in T-Cell Receptor Repertoires. Methods Mol Biol 2021; 2120:183-195. [PMID: 32124320 DOI: 10.1007/978-1-0716-0327-7_13]
Abstract
Recognition of cancer epitopes by T cells is fundamental for the activation of targeted antitumor responses. As such, the identification and study of epitope-specific T cells has been instrumental in our understanding of cancer immunology and the development of personalized immunotherapies. To facilitate the study of T-cell epitope specificity, we developed a prediction tool, TCRex, that can identify epitope-specific T-cell receptors (TCRs) directly from TCR repertoire data and perform epitope-specificity enrichment analyses. This chapter details the use of the TCRex web tool.
Collapse
Affiliation(s)
- Sofie Gielis
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,AUDACIS, Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing, University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium
| | - Pieter Moris
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium
| | - Wout Bittremieux
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium.,Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, CA, USA
| | - Nicolas De Neuter
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,AUDACIS, Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing, University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium
| | - Benson Ogunjimi
- AUDACIS, Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing, University of Antwerp, Antwerp, Belgium.,Antwerp Center for Translational Immunology and Virology (ACTIV), Vaccine and Infectious Disease Institute, University of Antwerp, Antwerp, Belgium.,Department of Paediatrics, Antwerp University Hospital, Antwerp, Belgium.,Center for Health Economics Research and Modeling Infectious Diseases (CHERMID), Vaccine and Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
| | - Kris Laukens
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,AUDACIS, Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing, University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium
| | - Pieter Meysman
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium. .,AUDACIS, Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing, University of Antwerp, Antwerp, Belgium. .,Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium.
|
31
|
Aksenov AA, Laponogov I, Zhang Z, Doran SLF, Belluomo I, Veselkov D, Bittremieux W, Nothias LF, Nothias-Esposito M, Maloney KN, Misra BB, Melnik AV, Smirnov A, Du X, Jones KL, Dorrestein K, Panitchpakdi M, Ernst M, van der Hooft JJJ, Gonzalez M, Carazzone C, Amézquita A, Callewaert C, Morton JT, Quinn RA, Bouslimani A, Orio AA, Petras D, Smania AM, Couvillion SP, Burnet MC, Nicora CD, Zink E, Metz TO, Artaev V, Humston-Fulmer E, Gregor R, Meijler MM, Mizrahi I, Eyal S, Anderson B, Dutton R, Lugan R, Boulch PL, Guitton Y, Prevost S, Poirier A, Dervilly G, Le Bizec B, Fait A, Persi NS, Song C, Gashu K, Coras R, Guma M, Manasson J, Scher JU, Barupal DK, Alseekh S, Fernie AR, Mirnezami R, Vasiliou V, Schmid R, Borisov RS, Kulikova LN, Knight R, Wang M, Hanna GB, Dorrestein PC, Veselkov K. Auto-deconvolution and molecular networking of gas chromatography-mass spectrometry data. Nat Biotechnol 2021; 39:169-173. [PMID: 33169034 PMCID: PMC7971188 DOI: 10.1038/s41587-020-0700-3] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 01/13/2020] [Revised: 08/26/2020] [Accepted: 09/09/2020] [Indexed: 12/23/2022]
Abstract
We engineered a machine learning approach, MSHub, to enable auto-deconvolution of gas chromatography-mass spectrometry (GC-MS) data. We then designed workflows to enable the community to store, process, share, annotate, compare and perform molecular networking of GC-MS data within the Global Natural Product Social (GNPS) Molecular Networking analysis platform. MSHub/GNPS performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization and quantifies the reproducibility of fragmentation patterns across samples.
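The deconvolution idea in this abstract can be illustrated with a small sketch. It is not the MSHub implementation: it assumes a toy scan-by-m/z intensity matrix and uses the standard Lee-Seung multiplicative updates for non-negative matrix factorization, with all names (`nmf`, `frob_error`, the toy `elution` and `spectra` data) being hypothetical.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(row) for row in zip(*M)]


def nmf(V, rank, n_iter=300, seed=0, eps=1e-9):
    """Approximate V (scans x m/z bins) as W @ H with non-negative factors.

    W holds one elution profile per component (scans x rank) and H holds one
    fragmentation pattern per component (rank x m/z bins).
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


def frob_error(V, W, H):
    """Squared Frobenius norm of the reconstruction residual."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr))


# Toy data (hypothetical): two co-eluting compounds over six scans,
# each with its own fragmentation pattern over four m/z bins.
elution = [[0, 1, 3, 1, 0, 0], [0, 0, 1, 3, 1, 0]]
spectra = [[10, 0, 5, 0], [0, 8, 0, 4]]
V = matmul(transpose(elution), spectra)

W, H = nmf(V, rank=2)
print("residual:", frob_error(V, W, H))
```

Each row of `H` can be read as a candidate compound spectrum and each column of `W` as its elution profile, up to scaling and component order.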
Affiliation(s)
- Alexander A Aksenov
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Ivan Laponogov
- Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London, UK
| | - Zheng Zhang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Sophie L F Doran
- Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London, UK
| | - Ilaria Belluomo
- Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London, UK
| | - Dennis Veselkov
- Intelligify Limited, London, UK
- Department of Computing, Imperial College, South Kensington Campus, London, UK
| | - Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Department of Computer Science, University of Antwerp, Antwerp, Belgium
| | - Louis Felix Nothias
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Mélissa Nothias-Esposito
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Katherine N Maloney
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Department of Chemistry, Point Loma Nazarene University, San Diego, CA, USA
| | - Biswapriya B Misra
- Center for Precision Medicine, Department of Internal Medicine, Section of Molecular Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Alexey V Melnik
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Aleksandr Smirnov
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA
| | - Xiuxia Du
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA
| | - Kenneth L Jones
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Kathleen Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Morgan Panitchpakdi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Madeleine Ernst
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Section for Clinical Mass Spectrometry, Department of Congenital Disorders, Danish Center for Neonatal Screening, Statens Serum Institut, Copenhagen, Denmark
| | - Justin J J van der Hooft
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
| | - Mabel Gonzalez
- Department of Chemistry, Universidad de los Andes, Bogotá, Colombia
| | - Chiara Carazzone
- Department of Chemistry, Universidad de los Andes, Bogotá, Colombia
| | - Adolfo Amézquita
- Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
| | - Chris Callewaert
- Center for Microbial Ecology and Technology, Ghent, Belgium
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
| | - James T Morton
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
| | - Robert A Quinn
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
| | - Amina Bouslimani
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Andrea Albarracín Orio
- IRNASUS, Universidad Católica de Córdoba, CONICET, Facultad de Ciencias Agropecuarias, Córdoba, Argentina
| | - Daniel Petras
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Andrea M Smania
- Universidad Nacional de Córdoba, Facultad de Ciencias Químicas, Departamento de Química Biológica Ranwel Caputto, Córdoba, Argentina
- CONICET, Universidad Nacional de Córdoba, Centro de Investigaciones en Química Biológica de Córdoba (CIQUIBIC), Córdoba, Argentina
| | - Sneha P Couvillion
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Meagan C Burnet
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Carrie D Nicora
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Erika Zink
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Thomas O Metz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | | | | | - Rachel Gregor
- Department of Chemistry and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Michael M Meijler
- Department of Chemistry and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Itzhak Mizrahi
- Department of Life Sciences and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Stav Eyal
- Department of Life Sciences and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Brooke Anderson
- Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Rachel Dutton
- Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Raphaël Lugan
- UMR Qualisud, Université d'Avignon et des Pays du Vaucluse, Agrosciences, Avignon, France
| | - Pauline Le Boulch
- UMR Qualisud, Université d'Avignon et des Pays du Vaucluse, Agrosciences, Avignon, France
| | - Yann Guitton
- Laboratoire d'Etude des Résidus et Contaminants dans les Aliments (LABERCA), Oniris, INRAe, Nantes, France
| | - Stephanie Prevost
- Laboratoire d'Etude des Résidus et Contaminants dans les Aliments (LABERCA), Oniris, INRAe, Nantes, France
| | - Audrey Poirier
- Laboratoire d'Etude des Résidus et Contaminants dans les Aliments (LABERCA), Oniris, INRAe, Nantes, France
| | - Gaud Dervilly
- Laboratoire d'Etude des Résidus et Contaminants dans les Aliments (LABERCA), Oniris, INRAe, Nantes, France
| | - Bruno Le Bizec
- Laboratoire d'Etude des Résidus et Contaminants dans les Aliments (LABERCA), Oniris, INRAe, Nantes, France
| | - Aaron Fait
- The French Associates Institute for Agriculture and Biotechnology of Dryland, The Jacob Blaustein Institutes for Desert Research, Ben Gurion University of the Negev, Sede Boqer Campus, Beer Sheva, Israel
| | - Noga Sikron Persi
- The French Associates Institute for Agriculture and Biotechnology of Dryland, The Jacob Blaustein Institutes for Desert Research, Ben Gurion University of the Negev, Sede Boqer Campus, Beer Sheva, Israel
| | - Chao Song
- The French Associates Institute for Agriculture and Biotechnology of Dryland, The Jacob Blaustein Institutes for Desert Research, Ben Gurion University of the Negev, Sede Boqer Campus, Beer Sheva, Israel
| | - Kelem Gashu
- The French Associates Institute for Agriculture and Biotechnology of Dryland, The Jacob Blaustein Institutes for Desert Research, Ben Gurion University of the Negev, Sede Boqer Campus, Beer Sheva, Israel
| | - Roxana Coras
- Division of Rheumatology, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Monica Guma
- Division of Rheumatology, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Julia Manasson
- Division of Rheumatology, Department of Medicine, New York University School of Medicine, New York, NY, USA
| | - Jose U Scher
- Division of Rheumatology, Department of Medicine, New York University School of Medicine, New York, NY, USA
| | - Dinesh Kumar Barupal
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Saleh Alseekh
- Max Planck Institute for Molecular Plant Physiology, Potsdam-Golm, Germany
- Center of Plant Systems Biology and Biotechnology (CPSBB), Plovdiv, Bulgaria
| | - Alisdair R Fernie
- Max Planck Institute for Molecular Plant Physiology, Potsdam-Golm, Germany
- Center of Plant Systems Biology and Biotechnology (CPSBB), Plovdiv, Bulgaria
| | - Reza Mirnezami
- Department of Colorectal Surgery, Royal Free Hospital NHS Foundation Trust, Hampstead, London, UK
| | - Vasilis Vasiliou
- Department of Environmental Health Sciences, Yale School of Public Health, Yale University, New Haven, CT, USA
| | - Robin Schmid
- Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany
| | - Roman S Borisov
- A.V. Topchiev Institute of Petrochemical Synthesis RAS, Moscow, Russian Federation
| | - Larisa N Kulikova
- Peoples' Friendship University of Russia (RUDN University), Moscow, Russian Federation
| | - Rob Knight
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- UCSD Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA, USA
| | - Mingxun Wang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - George B Hanna
- Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London, UK
| | - Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA.
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA.
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
- UCSD Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA.
| | - Kirill Veselkov
- Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London, UK.
|
32
|
Aksenov AA, Laponogov I, Zhang Z, Doran SLF, Belluomo I, Veselkov D, Bittremieux W, Nothias LF, Nothias-Esposito M, Maloney KN, Misra BB, Melnik AV, Smirnov A, Du X, Jones KL, Dorrestein K, Panitchpakdi M, Ernst M, van der Hooft JJJ, Gonzalez M, Carazzone C, Amézquita A, Callewaert C, Morton JT, Quinn RA, Bouslimani A, Orio AA, Petras D, Smania AM, Couvillion SP, Burnet MC, Nicora CD, Zink E, Metz TO, Artaev V, Humston-Fulmer E, Gregor R, Meijler MM, Mizrahi I, Eyal S, Anderson B, Dutton R, Lugan R, Boulch PL, Guitton Y, Prevost S, Poirier A, Dervilly G, Le Bizec B, Fait A, Persi NS, Song C, Gashu K, Coras R, Guma M, Manasson J, Scher JU, Barupal DK, Alseekh S, Fernie AR, Mirnezami R, Vasiliou V, Schmid R, Borisov RS, Kulikova LN, Knight R, Wang M, Hanna GB, Dorrestein PC, Veselkov K. Auto-deconvolution and molecular networking of gas chromatography-mass spectrometry data. Nat Biotechnol 2021. [PMID: 33169034 DOI: 10.1038/s41587-020-0700-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 04/29/2023]
|
33
|
Abstract
Given the wide diversity in applications of biological mass spectrometry, custom data analyses are often needed to fully interpret the results of an experiment. Such bioinformatics scripts necessarily include similar basic functionality to read mass spectral data from standard file formats, process it, and visualize it. To avoid reimplementing this functionality each time, spectrum_utils provides a Python package for mass spectrometry data processing and visualization. Its high-level functionality enables developers to quickly prototype ideas for computational mass spectrometry projects in only a few lines of code. Notably, the data processing functionality is highly optimized for computational efficiency to be able to deal with the large volumes of data that are generated during mass spectrometry experiments. The visualization functionality makes it possible to easily produce publication-quality figures as well as interactive spectrum plots for inclusion on web pages. spectrum_utils is available for Python 3.6+, includes extensive online documentation and examples, and can be easily installed using conda. It is freely available as open source under the Apache 2.0 license at https://github.com/bittremieux/spectrum_utils.
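As a rough illustration of the kind of spectrum processing this abstract refers to, the sketch below restricts a peak list to an m/z window, drops low-intensity peaks, and scales the intensities. It deliberately does not use the spectrum_utils API; the function name `preprocess_spectrum`, its parameters, and the default thresholds are assumptions chosen for the example.

```python
import math


def preprocess_spectrum(mz, intensity, min_mz, max_mz,
                        min_rel_intensity=0.05, max_peaks=50):
    """Return (mz, intensity) lists after common preprocessing steps.

    All thresholds are illustrative defaults, not values from the paper.
    """
    # 1. Keep only peaks inside the m/z window.
    peaks = [(m, i) for m, i in zip(mz, intensity) if min_mz <= m <= max_mz]
    if not peaks:
        return [], []
    # 2. Remove peaks below a fraction of the base peak intensity.
    base_peak = max(i for _, i in peaks)
    peaks = [(m, i) for m, i in peaks if i >= min_rel_intensity * base_peak]
    # 3. Keep at most `max_peaks` of the most intense peaks, sorted by m/z.
    peaks = sorted(peaks, key=lambda p: p[1], reverse=True)[:max_peaks]
    peaks.sort()
    # 4. Square-root scale the intensities and normalize to unit length.
    scaled = [math.sqrt(i) for _, i in peaks]
    norm = math.sqrt(sum(s * s for s in scaled))
    return [m for m, _ in peaks], [s / norm for s in scaled]


example_mz = [50.0, 150.0, 250.0, 350.0, 2100.0]
example_intensity = [100.0, 100.0, 400.0, 1.0, 900.0]
mz_out, intensity_out = preprocess_spectrum(
    example_mz, example_intensity, min_mz=100.0, max_mz=1400.0)
print(mz_out, intensity_out)
```

A package such as spectrum_utils wraps steps like these behind a higher-level interface; the plain-Python version only makes the individual operations explicit.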
Affiliation(s)
- Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States.,Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium.,Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
|
34
|
Gielis S, Moris P, Bittremieux W, De Neuter N, Ogunjimi B, Laukens K, Meysman P. Detection of Enriched T Cell Epitope Specificity in Full T Cell Receptor Sequence Repertoires. Front Immunol 2019; 10:2820. [PMID: 31849987 PMCID: PMC6896208 DOI: 10.3389/fimmu.2019.02820] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 08/02/2019] [Accepted: 11/15/2019] [Indexed: 12/15/2022] Open
Abstract
High-throughput T cell receptor (TCR) sequencing allows the characterization of an individual's TCR repertoire and directly queries their immune state. However, it remains a non-trivial task to couple these sequenced TCRs to their antigenic targets. In this paper, we present a novel strategy to annotate full TCR sequence repertoires with their epitope specificities. The strategy is based on a machine learning algorithm to learn the TCR patterns common to the recognition of a specific epitope. These results are then combined with a statistical analysis to evaluate the occurrence of specific epitope-reactive TCR sequences per epitope in repertoire data. In this manner, we can directly study the capacity of full TCR repertoires to target specific epitopes of the relevant vaccines or pathogens. We demonstrate the usability of this approach on three independent datasets related to vaccine monitoring and infectious disease diagnostics by independently identifying the epitopes that are targeted by the TCR repertoire. The developed method is freely available as a web tool for academic use at tcrex.biodatamining.be.
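The statistical step described in this abstract can be sketched as an enrichment test. The abstract does not name the test, so the one-sided binomial test below is an assumption, as are the function name `binomial_sf` and the example numbers (repertoire size, hit count, and false positive rate).

```python
import math


def binomial_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are computed in log space so that large repertoires do not
    overflow the binomial coefficients.
    """
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


# Hypothetical repertoire: 1000 TCR sequences, 12 predicted to bind the
# epitope, and a classifier assumed to have a 0.1% false positive rate.
n_tcrs = 1000
n_predicted = 12
false_positive_rate = 0.001

p_value = binomial_sf(n_predicted, n_tcrs, false_positive_rate)
print("enrichment p-value:", p_value)
```

A small p-value indicates more predicted epitope-specific TCRs than the assumed false positive rate alone would explain. When many epitopes are tested against one repertoire, a multiple-testing correction would normally follow.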
Affiliation(s)
- Sofie Gielis
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing (AUDACIS), University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (Biomina), University of Antwerp, Antwerp, Belgium
| | - Pieter Moris
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (Biomina), University of Antwerp, Antwerp, Belgium
| | - Wout Bittremieux
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (Biomina), University of Antwerp, Antwerp, Belgium.,Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, United States
| | - Nicolas De Neuter
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing (AUDACIS), University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (Biomina), University of Antwerp, Antwerp, Belgium
| | - Benson Ogunjimi
- Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing (AUDACIS), University of Antwerp, Antwerp, Belgium.,Department of Paediatrics, Antwerp University Hospital, Edegem, Belgium.,Centre for Health Economics Research and Modeling Infectious Diseases (CHERMID), Vaccine and Infectious Disease Institute, University of Antwerp, Wilrijk, Belgium.,Antwerp Center for Translational Immunology and Virology (ACTIV), Vaccine and Infectious Disease Institute, University of Antwerp, Wilrijk, Belgium
| | - Kris Laukens
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing (AUDACIS), University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (Biomina), University of Antwerp, Antwerp, Belgium
| | - Pieter Meysman
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.,Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing (AUDACIS), University of Antwerp, Antwerp, Belgium.,Biomedical Informatics Research Network Antwerp (Biomina), University of Antwerp, Antwerp, Belgium
| |
Collapse
|
35
|
Abstract
For the 2018 YPIC Challenge, contestants were invited to try to decipher two unknown English questions encoded by a synthetic protein expressed in Escherichia coli. In addition to deciphering the sentence, contestants were asked to determine the three-dimensional structure and detect any post-translational modifications left by the host organism. We present our experimental and computational strategy to characterize this sample by identifying the unknown protein sequence and detecting the presence of post-translational modifications. The sample was acquired with dynamic exclusion disabled to increase the signal-to-noise ratio of the measured molecules, after which spectral clustering was used to generate high-quality consensus spectra. De novo spectrum identification was used to determine the synthetic protein sequence, and any post-translational modifications introduced by E. coli on the synthetic protein were analyzed via spectral networking. This workflow resulted in a de novo sequence coverage of 70%, on par with sequence database searching performance. Additionally, the spectral networking analysis indicated that no systematic modifications were introduced on the synthetic protein by E. coli. The strategy presented here can be directly used to analyze samples for which no protein sequence information is available or when the identity of the sample is unknown. All software and code to perform the bioinformatics analysis is available as open source, and self-contained Jupyter notebooks are provided to fully recreate the analysis.
Affiliation(s)
- Lindsay Pino
  - Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
- Andy Lin
  - Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
- Wout Bittremieux
  - Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
|
36
|
Bittremieux W, Laukens K, Noble WS. Extremely Fast and Accurate Open Modification Spectral Library Searching of High-Resolution Mass Spectra Using Feature Hashing and Graphics Processing Units. J Proteome Res 2019; 18:3792-3799. [PMID: 31448616 PMCID: PMC6886738 DOI: 10.1021/acs.jproteome.9b00291] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 12/20/2022]
Abstract
Open modification searching (OMS) is a powerful search strategy to identify peptides with any type of modification. OMS works by using a very wide precursor mass window to allow modified spectra to match against their unmodified variants, after which the modification types can be inferred from the corresponding precursor mass differences. A disadvantage of this strategy, however, is the large computational cost, because each query spectrum has to be compared against a multitude of candidate peptides. We have previously introduced the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. Here we demonstrate how this candidate selection procedure can be further optimized using graphics processing units. Additionally, we introduce a feature hashing scheme to convert high-resolution spectra to low-dimensional vectors. On the basis of these algorithmic advances, along with low-level code optimizations, the new version of ANN-SoLo is up to an order of magnitude faster than its initial version. This makes it possible to efficiently perform open searches on a large scale to gain a deeper understanding about the protein modification landscape. We demonstrate the computational efficiency and identification performance of ANN-SoLo based on a large data set of the draft human proteome. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo .
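As a rough illustration of the feature hashing idea mentioned in this abstract, the following Python sketch maps the peaks of a high-resolution spectrum onto a short fixed-length vector. The bin width, the vector length, and the multiplicative hash are illustrative assumptions and are not taken from ANN-SoLo; the function name `hash_spectrum` is hypothetical.

```python
import math


def hash_spectrum(peaks, bin_width=0.05, dim=400):
    """Map (m/z, intensity) peaks to a unit-length vector with `dim` entries.

    Assumption: a simple multiplicative integer hash stands in for whatever
    hash function a real tool would use.
    """
    vector = [0.0] * dim
    for mz, intensity in peaks:
        # Fine-grained bin index; for m/z values up to 2000 and a 0.05 m/z
        # bin width this corresponds to about 40,000 possible bins.
        bin_index = int(mz / bin_width)
        # Hash the sparse high-dimensional bin index into a small dense vector.
        slot = (bin_index * 2654435761) % (2 ** 32) % dim
        vector[slot] += intensity
    norm = math.sqrt(sum(v * v for v in vector))
    # Normalize so that dot products between hashed vectors are comparable.
    return [v / norm for v in vector] if norm > 0 else vector


def dot(a, b):
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


spectrum = [(175.119, 30.0), (304.161, 100.0), (401.214, 55.0)]
vec = hash_spectrum(spectrum)
```

The hashed vectors are short and dense, so their dot product can act as an inexpensive proxy for spectrum similarity when selecting candidates. Hash collisions add some noise, which is the usual trade-off for the lower dimensionality.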
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
|
37
|
Meysman P, Saeys Y, Sabaghian E, Bittremieux W, Van de Peer Y, Goethals B, Laukens K. Mining the Enriched Subgraphs for Specific Vertices in a Biological Graph. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:1496-1507. [PMID: 27295680 DOI: 10.1109/tcbb.2016.2576440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
In this paper, we present a subgroup discovery method to find subgraphs in a graph that are associated with a given set of vertices. The association between a subgraph pattern and a set of vertices is defined by its significant enrichment based on a Bonferroni-corrected hypergeometric probability value. This interestingness measure requires a dedicated pruning procedure to limit the number of subgraph matches that must be calculated. The presented mining algorithm to find associated subgraph patterns in large graphs is therefore designed to efficiently traverse the search space. We demonstrate the operation of this method by applying it on three biological graph data sets and show that we can find associated subgraphs for a biologically relevant set of vertices and that the found subgraphs themselves are biologically interesting.
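The enrichment measure named in this abstract can be sketched with the Python standard library. The mapping of graph quantities onto the hypergeometric parameters (all vertices as the population, the vertices of interest as successes, the vertices matched by a subgraph pattern as draws) is an assumption made for illustration, and the function names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) of a hypergeometric variable.

    Assumed mapping: `population` = all vertices, `successes` = vertices of
    interest, `draws` = vertices matched by the pattern, `observed` = matched
    vertices that are also vertices of interest.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


def bonferroni(p_value, n_tests):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


# Toy example: 10 vertices, 4 of interest, pattern matches 3, of which 2 are of interest.
p = hypergeom_enrichment_p(population=10, successes=4, draws=3, observed=2)
p_corrected = bonferroni(p, n_tests=5)
```

In a mining setting, `n_tests` would be the number of candidate subgraph patterns evaluated. That number is itself something the pruning procedure tries to keep small.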
|
38
|
Kopczynski D, Bittremieux W, Bouyssié D, Dorfer V, Locard-Paulet M, Van Puyvelde B, Schwämmle V, Soggiu A, Willems S, Uszkoreit J. Proceedings of the EuBIC Winter School 2019. EuPA Open Proteom 2019; 22-23:4-7. [PMID: 31890545 PMCID: PMC6924290 DOI: 10.1016/j.euprot.2019.07.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/09/2019] [Accepted: 07/17/2019] [Indexed: 01/29/2023]
Abstract
The 2019 European Bioinformatics Community (EuBIC) Winter School was held from January 15th to January 18th 2019 in Zakopane, Poland. This year’s meeting was the third of its kind and gathered international researchers in the field of (computational) proteomics to discuss (mainly) challenges in proteomics quantification and data independent acquisition (DIA). Here, we present an overview of the scientific program of the 2019 EuBIC Winter School. Furthermore, we give a brief outlook on the upcoming EuBIC 2020 Developer’s Meeting.
Affiliation(s)
- Dominik Kopczynski
  - Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Bunsen-Kirchhoff-Str. 11, D-44139, Dortmund, Germany
- David Bouyssié
  - Institute of Pharmacology and Structural Biology, University of Toulouse, CNRS, UPS, Toulouse, France
- Viktoria Dorfer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Hagenberg, Austria
- Marie Locard-Paulet
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
- Bart Van Puyvelde
  - Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, Belgium
- Veit Schwämmle
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, 5230, Odense, Denmark
- Alessio Soggiu
  - Department of Veterinary Medicine, University of Milan, Milan, Italy
- Sander Willems
  - Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, Belgium
- Julian Uszkoreit
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Gesundheitscampus 4, D-44801, Bochum, Germany
|
39
|
Beirnaert C, Peeters L, Meysman P, Bittremieux W, Foubert K, Custers D, Van der Auwera A, Cuykx M, Pieters L, Covaci A, Laukens K. Using Expert Driven Machine Learning to Enhance Dynamic Metabolomics Data Analysis. Metabolites 2019; 9:metabo9030054. [PMID: 30897797 PMCID: PMC6468718 DOI: 10.3390/metabo9030054] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/18/2019] [Revised: 03/05/2019] [Accepted: 03/18/2019] [Indexed: 11/16/2022]
Abstract
Data analysis for metabolomics is undergoing rapid progress thanks to the proliferation of novel tools and the standardization of existing workflows. As untargeted metabolomics datasets and experiments continue to increase in size and complexity, standardized workflows are often not sufficiently sophisticated. In addition, the ground truth for untargeted metabolomics experiments is intrinsically unknown and the performance of tools is difficult to evaluate. Here, the problem of dynamic multi-class metabolomics experiments was investigated using a simulated dataset with a known ground truth. This simulated dataset was used to evaluate the performance of tinderesting, a new and intuitive tool based on gathering expert knowledge to be used in machine learning. The results were compared to EDGE, a statistical method for time series data. This paper presents three novel outcomes. The first is a way to simulate dynamic metabolomics data with a known ground truth based on ordinary differential equations. This method is made available through the MetaboLouise R package. Second, the EDGE tool, originally developed for genomics data analysis, is highly performant in analyzing dynamic case vs. control metabolomics data. Third, the tinderesting method is introduced to analyse more complex dynamic metabolomics experiments. This tool consists of a Shiny app for collecting expert knowledge, which in turn is used to train a machine learning model to emulate the decision process of the expert. This approach does not replace traditional data analysis workflows for metabolomics, but can provide additional information, improved performance or easier interpretation of results. The advantage is that the tool is agnostic to the complexity of the experiment, and thus is easier to use in advanced setups. All code for the presented analysis, MetaboLouise and tinderesting are freely available.
Affiliation(s)
- Charlie Beirnaert
  - Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, 2000 Antwerp, Belgium
- Laura Peeters
  - Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, 2000 Antwerp, Belgium
- Pieter Meysman
  - Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, 2000 Antwerp, Belgium
- Wout Bittremieux
  - Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, 2000 Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Kenn Foubert
  - Natural Products & Food Research and Analysis (NatuRA), Department of Pharmaceutical Sciences, University of Antwerp, 2000 Antwerp, Belgium
- Deborah Custers
  - Natural Products & Food Research and Analysis (NatuRA), Department of Pharmaceutical Sciences, University of Antwerp, 2000 Antwerp, Belgium
- Anastasia Van der Auwera
  - Natural Products & Food Research and Analysis (NatuRA), Department of Pharmaceutical Sciences, University of Antwerp, 2000 Antwerp, Belgium
- Matthias Cuykx
  - Toxicological Center, Department of Pharmaceutical Sciences, University of Antwerp, 2000 Antwerp, Belgium
- Luc Pieters
  - Natural Products & Food Research and Analysis (NatuRA), Department of Pharmaceutical Sciences, University of Antwerp, 2000 Antwerp, Belgium
- Adrian Covaci
  - Toxicological Center, Department of Pharmaceutical Sciences, University of Antwerp, 2000 Antwerp, Belgium
- Kris Laukens
  - Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, 2000 Antwerp, Belgium
|
40
|
Abstract
Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification by allowing a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. A drawback of this strategy, however, is that it leads to a large increase in search time. Although performing an open search can be done using existing spectral library search engines by simply setting a wide precursor mass window, none of these tools have been optimized for OMS, leading to excessive runtimes and suboptimal identification results. We present the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. This approach is combined with a cascade search strategy to maximize the number of identified unmodified and modified spectra while strictly controlling the false discovery rate as well as a shifted dot product score to sensitively match modified spectra to their unmodified counterparts. ANN-SoLo achieves state-of-the-art performance in terms of speed and the number of identifications. On a previously published human cell line data set, ANN-SoLo confidently identifies more spectra than SpectraST or MSFragger and achieves a speedup of an order of magnitude compared with SpectraST. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo .
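The shifted dot product mentioned in this abstract can be illustrated with a small Python sketch. It is a simplified, assumption-laden version rather than the ANN-SoLo implementation: fragments are treated as singly charged, the fragment tolerance is an arbitrary value, peak pairs are assigned greedily, and the function names are hypothetical.

```python
import math


def _normalize(peaks):
    """Scale intensities so the spectrum has unit Euclidean norm."""
    norm = math.sqrt(sum(i * i for _, i in peaks))
    return [(mz, i / norm) for mz, i in peaks] if norm > 0 else list(peaks)


def shifted_dot_product(query, library, mass_shift, tol=0.02):
    """Score two spectra, allowing fragments to match directly or after
    shifting the library fragment by the precursor mass difference."""
    query, library = _normalize(query), _normalize(library)
    candidates = []
    for qi, (q_mz, q_int) in enumerate(query):
        for li, (l_mz, l_int) in enumerate(library):
            direct = abs(q_mz - l_mz) <= tol
            shifted = abs(q_mz - (l_mz + mass_shift)) <= tol
            if direct or shifted:
                candidates.append((q_int * l_int, qi, li))
    # Greedy assignment: take the largest products first, using each peak once.
    candidates.sort(reverse=True)
    used_query, used_library, score = set(), set(), 0.0
    for product, qi, li in candidates:
        if qi not in used_query and li not in used_library:
            used_query.add(qi)
            used_library.add(li)
            score += product
    return score


# Toy spectra: the query carries a +16 modification on two of three fragments.
library_spectrum = [(100.0, 1.0), (200.0, 2.0), (300.0, 2.0)]
query_spectrum = [(100.0, 1.0), (216.0, 2.0), (316.0, 2.0)]
with_shift = shifted_dot_product(query_spectrum, library_spectrum, mass_shift=16.0)
without_shift = shifted_dot_product(query_spectrum, library_spectrum, mass_shift=0.0)
```

In this toy case the unshifted score only credits the one fragment that does not carry the modification. Allowing the shift recovers the remaining fragment matches, which is why such a score is more sensitive for modified spectra.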
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Pieter Meysman
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
|
41
|
Mrzic A, Meysman P, Bittremieux W, Moris P, Cule B, Goethals B, Laukens K. Grasping frequent subgraph mining for bioinformatics applications. BioData Min 2018; 11:20. [PMID: 30202444 PMCID: PMC6122726 DOI: 10.1186/s13040-018-0181-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/19/2017] [Accepted: 08/13/2018] [Indexed: 11/18/2022]
Abstract
Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. Subgraph mining techniques focus on the discovery of patterns in graphs that exhibit a specific network structure that is deemed interesting within these data sets. The definition of which subgraphs are interesting and which are not is highly dependent on the application. These techniques have seen numerous applications and are able to tackle a range of biological research questions, spanning from the detection of common substructures in sets of biomolecular compounds, to the discovery of network motifs in large-scale molecular interaction networks. Thus far, information about the bioinformatics application of subgraph mining remains scattered over heterogeneous literature. In this review, we provide an introduction to subgraph mining for life scientists. We give an overview of various subgraph mining algorithms from a bioinformatics perspective and present several of their potential biomedical applications.
Affiliation(s)
- Aida Mrzic
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Pieter Meysman
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Pieter Moris
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Boris Cule
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Bart Goethals
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
|
42
|
Bittremieux W, Tabb DL, Impens F, Staes A, Timmerman E, Martens L, Laukens K. Quality control in mass spectrometry-based proteomics. Mass Spectrom Rev 2018; 37:697-711. [PMID: 28802010 DOI: 10.1002/mas.21544] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 03/04/2017] [Revised: 07/24/2017] [Accepted: 07/24/2017] [Indexed: 05/21/2023]
Abstract
Mass spectrometry is a highly complex analytical technique and mass spectrometry-based proteomics experiments can be subject to a large variability, which forms an obstacle to obtaining accurate and reproducible results. Therefore, a comprehensive and systematic approach to quality control is an essential requirement to inspire confidence in the generated results. A typical mass spectrometry experiment consists of multiple different phases including the sample preparation, liquid chromatography, mass spectrometry, and bioinformatics stages. We review potential sources of variability that can impact the results of a mass spectrometry experiment occurring in all of these steps, and we discuss how to monitor and remedy the negative influences on the experimental results. Furthermore, we describe how specialized quality control samples of varying sample complexity can be incorporated into the experimental workflow and how they can be used to rigorously assess detailed aspects of the instrument performance.
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (Biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
- David L Tabb
  - Division of Molecular Biology and Human Genetics, Stellenbosch University Faculty of Medicine and Health Sciences, Tygerberg Hospital, Cape Town, South Africa
- Francis Impens
  - VIB Proteomics Core, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
- An Staes
  - VIB Proteomics Core, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
- Evy Timmerman
  - VIB Proteomics Core, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Zwijnaarde, Belgium
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (Biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
|
43
|
Willems S, Bouyssié D, Deforce D, Dorfer V, Gorshkov V, Kopczynski D, Laukens K, Locard-Paulet M, Schwämmle V, Uszkoreit J, Valkenborg D, Vaudel M, Bittremieux W. Proceedings of the EuBIC developer's meeting 2018. J Proteomics 2018; 187:25-27. [PMID: 29864591 DOI: 10.1016/j.jprot.2018.05.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/15/2018] [Accepted: 05/27/2018] [Indexed: 11/18/2022]
Abstract
The inaugural European Bioinformatics Community (EuBIC) developer's meeting was held from January 9th to January 12th 2018 in Ghent, Belgium. While the meeting kicked off with an interactive keynote session featuring four internationally renowned experts in the field of computational proteomics, its primary focus was the hands-on hackathon sessions, which featured six community-proposed projects revolving around three major topics. Here, we present an overview of the scientific program of the EuBIC developer's meeting and provide a starting point for follow-up on the covered projects.
Affiliation(s)
- Sander Willems
  - Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, Belgium
- David Bouyssié
  - Institute of Pharmacology and Structural Biology, University of Toulouse, CNRS, UPS, Toulouse, France
- Dieter Deforce
  - Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, Belgium
- Viktoria Dorfer
  - Bioinformatics Research Group, University of Applied Sciences Upper Austria, Hagenberg, Austria
- Vladimir Gorshkov
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, Denmark
- Dominik Kopczynski
  - Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Marie Locard-Paulet
  - Institute of Pharmacology and Structural Biology, University of Toulouse, CNRS, UPS, Toulouse, France
- Veit Schwämmle
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, Denmark
- Julian Uszkoreit
  - Medizinisches Proteom-Center, Ruhr University Bochum, Bochum, Germany
- Dirk Valkenborg
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
  - Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Marc Vaudel
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
|
44
|
Deutsch EW, Orchard S, Binz PA, Bittremieux W, Eisenacher M, Hermjakob H, Kawano S, Lam H, Mayer G, Menschaert G, Perez-Riverol Y, Salek RM, Tabb DL, Tenzer S, Vizcaíno JA, Walzer M, Jones AR. Proteomics Standards Initiative: Fifteen Years of Progress and Future Work. J Proteome Res 2017; 16:4288-4298. [PMID: 28849660 PMCID: PMC5715286 DOI: 10.1021/acs.jproteome.7b00370] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Indexed: 12/21/2022]
Abstract
The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, cochairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Furthermore, new standards are currently either in the final stages of completion (proBed and proBAM for proteogenomics results as well as PEFF) or in early stages of design (a spectral library standard format, a universal spectrum identifier, the qcML quality control format, and the Protein Expression Interface (PROXI) web services Application Programming Interface). In this work we review the current status of all of these aspects of the PSI, describe synergies with other efforts such as the ProteomeXchange Consortium, the Human Proteome Project, and the metabolomics community, and provide a look at future directions of the PSI.
Affiliation(s)
- Eric W Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Sandra Orchard
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Pierre-Alain Binz
  - CHUV Centre Hospitalier Universitaire Vaudois, 1011 Lausanne, Switzerland
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerp, Belgium
- Martin Eisenacher
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
- Henning Hermjakob
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, National Center for Protein Sciences, Beijing, Beijing 102206, China
- Shin Kawano
  - Database Center for Life Science, Joint Support Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Chiba 277-0871, Japan
- Henry Lam
  - Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
  - Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
- Gerhard Mayer
  - Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
- Gerben Menschaert
  - Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Ghent University, 9000 Ghent, Belgium
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Reza M Salek
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- David L Tabb
  - SA MRC Centre for TB Research, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Stefan Tenzer
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Mathias Walzer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Andrew R Jones
  - Institute of Integrative Biology, University of Liverpool, South Wirral L64 4AY, United Kingdom
|
45
|
Vizcaíno JA, Walzer M, Jiménez RC, Bittremieux W, Bouyssié D, Carapito C, Corrales F, Ferro M, Heck AJR, Horvatovich P, Hubalek M, Lane L, Laukens K, Levander F, Lisacek F, Novak P, Palmblad M, Piovesan D, Pühler A, Schwämmle V, Valkenborg D, van Rijswijk M, Vondrasek J, Eisenacher M, Martens L, Kohlbacher O. A community proposal to integrate proteomics activities in ELIXIR. F1000Res 2017; 6. [PMID: 28713550 PMCID: PMC5499783 DOI: 10.12688/f1000research.11751.1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Accepted: 06/06/2017] [Indexed: 11/20/2022]
Abstract
Computational approaches have been major drivers behind the progress of proteomics in recent years. The aim of this white paper is to provide a framework for integrating computational proteomics into ELIXIR in the near future, and thus to broaden the portfolio of omics technologies supported by this European distributed infrastructure. This white paper is the direct result of a strategy meeting on ‘The Future of Proteomics in ELIXIR’ that took place in March 2017 in Tübingen (Germany), and involved representatives of eleven ELIXIR nodes. These discussions led to a list of priority areas in computational proteomics that would complement existing activities and close gaps in the portfolio of tools and services offered by ELIXIR so far. We provide some suggestions on how these activities could be integrated into ELIXIR’s existing platforms, and how it could lead to a new ELIXIR use case in proteomics. We also highlight connections to the related field of metabolomics, where similar activities are ongoing. This white paper could thus serve as a starting point for the integration of computational proteomics into ELIXIR. Over the next few months we will be working closely with all stakeholders involved, and in particular with other representatives of the proteomics community, to further refine this paper.
Affiliation(s)
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Mathias Walzer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, 2020, Belgium
- David Bouyssié
  - French Proteomics Infrastructure ProFI, Grenoble, (EDyP U1038, CEA/Inserm/ Grenoble Alpes University) Toulouse (IPBS, Université de Toulouse, CNRS, UPS), Strasbourg (LSMBO, IPHC UMR7178, CNRS-Université de Strasbourg), France
- Christine Carapito
  - French Proteomics Infrastructure ProFI, Grenoble, (EDyP U1038, CEA/Inserm/ Grenoble Alpes University) Toulouse (IPBS, Université de Toulouse, CNRS, UPS), Strasbourg (LSMBO, IPHC UMR7178, CNRS-Université de Strasbourg), France
- Fernando Corrales
  - ProteoRed, Proteomics Unit, Centro Nacional de Biotecnología (CSIC), Madrid, 28049, Spain
- Myriam Ferro
  - French Proteomics Infrastructure ProFI, Grenoble, (EDyP U1038, CEA/Inserm/ Grenoble Alpes University) Toulouse (IPBS, Université de Toulouse, CNRS, UPS), Strasbourg (LSMBO, IPHC UMR7178, CNRS-Université de Strasbourg), France
- Albert J R Heck
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Centre for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, 3548 CH, Netherlands
  - Netherlands Proteomics Center, Utrecht, 3584 CH, Netherlands
- Peter Horvatovich
  - Analytical Biochemistry, Department of Pharmacy, University of Groningen, Groningen, 9713 AV, Netherlands
- Martin Hubalek
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague 1, 117 20, Czech Republic
- Lydie Lane
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics, Geneva, 1015, Switzerland
  - Department of Human Protein Science, Faculty of Medicine, University of Geneva, Geneva, 1205, Switzerland
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, 2020, Belgium
- Fredrik Levander
  - National Bioinformatics Infrastructure Sweden (NBIS), SciLifeLab, Department of Immunotechnology, Lund University, Lund, 223 62, Sweden
- Frederique Lisacek
  - Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1015, Switzerland
  - Computer Science Department, University of Geneva, Geneva, 1205, Switzerland
- Petr Novak
  - Institute of Microbiology, Czech Academy of Sciences, Prague 1, 117 20, Czech Republic
- Magnus Palmblad
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, 2333 ZA, Netherlands
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova, I-35121, Italy
- Alfred Pühler
  - Center for Biotechnology, Bielefeld University, Bielefeld, 33615, Germany
- Veit Schwämmle
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense M, 5230, Denmark
- Dirk Valkenborg
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, 3500, Belgium
  - Center for Proteomics, University of Antwerp, Antwerpen, 2000, Belgium
  - Applied Bio & Molecular Systems, VITO, Mol, BE-2400, Belgium
- Merlijn van Rijswijk
  - Netherlands Metabolomics Centre, Utrecht, 3511 GC, Netherlands
  - Dutch Techcentre for Life Sciences / ELIXIR-NL, Utrecht, 3511 GC, Netherlands
- Jiri Vondrasek
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague 1, 117 20, Czech Republic
- Martin Eisenacher
  - Medical Bioinformatics, Medizinisches Proteom-Center, Ruhr-University Bochum, Bochum, 44801, Germany
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, Ghent, 9052, Belgium
  - Department of Biochemistry, Ghent University, Ghent, 9000, Belgium
- Oliver Kohlbacher
  - Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, 72074, Germany
  - Center for Bioinformatics Tübingen, University of Tübingen, Tübingen, 72074, Germany
  - Quantitative Biology Center, University of Tübingen, Tübingen, 72074, Germany
  - Biomolecular Interactions, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
|
46
|
Bittremieux W, Walzer M, Tenzer S, Zhu W, Salek RM, Eisenacher M, Tabb DL. The Human Proteome Organization-Proteomics Standards Initiative Quality Control Working Group: Making Quality Control More Accessible for Biological Mass Spectrometry. Anal Chem 2017; 89:4474-4479. [PMID: 28318237 DOI: 10.1021/acs.analchem.6b04310] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/23/2023]
Abstract
To have confidence in results acquired during biological mass spectrometry experiments, a systematic approach to quality control is of vital importance. Nonetheless, until now, only scattered initiatives have been undertaken to this end, and these individual efforts have often not been complementary. To address this issue, the Human Proteome Organization-Proteomics Standards Initiative has established a new working group on quality control at its meeting in the spring of 2016. The goal of this working group is to provide a unifying framework for quality control data. The initial focus will be on providing a community-driven standardized file format for quality control. For this purpose, the previously proposed qcML format will be adapted to support a variety of use cases for both proteomics and metabolomics applications, and it will be established as an official PSI format. An important consideration is to avoid enforcing restrictive requirements on quality control but instead provide the basic technical necessities required to support extensive quality control for any type of mass spectrometry-based workflow. We want to emphasize that this is an open community effort, and we seek participation from all scientists with an interest in this field.
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Wilrijkstraat 10, 2650 Edegem, Belgium
- Mathias Walzer
  - Department of Computer Science, University of Tübingen, Tübingen 72076, Germany
  - Center for Bioinformatics, University of Tübingen, Tübingen 72074, Germany
- Stefan Tenzer
  - Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz D 55131, Germany
- Weimin Zhu
  - National Center for Protein Science, No. 38, Science Park Road, Changping District, Beijing 102206, China
- Reza M Salek
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Martin Eisenacher
  - Medical Bioinformatics, Medizinisches Proteom-Center, Ruhr-University Bochum, Bochum 44801, Germany
- David L Tabb
  - Division of Molecular Biology and Human Genetics, Stellenbosch University Faculty of Medicine and Health Sciences, Tygerberg Hospital, Francie Van Zijl Drive, Cape Town 7505, South Africa
|
47
|
Bittremieux W, Valkenborg D, Martens L, Laukens K. Computational quality control tools for mass spectrometry proteomics. Proteomics 2016; 17. [DOI: 10.1002/pmic.201600159] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 06/30/2016] [Revised: 07/28/2016] [Accepted: 08/19/2016] [Indexed: 12/30/2022]
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science; University of Antwerp; Antwerp Belgium
  - Biomedical Informatics Research Center Antwerp (biomina); University of Antwerp/Antwerp University Hospital; Edegem Belgium
- Dirk Valkenborg
  - Flemish Institute for Technological Research (VITO); Mol Belgium
  - CFP; University of Antwerp; Antwerp Belgium
  - I-BioStat; Hasselt University; Diepenbeek Belgium
- Lennart Martens
  - Medical Biotechnology Center; VIB; Ghent Belgium
  - Department of Biochemistry, Faculty of Medicine and Health Sciences; Ghent University; Ghent Belgium
  - Bioinformatics Institute Ghent; Ghent University; Zwijnaarde Belgium
- Kris Laukens
  - Department of Mathematics and Computer Science; University of Antwerp; Antwerp Belgium
  - Biomedical Informatics Research Center Antwerp (biomina); University of Antwerp/Antwerp University Hospital; Edegem Belgium
|
48
|
Maes E, Kelchtermans P, Bittremieux W, De Grave K, Degroeve S, Hooyberghs J, Mertens I, Baggerman G, Ramon J, Laukens K, Martens L, Valkenborg D. Designing biomedical proteomics experiments: state-of-the-art and future perspectives. Expert Rev Proteomics 2016; 13:495-511. [PMID: 27031651 DOI: 10.1586/14789450.2016.1172967] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
With the current expanded technical capabilities to perform mass spectrometry-based biomedical proteomics experiments, an improved focus on the design of experiments is crucial. As it is clear that ignoring the importance of a good design leads to an unprecedented rate of false discoveries which would poison our results, more and more tools are developed to help researchers designing proteomic experiments. In this review, we apply statistical thinking to go through the entire proteomics workflow for biomarker discovery and validation and relate the considerations that should be made at the level of hypothesis building, technology selection, experimental design and the optimization of the experimental parameters.
Affiliation(s)
- Evelyne Maes
  - a Applied Bio & molecular systems, VITO, Mol, Belgium
  - b CFP, University of Antwerp, Antwerp, Belgium
- Pieter Kelchtermans
  - b CFP, University of Antwerp, Antwerp, Belgium
  - c Medical Biotechnology Center, VIB, Ghent, Belgium
  - d Department of Biochemistry, Ghent University, Ghent, Belgium
  - e Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Wout Bittremieux
  - f Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - g Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Kurt De Grave
  - h Department of Computer Science, KU Leuven, Leuven, Belgium
- Sven Degroeve
  - c Medical Biotechnology Center, VIB, Ghent, Belgium
  - d Department of Biochemistry, Ghent University, Ghent, Belgium
  - e Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Jef Hooyberghs
  - a Applied Bio & molecular systems, VITO, Mol, Belgium
- Inge Mertens
  - a Applied Bio & molecular systems, VITO, Mol, Belgium
  - b CFP, University of Antwerp, Antwerp, Belgium
- Geert Baggerman
  - a Applied Bio & molecular systems, VITO, Mol, Belgium
  - b CFP, University of Antwerp, Antwerp, Belgium
- Jan Ramon
  - h Department of Computer Science, KU Leuven, Leuven, Belgium
  - i INRIA, Lille, France
- Kris Laukens
  - f Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - g Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Antwerp, Belgium
- Lennart Martens
  - c Medical Biotechnology Center, VIB, Ghent, Belgium
  - d Department of Biochemistry, Ghent University, Ghent, Belgium
  - e Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Dirk Valkenborg
  - a Applied Bio & molecular systems, VITO, Mol, Belgium
  - b CFP, University of Antwerp, Antwerp, Belgium
  - j Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Hasselt, Belgium
|
49
|
Bittremieux W, Meysman P, Martens L, Valkenborg D, Laukens K. Unsupervised Quality Assessment of Mass Spectrometry Proteomics Experiments by Multivariate Quality Control Metrics. J Proteome Res 2016; 15:1300-7. [PMID: 26974716 DOI: 10.1021/acs.jproteome.6b00028] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 01/09/2023]
Abstract
Despite many technological and computational advances, the results of a mass spectrometry proteomics experiment are still subject to a large variability. For the understanding and evaluation of how technical variability affects the results of an experiment, several computationally derived quality control metrics have been introduced. However, despite the availability of these metrics, a systematic approach to quality control is often still lacking because the metrics are not fully understood and are hard to interpret. Here, we present a toolkit of powerful techniques to analyze and interpret multivariate quality control metrics to assess the quality of mass spectrometry proteomics experiments. We show how unsupervised techniques applied to these quality control metrics can provide an initial discrimination between low-quality experiments and high-quality experiments prior to manual investigation. Furthermore, we provide a technique to obtain detailed information on the quality control metrics that are related to the decreased performance, which can be used as actionable information to improve the experimental setup. Our toolkit is released as open-source and can be downloaded from https://bitbucket.org/proteinspector/qc_analysis/ .
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, 2650 Edegem, Belgium
- Pieter Meysman
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, 2650 Edegem, Belgium
- Lennart Martens
  - Department of Medical Protein Research, VIB, 9000 Ghent, Belgium
  - Department of Biochemistry, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, 9000 Ghent, Belgium
- Dirk Valkenborg
  - Flemish Institute for Technological Research (VITO), 2400 Mol, Belgium
  - CFP, University of Antwerp, 2020 Antwerp, Belgium
  - I-BioStat, Hasselt University, 3590 Diepenbeek, Belgium
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, 2650 Edegem, Belgium
|
50
|
Cuykx M, Van den Eede N, Bittremieux W, Laukens K, Dardenne F, Blust R, Covaci A. Optimization of LC-QTOF MS parameters for the coverage of the in vitro HepaRG metabolome. Toxicol Lett 2015. [DOI: 10.1016/j.toxlet.2015.08.696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|