1
Zweigle J, Tisler S, Bevilacqua M, Tomasi G, Nielsen NJ, Gawlitta N, Lübeck JS, Smilde AK, Christensen JH. Prioritization strategies for non-target screening in environmental samples by chromatography - High-resolution mass spectrometry: A tutorial. J Chromatogr A 2025; 1751:465944. [PMID: 40203635 DOI: 10.1016/j.chroma.2025.465944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2025] [Revised: 04/01/2025] [Accepted: 04/03/2025] [Indexed: 04/11/2025]
Abstract
Non-target screening (NTS) using chromatography coupled to high-resolution mass spectrometry (HRMS) has become fundamental for detecting and prioritizing chemicals of emerging concern (CECs) in complex environmental matrices. The vast number of generated features (m/z, retention time, and intensity) necessitates effective prioritization strategies to identify environmentally and toxicologically relevant CECs. Since compound identification remains a major bottleneck in NTS, prioritization is critical to focus identification efforts where they matter most. This tutorial presents seven prioritization strategies: (1) Target and suspect screening for identifying known or suspected compounds using reference libraries. (2) Data quality filtering to apply quality control measures to reduce noise and the number of false positives. (3) Chemistry-driven prioritization using HRMS data properties to prioritize specific compound classes (e.g., halogenated substances, transformation products). (4) Process-driven prioritization using spatial, temporal, or process-based comparisons (pre- and post-technical processes) to identify key features. (5) Effect-Directed Analysis (EDA) and Virtual Effect-Directed Analysis (vEDA) prioritization to link chemical features to biological effects. (6) Prediction-based prioritization such as quantitative structure-property relationships (QSPR) and machine learning to estimate risk or concentration levels. (7) Pixel- or tile-based analysis where the chromatographic image (2D data) is used to pinpoint regions of interest or to compare larger sample sets. By integrating these prioritization strategies, this tutorial provides a structured foundation to evaluate both identified and unidentified features, prioritize high-risk compounds, and advance environmental risk assessment and regulatory decision-making.
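As a rough illustration of how strategies (2) and (4) might be combined, the sketch below first drops features that are not clearly above the procedural blank and then ranks the remaining features by how poorly a treatment process removes them. It is not taken from the tutorial: the `Feature` fields, the blank-ratio threshold of 10, and the 50% removal cut-off are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    mz: float         # mass-to-charge ratio
    rt_min: float     # retention time in minutes
    blank: float      # mean intensity in procedural blanks (hypothetical field)
    influent: float   # mean intensity before the process (hypothetical field)
    effluent: float   # mean intensity after the process (hypothetical field)


def passes_quality_filter(f: Feature, min_blank_ratio: float = 10.0) -> bool:
    """Keep a feature only if its highest sample intensity clearly exceeds the blank."""
    sample_max = max(f.influent, f.effluent)
    if f.blank == 0:
        return sample_max > 0
    return sample_max / f.blank >= min_blank_ratio


def prioritize_persistent(features, max_removal: float = 0.5):
    """Return quality-filtered features with removal <= max_removal, least removed first."""
    kept = []
    for f in features:
        if not passes_quality_filter(f) or f.influent <= 0:
            continue
        removal = 1.0 - f.effluent / f.influent  # fraction removed by the process
        if removal <= max_removal:
            kept.append((removal, f))
    kept.sort(key=lambda pair: pair[0])  # sort on removal only
    return [f for _, f in kept]


features = [
    Feature(200.1, 5.0, blank=100, influent=500, effluent=400),  # too close to the blank
    Feature(300.2, 7.1, blank=10, influent=1000, effluent=900),  # poorly removed
    Feature(150.0, 3.2, blank=0, influent=1000, effluent=100),   # well removed
    Feature(410.3, 9.0, blank=1, influent=800, effluent=800),    # not removed at all
]
prioritized = prioritize_persistent(features)
print([f.mz for f in prioritized])
```

In practice the thresholds would be set from replicate and blank statistics for the study at hand rather than fixed in advance.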
Affiliation(s)
- Jonathan Zweigle
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Selina Tisler
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Marta Bevilacqua
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Giorgio Tomasi
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Nikoline J Nielsen
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Nadine Gawlitta
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Josephine S Lübeck
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Age K Smilde
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
- Jan H Christensen
  - Analytical Chemistry Group, Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
2
Bushuiev R, Bushuiev A, Samusevich R, Brungs C, Sivic J, Pluskal T. Self-supervised learning of molecular representations from millions of tandem mass spectra using DreaMS. Nat Biotechnol 2025:10.1038/s41587-025-02663-3. [PMID: 40410407 DOI: 10.1038/s41587-025-02663-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/22/2024] [Accepted: 03/31/2025] [Indexed: 05/25/2025]
Abstract
Characterizing biological and environmental samples at a molecular level primarily uses tandem mass spectrometry (MS/MS), yet the interpretation of tandem mass spectra from untargeted metabolomics experiments remains a challenge. Existing computational methods for predictions from mass spectra rely on limited spectral libraries and on hard-coded human expertise. Here we introduce a transformer-based neural network pre-trained in a self-supervised way on millions of unannotated tandem mass spectra from our GNPS Experimental Mass Spectra (GeMS) dataset mined from the MassIVE GNPS repository. We show that pre-training our model to predict masked spectral peaks and chromatographic retention orders leads to the emergence of rich representations of molecular structures, which we named Deep Representations Empowering the Annotation of Mass Spectra (DreaMS). Further fine-tuning the neural network yields state-of-the-art performance across a variety of tasks. We make our new dataset and model available to the community and release the DreaMS Atlas, a molecular network of 201 million MS/MS spectra constructed using DreaMS annotations.
Affiliation(s)
- Roman Bushuiev
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University, Prague, Czech Republic
- Anton Bushuiev
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University, Prague, Czech Republic
- Raman Samusevich
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University, Prague, Czech Republic
- Corinna Brungs
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Josef Sivic
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University, Prague, Czech Republic
- Tomáš Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
3
Xu R, Zhu J. Unveiling the dark matter of the metabolome: A narrative review of bioinformatics tools for LC-HRMS-based compound annotation. Talanta 2025; 295:128327. [PMID: 40393240 DOI: 10.1016/j.talanta.2025.128327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2025] [Revised: 05/07/2025] [Accepted: 05/13/2025] [Indexed: 05/22/2025]
Abstract
Compound annotation, including the unveiling of dark matter in metabolomics studies, represents a pivotal undertaking within the metabolomics field, serving as the linchpin for unraveling the identities and attributes of chemical entities. This narrative review examines the evolution of widely adopted compound annotation tools tailored for liquid chromatography-mass spectrometry (LC-MS) data analysis over the past two decades, which has been characterized by a transition from library-based search methodologies to advanced high-throughput approaches. Furthermore, emerging tools originating from both the LC and MS domains are summarized. The synergistic partnership between quantitative structure-retention relationship (QSRR) models and machine learning (ML) techniques is explored, encompassing both conventional methodologies and advanced convolutional neural networks (CNNs). This collaborative framework has played a pivotal role in the precise prediction of retention times. Additionally, the enhanced applicability and extensibility of retention order prediction are emphasized, particularly under the constraints of experimental configurations. Within the domain of mass spectra-based annotation, the foundational task of mapping compound structures to mass spectra is examined, a task traditionally accomplished by aligning experimental data with established standards and libraries. Recent advancements highlight emerging tools that adopt multi-tiered mapping strategies, such as molecular networks and fragmentation trees, or incorporate machine learning to capture complex mapping patterns. This comprehensive examination underscores the pivotal role of compound annotation tools in advancing our understanding of complex LC-MS data matrices and in further assisting the annotation of dark matter in the metabolome.
Affiliation(s)
- Rui Xu
  - Human Nutrition Program, Department of Human Sciences, The Ohio State University, Columbus, OH, 43210, United States; Comprehensive Cancer Center, The Ohio State University, Columbus, OH, 43210, United States
- Jiangjiang Zhu
  - Human Nutrition Program, Department of Human Sciences, The Ohio State University, Columbus, OH, 43210, United States; Comprehensive Cancer Center, The Ohio State University, Columbus, OH, 43210, United States
4
Damiani T, Jarmusch AK, Aron AT, Petras D, Phelan VV, Zhao HN, Bittremieux W, Acharya DD, Ahmed MMA, Bauermeister A, Bertin MJ, Boudreau PD, Borges RM, Bowen BP, Brown CJ, Chagas FO, Clevenger KD, Correia MSP, Crandall WJ, Crüsemann M, Fahy E, Fiehn O, Garg N, Gerwick WH, Gilbert JR, Globisch D, Gomes PWP, Heuckeroth S, James CA, Jarmusch SA, Kakhkhorov SA, Kang KB, Kessler N, Kersten RD, Kim H, Kirk RD, Kohlbacher O, Kontou EE, Liu K, Lizama-Chamu I, Luu GT, Luzzatto Knaan T, Mannochio-Russo H, Marty MT, Matsuzawa Y, McAvoy AC, McCall LI, Mohamed OG, Nahor O, Neuweger H, Niedermeyer THJ, Nishida K, Northen TR, Overdahl KE, Rainer J, Reher R, Rodriguez E, Sachsenberg TT, Sanchez LM, Schmid R, Stevens C, Subramaniam S, Tian Z, Tripathi A, Tsugawa H, van der Hooft JJJ, Vicini A, Walter A, Weber T, Xiong Q, Xu T, Pluskal T, Dorrestein PC, Wang M. A universal language for finding mass spectrometry data patterns. Nat Methods 2025:10.1038/s41592-025-02660-z. [PMID: 40355727 DOI: 10.1038/s41592-025-02660-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 03/03/2025] [Indexed: 05/14/2025]
Abstract
Despite being information rich, the vast majority of untargeted mass spectrometry data are underutilized; most analytes are not used for downstream interpretation or reanalysis after publication. The inability to dive into these rich raw mass spectrometry datasets is due to the limited flexibility and scalability of existing software tools. Here we introduce a new language, the Mass Spectrometry Query Language (MassQL), and an accompanying software ecosystem that addresses these issues by enabling the community to directly query mass spectrometry data with an expressive set of user-defined mass spectrometry patterns. Illustrated by real-world examples, MassQL provides a data-driven definition of chemical diversity by enabling the reanalysis of all public untargeted metabolomics data, empowering scientists across many disciplines to make new discoveries. MassQL has been widely implemented in multiple open-source and commercial mass spectrometry analysis tools, which enhances the ability, interoperability and reproducibility of mining of mass spectrometry data for the research community.
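The idea of querying spectra with a user-defined pattern can be illustrated with a small plain-Python stand-in. This is not MassQL syntax and not part of the MassQL software: the spectrum dictionaries, the function names, and the tolerance and intensity defaults are all assumptions made for the example. The pattern here is "MS2 scans that contain a product ion near a given m/z at a minimum relative intensity".

```python
def has_peak(peaks, target_mz, tol_mz=0.01, min_rel_intensity=0.0):
    """True if any peak lies within tol_mz of target_mz and reaches the
    required intensity relative to the base peak of the spectrum."""
    if not peaks:
        return False
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return False
    return any(
        abs(mz - target_mz) <= tol_mz and intensity / base >= min_rel_intensity
        for mz, intensity in peaks
    )


def query_spectra(spectra, product_mz, tol_mz=0.01, min_rel_intensity=0.05):
    """Return scan numbers of MS2 spectra that match the product-ion pattern."""
    return [
        s["scan"]
        for s in spectra
        if s["ms_level"] == 2
        and has_peak(s["peaks"], product_mz, tol_mz, min_rel_intensity)
    ]


# Hypothetical spectra: (m/z, intensity) pairs per scan.
spectra = [
    {"scan": 1, "ms_level": 1, "peaks": [(226.18, 1000.0)]},                    # MS1, ignored
    {"scan": 2, "ms_level": 2, "peaks": [(120.08, 300.0), (226.185, 900.0)]},   # matches
    {"scan": 3, "ms_level": 2, "peaks": [(226.18, 10.0), (300.00, 1000.0)]},    # too weak (1%)
    {"scan": 4, "ms_level": 2, "peaks": [(226.30, 800.0)]},                     # outside tolerance
]
matches = query_spectra(spectra, product_mz=226.18)
print(matches)
```

A dedicated query language replaces this kind of hand-written filter with a declarative statement that the same tools can run across many datasets.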
Affiliation(s)
- Tito Damiani
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Alan K Jarmusch
  - Metabolomics Core Facility, Immunity, Inflammation, and Disease Laboratory, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, USA
- Allegra T Aron
  - Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
- Daniel Petras
  - Functional Metabolomics Lab, CMFI Cluster of Excellence, University of Tuebingen, Tuebingen, Germany
  - Department of Biochemistry, University of California Riverside, Riverside, CA, USA
- Vanessa V Phelan
  - Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Haoqi Nina Zhao
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Wout Bittremieux
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Deepa D Acharya
  - Biologicals and Natural Products Discovery, Crop Protection R&D, Corteva Agrisciences, Indianapolis, IN, USA
- Mohammed M A Ahmed
  - BioMolecular Sciences, School of Pharmacy, University of Mississippi, Oxford, MS, USA
  - Pharmacognosy, Faculty of Pharmacy, Al-Azhar University, Nasr City, Egypt
- Anelize Bauermeister
  - Department of Fundamental Chemistry, Institute of Chemistry, University of São Paulo, São Paulo, Brazil
- Matthew J Bertin
  - Department of Chemistry, Case Western Reserve University, Cleveland, OH, USA
- Paul D Boudreau
  - BioMolecular Sciences, School of Pharmacy, University of Mississippi, Oxford, MS, USA
- Ricardo M Borges
  - Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Benjamin P Bowen
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, USA
  - The Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, USA
- Christopher J Brown
  - Mass Spectrometry Center of Expertise, Regulatory and Stewardship, Corteva Agrisciences, Indianapolis, IN, USA
- Fernanda O Chagas
  - Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Kenneth D Clevenger
  - Biologicals and Natural Products, Crop Protection R&D, Corteva Agrisciences, Indianapolis, IN, USA
- Mario S P Correia
  - Department of Chemistry - BMC, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- William J Crandall
  - Clinical Biomarkers Laboratory, School of Medicine, Emory University, Atlanta, GA, USA
- Max Crüsemann
  - Institute of Pharmaceutical Biology, University of Bonn, Bonn, Germany
  - Institute of Pharmaceutical Biology, Goethe University Frankfurt, Frankfurt, Germany
- Eoin Fahy
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Oliver Fiehn
  - West Coast Metabolomics Center, University of California Davis, Davis, CA, USA
- Neha Garg
  - School of Chemistry and Biochemistry, Center for Microbial Dynamics and Infection, Georgia Institute of Technology, Atlanta, GA, USA
- William H Gerwick
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
  - Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Jeffrey R Gilbert
  - Mass Spectrometry Center of Expertise, Regulatory and Stewardship, Corteva Agrisciences, Indianapolis, IN, USA
- Daniel Globisch
  - Department of Chemistry - BMC, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Paulo Wender P Gomes
  - Faculty of Chemistry, Institute of Exact and Natural Science, Federal University of Para, Belem, Brazil
- Steffen Heuckeroth
  - Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany
- C Andrew James
  - Center for Urban Waters, University of Washington, Tacoma, WA, USA
- Scott A Jarmusch
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
- Sarvar A Kakhkhorov
  - Laboratory of Physical and Chemical Methods of Research, Center for Advanced Technologies, Tashkent, Uzbekistan
- Kyo Bin Kang
  - College of Pharmacy, Sookmyung Women's University, Seoul, Republic of Korea
- Nikolas Kessler
  - SW R&D Bioinformatics, Life Science Mass Spectrometry, Bruker Daltonics GmbH & Co. KG, Bremen, Germany
- Roland D Kersten
  - Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, Ann Arbor, MI, USA
- Hyunwoo Kim
  - College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University-Seoul, Goyang, Republic of Korea
- Riley D Kirk
  - College of Pharmacy, University of Rhode Island, Kingston, RI, USA
- Oliver Kohlbacher
  - Applied Bioinformatics, Department of Computer Science, University of Tuebingen; Institute for Bioinformatics and Medical Informatics, University of Tuebingen; Institute for Translational Bioinformatics, University Hospital Tuebingen, Tübingen, Germany
- Eftychia E Kontou
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- Ken Liu
  - Clinical Biomarkers Laboratory, School of Medicine, Emory University, Atlanta, GA, USA
- Itzel Lizama-Chamu
  - Department of Chemistry and Biochemistry, UC Santa Cruz, Santa Cruz, CA, USA
- Gordon T Luu
  - Department of Chemistry and Biochemistry, UC Santa Cruz, Santa Cruz, CA, USA
- Tal Luzzatto Knaan
  - Department of Marine Biology, The Leon H. Charney School of Marine Sciences, University of Haifa, Haifa, Israel
- Helena Mannochio-Russo
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Michael T Marty
  - Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, USA
- Yuki Matsuzawa
  - Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, Koganei, Japan
- Andrew C McAvoy
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA
- Laura-Isobel McCall
  - Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA, USA
- Osama G Mohamed
  - Pharmacognosy Department, Faculty of Pharmacy, Cairo University, Cairo, Egypt
  - Natural Products Discovery Core, Life Sciences Institute, University of Michigan, Ann Arbor, MI, USA
- Omri Nahor
  - Department of Marine Biology, The Leon H. Charney School of Marine Sciences, University of Haifa, Haifa, Israel
- Heiko Neuweger
  - SW R&D Bioinformatics, Life Science Mass Spectrometry, Bruker Daltonics GmbH & Co. KG, Bremen, Germany
- Kozo Nishida
  - Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, Koganei, Japan
- Trent R Northen
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, USA
  - The Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, USA
- Kirsten E Overdahl
  - Metabolomics Core Facility, Immunity, Inflammation, and Disease Laboratory, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC, USA
- Raphael Reher
  - Department of Pharmacy, University of Marburg, Marburg, Germany
- Elys Rodriguez
  - West Coast Metabolomics Center, University of California Davis, Davis, CA, USA
- Timo T Sachsenberg
  - Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Laura M Sanchez
  - Department of Chemistry and Biochemistry, UC Santa Cruz, Santa Cruz, CA, USA
- Robin Schmid
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Cole Stevens
  - Department of BioMolecular Sciences, School of Pharmacy, University of Mississippi, Oxford, MS, USA
- Shankar Subramaniam
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Zhenyu Tian
  - Chemistry and Chemical Biology, Northeastern University, Boston, MA, USA
- Ashootosh Tripathi
  - Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, Ann Arbor, MI, USA
  - Natural Products Discovery Core, Life Sciences Institute, University of Michigan, Ann Arbor, MI, USA
- Hiroshi Tsugawa
  - Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, Koganei, Japan
  - RIKEN Center for Integrative Medical Sciences, Tsurumi-ku, Japan
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Japan
- Justin J J van der Hooft
  - Bioinformatics Group, Wageningen University & Research, Wageningen, the Netherlands
  - Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
- Andrea Vicini
  - Institute for Biomedicine, Eurac Research, Bolzano, Italy
- Axel Walter
  - Applied Bioinformatics, Department of Computer Science, University of Tuebingen, Tübingen, Germany
- Tilmann Weber
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- Quanbo Xiong
  - Crop Protection R&D, Corteva Agrisciences, Indianapolis, IN, USA
- Tao Xu
  - Data Science and Bioinformatics, Corteva Agrisciences, Dublin, OH, USA
- Tomáš Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
- Pieter C Dorrestein
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
- Mingxun Wang
  - Department of Computer Science, University of California Riverside, Riverside, CA, USA
5
Bjärterot P, Nilsson A, Shariatgorji R, Vallianatou T, Kaya I, Svenningsson P, Käll L, Andrén PE. Met-ID: An Open-Source Software for Comprehensive Annotation of Multiple On-Tissue Chemical Modifications in MALDI-MSI. Anal Chem 2025; 97:9033-9041. [PMID: 40253716 PMCID: PMC12044586 DOI: 10.1021/acs.analchem.5c00633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2025] [Revised: 03/20/2025] [Accepted: 03/25/2025] [Indexed: 04/22/2025]
Abstract
Here, we introduce Met-ID, a graphical user interface software designed to efficiently identify metabolites from MALDI-MSI data sets. Met-ID enables annotation of m/z features from any type of MALDI-MSI experiment, involving either derivatizing or conventional matrices. It utilizes structural information for derivatizing matrices to generate a subset of targets that contain only functional groups specific to the derivatization agent. The software is able to identify multiple derivatization sites on the same molecule, facilitating identification of the derivatized compound. This ability is exemplified by FMP-10, a reactive matrix that assists the covalent charge-tagging of molecules containing phenolic hydroxyl and/or primary or secondary amine groups. Met-ID also permits users to recalibrate data with known m/z ratios, boosting confidence in mass match results. Furthermore, Met-ID includes a database featuring MS2 spectra of numerous chemical standards, consisting of neurotransmitters and metabolites derivatized with FMP-10, alongside peaks for FMP-10 itself, all accessible directly through the software. The MS2 spectral database supports user-uploaded spectra and enables comparison of these spectra with user-provided tissue MS2 spectra for similarity assessment. Although initially installed with basic data, Met-ID is designed to be customizable, encouraging users to tailor the software to their specific needs. While several MSI-oriented software solutions exist, Met-ID combines both MS1 and MS2 functionalities. Developed in alignment with the FAIR Guiding Principles for scientific software, Met-ID is freely available as an open-source tool on GitHub, ensuring wide accessibility and collaboration.
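The mass-matching step mentioned above can be pictured with a generic ppm-tolerance comparison. The sketch below is not Met-ID code: the candidate names and m/z values are invented placeholders, the 5 ppm tolerance is an assumption, and the theoretical m/z of each derivatized form (one or more derivatization sites) is assumed to have been computed elsewhere.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_features(observed, candidates, tol_ppm: float = 5.0):
    """Pair each observed m/z with every candidate within tol_ppm.

    `candidates` maps a label (for example a compound with one or two
    derivatization sites occupied) to its theoretical m/z.
    """
    hits = []
    for obs in observed:
        for name, theo in candidates.items():
            err = ppm_error(obs, theo)
            if abs(err) <= tol_ppm:
                hits.append((obs, name, round(err, 2)))
    return hits


# Placeholder values for illustration only, not real compound masses.
candidates = {
    "compound A, 1 site": 400.0000,
    "compound A, 2 sites": 650.0000,
}
observed = [400.0010, 650.0100, 500.0000]
hits = match_features(observed, candidates)
print(hits)
```

Recalibrating the spectrum against known m/z values before matching, as the abstract describes, would shrink the systematic part of the error and allow a tighter tolerance.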
Affiliation(s)
- Patrik Bjärterot
  - Department of Pharmaceutical Biosciences, Spatial Mass Spectrometry, Science for Life Laboratory, Uppsala University, SE-75124 Uppsala, Sweden
- Anna Nilsson
  - Department of Pharmaceutical Biosciences, Spatial Mass Spectrometry, Science for Life Laboratory, Uppsala University, SE-75124 Uppsala, Sweden
- Reza Shariatgorji
  - Department of Pharmaceutical Biosciences, Spatial Mass Spectrometry, Science for Life Laboratory, Uppsala University, SE-75124 Uppsala, Sweden
- Theodosia Vallianatou
  - Department of Pharmaceutical Biosciences, Spatial Mass Spectrometry, Science for Life Laboratory, Uppsala University, SE-75124 Uppsala, Sweden
- Ibrahim Kaya
  - Department of Pharmaceutical Biosciences, Spatial Mass Spectrometry, Science for Life Laboratory, Uppsala University, SE-75124 Uppsala, Sweden
- Per Svenningsson
  - Department of Clinical Neuroscience, Karolinska Institute, SE-17177 Stockholm, Sweden
- Lukas Käll
  - Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Royal Institute of Technology-KTH, SE-17165 Solna, Sweden
- Per E. Andrén
  - Department of Pharmaceutical Biosciences, Spatial Mass Spectrometry, Science for Life Laboratory, Uppsala University, SE-75124 Uppsala, Sweden
6
Szwarc S, Rutz A, Lee K, Mejri Y, Bonnet O, Hazni H, Jagora A, Mbeng Obame RB, Noh JK, Otogo N'Nang E, Alaribe SC, Awang K, Bernadat G, Choi YH, Courdavault V, Frederich M, Gaslonde T, Huber F, Kam TS, Low YY, Poupon E, van der Hooft JJJ, Kang KB, Le Pogam P, Beniddir MA. Translating community-wide spectral library into actionable chemical knowledge: a proof of concept with monoterpene indole alkaloids. J Cheminform 2025; 17:62. [PMID: 40296170 PMCID: PMC12039057 DOI: 10.1186/s13321-025-01009-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2025] [Accepted: 04/02/2025] [Indexed: 04/30/2025]
Abstract
With over 3000 representatives, the monoterpene indole alkaloids (MIAs) class is among the most diverse families of plant natural products. The MS/MS spectral space exploration of these complex compounds using chemoinformatic and computational mass spectrometry tools offers a valuable opportunity to extract and share chemical insights from this emblematic family of natural products (NPs). In this work, we first present a substantially updated version of the MIADB, a database that has been uploaded to the GNPS library and now contains 422 MS/MS spectra of MIAs, up from 172 initial entries. We then introduce an innovative workflow that leverages hundreds of fragmentation spectra to support the FAIRification, extraction and dissemination of chemical knowledge. This workflow aims at the extraction of spectral patterns matching finely defined MIA skeletons. These extracted signatures can then be queried against complex biological extract datasets using MassQL. By applying this strategy to an LC-MS/MS dataset of 75 plant extracts, our results demonstrated the efficiency of this approach in identifying the diversity of MIA skeletons present in the analyzed samples. Additionally, our work enabled the digitization of structural data for diverse MIA skeletons by converting them into machine-readable formats, thereby enhancing their dissemination for the scientific community.
Scientific contribution: A comprehensive investigation of the monoterpene indole alkaloid chemical space, aiming to highlight skeleton-dependent fragmentation similarity trends and to generate valuable spectrometric signatures that could be used as queries.
Affiliation(s)
- Sarah Szwarc
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
- Adriano Rutz
  - Institute of Molecular Systems Biology, ETH Zürich, 8093, Zurich, Switzerland
- Kyungha Lee
  - College of Pharmacy and Research Institute of Pharmaceutical Sciences, Sookmyung Women's University, Seoul, 04310, Republic of Korea
- Yassine Mejri
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
  - Université Paris-Dauphine, PSL Research University, CNRS, LAMSADE, 75016, Paris, France
- Olivier Bonnet
  - Laboratory of Pharmacognosy, Center of Interdisciplinary Research On Medicines (CIRM), University of Liège, Liège, Belgium
- Hazrina Hazni
  - Department of Chemistry, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Adrien Jagora
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
- Rany B Mbeng Obame
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
- Jin Kyoung Noh
  - El Batan, Instituto de BioEconomia, Quito, 170135, Ecuador
- Elvis Otogo N'Nang
  - Département Science Fondamentale, Service Chimie-Biochimie, Université Des Sciences de La Santé, Owendo, Gabon
- Stephenie C Alaribe
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, College of Medicine, University of Lagos, Idiaraba Campus, Surulere, Lagos, Nigeria
- Khalijah Awang
  - Department of Chemistry, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Guillaume Bernadat
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
- Young Hae Choi
  - Natural Products Laboratory, Institute of Biology, Leiden University, Sylviusweg 72, 2333 BE, Leiden, the Netherlands
- Vincent Courdavault
  - EA2106 Biomolécules et Biotechnologies Végétales, Université de Tours, 31 Avenue Monge, 37200, Tours, France
- Michel Frederich
  - Laboratory of Pharmacognosy, Center of Interdisciplinary Research On Medicines (CIRM), University of Liège, Liège, Belgium
- Thomas Gaslonde
  - UMR 8038 CiTCoM, Faculté de Santé, Université Paris Cité, CNRS, 75006, Paris, France
- Florian Huber
  - Centre for Digitalisation and Digitality, Düsseldorf University of Applied Sciences, 40476, Düsseldorf, Germany
- Toh-Seok Kam
  - Department of Chemistry, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Yun Yee Low
  - Department of Chemistry, Faculty of Science, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Erwan Poupon
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
- Justin J J van der Hooft
  - Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, the Netherlands
  - Department of Biochemistry, University of Johannesburg, Johannesburg, 2006, South Africa
- Kyo Bin Kang
  - College of Pharmacy and Research Institute of Pharmaceutical Sciences, Sookmyung Women's University, Seoul, 04310, Republic of Korea
- Pierre Le Pogam
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
- Mehdi A Beniddir
  - Équipe, Chimie des Substances Naturelles, Université Paris-Saclay, CNRS, BioCIS, 17 avenue des Sciences, 91400, Orsay, France
7
|
Wu Q, Song D, Zhao Y, Verdegaal AA, Turocy T, Duncan-Lowey B, Goodman AL, Palm NW, Crawford JM. Activity of GPCR-targeted drugs influenced by human gut microbiota metabolism. Nat Chem 2025:10.1038/s41557-025-01789-w. [PMID: 40181149 DOI: 10.1038/s41557-025-01789-w] [Received: 04/02/2024] [Accepted: 02/24/2025] [Indexed: 04/05/2025]
Abstract
Microbiota-mediated drug metabolism can affect pharmacological efficacy. Here we conducted a systematic comparative metabolomics investigation of drug metabolism modes by evaluating the impacts of human gut commensal bacteria on 127 G-protein-coupled receptor (GPCR)-targeted drugs. For the most extensively metabolized drugs in our screen, we elucidated both conventional and unconventional drug transformations and the corresponding activities of generated metabolites. Comparisons of drug metabolism by a gut microbial community versus individual species revealed both taxon intrinsic and collaborative processes that influenced the activity of the metabolized drugs against target GPCRs. We also observed iloperidone inactivation by generating unconventional metabolites. The human gut commensal bacteria mixture incorporated sulfur in the form of a thiophene motif, whereas Morganella morganii used a cascade reaction to incorporate amino-acid-derived tricyclic systems into the drug metabolites. Our results reveal a broad impact of human gut commensal bacteria on GPCR-targeted drug structures and activities through diverse microbiota-mediated biotransformations.
Affiliation(s)
- Qihao Wu
- Department of Chemistry, Yale University, New Haven, CT, USA
- Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
- Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Deguang Song
- Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA
| | - Yanyu Zhao
- Department of Chemistry, Yale University, New Haven, CT, USA
- Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
| | - Andrew A Verdegaal
- Department of Microbial Pathogenesis, Yale University School of Medicine, New Haven, CT, USA
- Microbial Sciences Institute, Yale University, West Haven, CT, USA
| | - Tayah Turocy
- Department of Chemistry, Yale University, New Haven, CT, USA
- Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA
| | - Brianna Duncan-Lowey
- Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA
| | - Andrew L Goodman
- Department of Microbial Pathogenesis, Yale University School of Medicine, New Haven, CT, USA.
- Microbial Sciences Institute, Yale University, West Haven, CT, USA.
| | - Noah W Palm
- Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA.
| | - Jason M Crawford
- Department of Chemistry, Yale University, New Haven, CT, USA.
- Institute of Biomolecular Design and Discovery, Yale University, West Haven, CT, USA.
- Department of Microbial Pathogenesis, Yale University School of Medicine, New Haven, CT, USA.
|
8
|
Nowatzky Y, Russo FF, Lisec J, Kister A, Reinert K, Muth T, Benner P. FIORA: Local neighborhood-based prediction of compound mass spectra from single fragmentation events. Nat Commun 2025; 16:2298. [PMID: 40055306 PMCID: PMC11889238 DOI: 10.1038/s41467-025-57422-4] [Received: 05/16/2024] [Accepted: 02/20/2025] [Indexed: 05/13/2025]
Abstract
Non-targeted metabolomics holds great promise for advancing precision medicine and biomarker discovery. However, identifying compounds from tandem mass spectra remains a challenging task due to the incomplete nature of spectral reference libraries. Augmenting these libraries with simulated mass spectra can provide the necessary references to resolve unmatched spectra, but generating high-quality data is difficult. In this study, we present FIORA, an open-source graph neural network designed to simulate tandem mass spectra. Our main contribution lies in utilizing the molecular neighborhood of bonds to learn breaking patterns and derive fragment ion probabilities. FIORA not only surpasses state-of-the-art fragmentation algorithms, ICEBERG and CFM-ID, in prediction quality, but also facilitates the prediction of additional features, such as retention time and collision cross section. Utilizing GPU acceleration, FIORA enables rapid validation of putative compound annotations and large-scale expansion of spectral reference libraries with high-quality predictions.
Affiliation(s)
- Yannek Nowatzky
- Section VP.1 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin, Germany
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
| | - Francesco Friedrich Russo
- Department of Analytical Chemistry and Reference Materials, Organic Trace Analysis and Food Analysis, Federal Institute for Materials Research and Testing (BAM), Berlin, Germany
- Institute of Pharmacy, Freie Universität Berlin, Berlin, Germany
| | - Jan Lisec
- Department of Analytical Chemistry and Reference Materials, Organic Trace Analysis and Food Analysis, Federal Institute for Materials Research and Testing (BAM), Berlin, Germany
| | - Alexander Kister
- Section VP.1 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin, Germany
| | - Knut Reinert
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Thilo Muth
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Data Competence Center MF 2, Robert Koch Institute, Berlin, Germany
| | - Philipp Benner
- Section VP.1 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin, Germany.
|
9
|
Onoprishvili T, Yuan JH, Petrov K, Ingalalli V, Khederlarian L, Leuchtenmuller N, Chandra S, Duarte A, Bender A, Gloaguen Y. SimMS: a GPU-accelerated cosine similarity implementation for tandem mass spectrometry. Bioinformatics 2025; 41:btaf081. [PMID: 39977359 PMCID: PMC11886821 DOI: 10.1093/bioinformatics/btaf081] [Received: 08/12/2024] [Revised: 01/17/2025] [Accepted: 02/18/2025] [Indexed: 02/22/2025]
Abstract
MOTIVATION: Untargeted metabolomics involves a large-scale comparison of the fragmentation pattern of a mass spectrum against a database containing known spectra. Given the number of comparisons involved, this step can be time-consuming. RESULTS: In this work, we present a GPU-accelerated cosine similarity implementation for tandem mass spectrometry (MS/MS), with an approximately 1000-fold speedup compared to the MatchMS reference implementation, without any loss of accuracy. This improvement enables repository-scale spectral library matching for compound identification without the need for large compute clusters. The benefit extends to any spectral comparison-based method, such as molecular networking and analogue search. AVAILABILITY AND IMPLEMENTATION: All supporting code, results, and notebooks are freely available under the MIT license at https://github.com/pangeAI/simms/.
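For readers unfamiliar with the score being accelerated, the following is a minimal CPU sketch of a greedy cosine similarity between two MS/MS spectra. It is not the SimMS or MatchMS code; the function name `cosine_greedy`, the `(mz, intensity)` tuple representation, and the 0.1 Da tolerance are assumptions made for illustration.

```python
import math

def cosine_greedy(spec_a, spec_b, tolerance=0.1):
    """Greedy cosine score between two spectra given as lists of (mz, intensity).

    Illustrative sketch only: names and the default tolerance are assumptions.
    """
    # Collect every candidate peak pair whose m/z values agree within the tolerance.
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tolerance:
                candidates.append((int_a * int_b, i, j))

    # Assign pairs greedily, largest intensity product first, using each peak once.
    candidates.sort(reverse=True)
    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    # Normalize by the Euclidean norms of the two intensity vectors.
    norm_a = math.sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The nested loop is quadratic in the number of peaks for each spectrum pair, which gives a sense of why scoring many queries against a large library becomes expensive.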
Affiliation(s)
- Kamen Petrov
- Pangea Botanica Germany GmbH, Berlin 10623, Germany
| | - Sona Chandra
- Pangea Botanica Germany GmbH, Berlin 10623, Germany
|
10
|
Charron-Lamoureux V, Mannochio-Russo H, Lamichhane S, Xing S, Patan A, Portal Gomes PW, Rajkumar P, Deleray V, Caraballo-Rodríguez AM, Chua KV, Lee LS, Liu Z, Ching J, Wang M, Dorrestein PC. A guide to reverse metabolomics-a framework for big data discovery strategy. Nat Protoc 2025:10.1038/s41596-024-01136-2. [PMID: 40021805 DOI: 10.1038/s41596-024-01136-2] [Received: 01/30/2024] [Accepted: 12/17/2024] [Indexed: 03/03/2025]
Abstract
Untargeted metabolomics is evolving into a field of big data science. There is a growing interest within the metabolomics community in mining tandem mass spectrometry (MS/MS)-based data from public repositories. In traditional untargeted metabolomics, samples to address a predefined question are collected and liquid chromatography with MS/MS data are generated. We then identify metabolites associated with a phenotype (for example, disease versus healthy) and elucidate or validate their structural details (for example, molecular formula, structural classification, substructure or complete structural annotation or identification). In reverse metabolomics, we start with MS/MS spectra for known or unknown molecules. These spectra are used as search terms to search public data repositories to discover phenotype-relevant information such as organ/biofluid distribution, disease condition, intervention status (for example, pre- and postintervention), organisms (for example, mammals versus others), geography and any other biologically relevant associations. Here we guide the reader through a four-part process: (1) obtaining the MS/MS spectra of interest (Universal Spectrum Identifier) and (2) Mass Spectrometry Search Tool searches to find the files associated with the MS/MS that are in available databases, (3) using the Reanalysis Data User Interface framework to link the files with their metadata and (4) validating the observations. Parts 1-3 could take from hours to days depending on the method used for collecting MS/MS spectra. For example, we use MS/MS spectra from three small molecules: phenylalanine-cholic acid (a microbially conjugated bile acid), phenylalanine-C4:0 and histidine-C4:0 (two N-acyl amides). We leverage the Global Natural Products Social Molecular Networking-based framework to explore the microbial producers of these molecules and their associations with health conditions and organ distributions in humans and rodents.
Affiliation(s)
- Vincent Charron-Lamoureux
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Helena Mannochio-Russo
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Santosh Lamichhane
- Turku Bioscience Center, University of Turku and Åbo Akademi University, Turku, Finland
| | - Shipei Xing
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Abubaker Patan
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Paulo Wender Portal Gomes
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Prajit Rajkumar
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Victoria Deleray
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Andrés Mauricio Caraballo-Rodríguez
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Kee Voon Chua
- Cardiovascular and Metabolic Disorders Programme, Duke-NUS Medical School, Singapore, Singapore
| | - Lye Siang Lee
- Cardiovascular and Metabolic Disorders Programme, Duke-NUS Medical School, Singapore, Singapore
| | - Zhao Liu
- Cardiovascular and Metabolic Disorders Programme, Duke-NUS Medical School, Singapore, Singapore
| | - Jianhong Ching
- Cardiovascular and Metabolic Disorders Programme, Duke-NUS Medical School, Singapore, Singapore
- KK Research Centre, KK Women's and Children's Hospital, Singapore, Singapore
| | - Mingxun Wang
- Department of Computer Science, University of California Riverside, Riverside, CA, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA.
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA.
|
11
|
Martin M, Bittremieux W, Hassoun S. Molecular Structure Discovery for Untargeted Metabolomics Using Biotransformation Rules and Global Molecular Networking. Anal Chem 2025; 97:3213-3219. [PMID: 39903752 PMCID: PMC11841678 DOI: 10.1021/acs.analchem.4c01565] [Received: 03/24/2024] [Revised: 12/18/2024] [Accepted: 12/22/2024] [Indexed: 02/06/2025]
Abstract
Although untargeted mass spectrometry-based metabolomics is crucial for understanding life's molecular underpinnings, its effectiveness is hampered by low annotation rates of the generated tandem mass spectra. To address this issue, we introduce a novel data-driven approach, Biotransformation-based Annotation Method (BAM), that leverages molecular structural similarities inherent in biochemical reactions. BAM operates by applying biotransformation rules to known "anchor" molecules, which exhibit high spectral similarity to unknown spectra, thereby hypothesizing and ranking potential structures for the corresponding "suspect" molecule. BAM's effectiveness is demonstrated by its success in annotating query spectra in a global molecular network comprising hundreds of millions of spectra. BAM was able to assign correct molecular structures to 24.2% of examined anchor-suspect cases, thereby demonstrating remarkable advancement in metabolite annotation.
Affiliation(s)
- Margaret R. Martin
- Department of Computer Science, Tufts University, Medford, Massachusetts 02155, United States
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
| | - Soha Hassoun
- Department of Computer Science, Tufts University, Medford, Massachusetts 02155, United States
- Department of Chemical and Biological Engineering, Tufts University, Medford, Massachusetts 02155, United States
|
12
|
Guan X, Bu F, Fu Y, Zhang H, Xiang H, Chen X, Chen T, Wu X, Wu K, Liu L, Dong X. Immunogenic peptides putatively from intratumor microbes: Opportunities for colorectal cancer treatment. iScience 2024; 27:111338. [PMID: 39640572 PMCID: PMC11617993 DOI: 10.1016/j.isci.2024.111338] [Received: 04/18/2024] [Revised: 07/23/2024] [Accepted: 11/04/2024] [Indexed: 12/07/2024]
Abstract
Recent evidence has confirmed the presence of intratumor microbes, yet their impact on the immunopeptidome remains largely unexplored. Here we introduce an integrated strategy to identify the immunopeptidome originating from intratumor microbes. Analyzing samples from 10 colorectal cancer (CRC) patients, we identified 154 putative microbe-derived human leukocyte antigen (HLA)-I ligands. Predominantly bacterial in origin, these peptides were notably abundant in Fusobacterium nucleatum, the most prevalent bacterium differentiating between normal and tumor tissues. We discovered 20 peptides originating from F. nucleatum, 13 of which, including two peptides shared across multiple patients, were tumor specific. Validation experiments confirmed that the putative microbe-derived peptide could activate CD8+ T cell responses. Our findings indicate that HLA-I molecules are capable of presenting intratumor microbe-derived peptides in CRC, potentially contributing to CD8+ T cell-mediated immunity and suggesting potential strategies for cancer immunotherapy.
Affiliation(s)
- Xiangyu Guan
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
| | - Fanyu Bu
- BGI Research, Hangzhou 310030, China
| | - Yunyun Fu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Hangzhou 310030, China
| | - Haibo Zhang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Hangzhou 310030, China
| | - Xinle Chen
- BGI Research, Hangzhou 310030, China
- Center for Mitochondrial Biology and Medicine, The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, China
| | - Tai Chen
- BGI Research, Changzhou 213299, China
| | - Xiaojian Wu
- The Sixth Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510655, China
| | - Kui Wu
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
- Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI Research, Shenzhen 518083, China
- HIM-BGI Omics Center, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences (CAS), Hangzhou 310022, China
| | - Longqi Liu
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
| | - Xuan Dong
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
- Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI Research, Shenzhen 518083, China
- HIM-BGI Omics Center, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences (CAS), Hangzhou 310022, China
|
13
|
Cho YB, Kim JG, Han JS, An BK, Lee D, Lee MK, Hwang BY. LC-HRMS/MS-Guided Isolation of Unusual Diarylheptanoids from the Rhizomes of Alpinia officinarum. ACS Omega 2024; 9:46484-46491. [PMID: 39583693 PMCID: PMC11579737 DOI: 10.1021/acsomega.4c07987] [Received: 08/30/2024] [Revised: 10/24/2024] [Accepted: 10/29/2024] [Indexed: 11/26/2024]
Abstract
LC-HRMS/MS analysis facilitated the precise targeting, isolation, and identification of unusual dimeric diarylheptanoids from Alpinia officinarum (A. officinarum). The tandem MS data for (4E)-1,7-diphenyl-4-hepten-3-one (7) revealed fragment ions at m/z 91, 105, and 117, which are fragmentation patterns specific to diarylheptanoids. In the tandem MS data, peaks with m/z values ranging from 450 to 600 that exhibited these specific fragment ions were selected and isolated. Consequently, two previously undescribed dimeric diarylheptanoids (1 and 2) and four unusual diarylheptanoids (3-6) along with 10 monomeric diarylheptanoids (7-16) were isolated from the rhizomes of A. officinarum using various chromatographic techniques. The structures of the isolates were elucidated by an analysis of 1D/2D NMR and HRESIMS data, and a combination of DP4+ probability analysis and ECD calculations. To evaluate the anti-inflammatory effects of the isolated compounds, their inhibitory activity against nitric oxide production in LPS-induced RAW 264.7 cells was assessed. Compounds 1, 7, and 9 exhibited remarkable inhibitory effects with IC50 values of 14.7, 6.6, and 5.0 μM, respectively.
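The selection rule described in this abstract (precursor m/z between 450 and 600, with fragment ions at m/z 91, 105, and 117) can be sketched as a simple filter. This is only an illustration: the feature dictionaries, the function names, the 0.5 Da tolerance, and the requirement that all three ions be present are assumptions, not details taken from the paper.

```python
# Nominal diagnostic fragment m/z values quoted in the abstract.
DIAGNOSTIC_MZ = (91.0, 105.0, 117.0)

def has_diagnostic_ions(fragment_mzs, tolerance=0.5, min_hits=3):
    """Return True if at least `min_hits` diagnostic ions appear in the spectrum.

    `tolerance` and `min_hits` are illustrative assumptions.
    """
    hits = sum(
        any(abs(mz - target) <= tolerance for mz in fragment_mzs)
        for target in DIAGNOSTIC_MZ
    )
    return hits >= min_hits

def select_candidates(features, mz_range=(450.0, 600.0)):
    """Keep features whose precursor lies in `mz_range` and shows the diagnostic ions."""
    low, high = mz_range
    return [
        f for f in features
        if low <= f["precursor_mz"] <= high and has_diagnostic_ions(f["fragments"])
    ]
```

Given a list of hypothetical feature dictionaries with `precursor_mz` and `fragments` keys, `select_candidates` returns only the dimer-sized features that carry the diarylheptanoid fragment signature.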
Affiliation(s)
- Yong Beom Cho
- College of Pharmacy, Chungbuk National University, Cheongju 28160, Republic of Korea
| | - Jun Gu Kim
- College of Pharmacy, Chungbuk National University, Cheongju 28160, Republic of Korea
| | - Jae Sang Han
- College of Pharmacy, Chungbuk National University, Cheongju 28160, Republic of Korea
| | - Beom Kyun An
- College of Pharmacy, Chungbuk National University, Cheongju 28160, Republic of Korea
| | - Dongho Lee
- Department of Plant Biotechnology, College of Life Sciences and Biotechnology, Korea University, Seoul 02841, Republic of Korea
| | - Mi Kyeong Lee
- College of Pharmacy, Chungbuk National University, Cheongju 28160, Republic of Korea
| | - Bang Yeon Hwang
- College of Pharmacy, Chungbuk National University, Cheongju 28160, Republic of Korea
|
14
|
Meunier M, Schinkovitz A, Derbré S. Current and emerging tools and strategies for the identification of bioactive natural products in complex mixtures. Nat Prod Rep 2024; 41:1766-1786. [PMID: 39291767 DOI: 10.1039/d4np00006d] [Indexed: 09/19/2024]
Abstract
Covering: up to 2024. The prompt identification of (bio)active natural products (NPs) from complex mixtures poses a significant challenge due to the presence of numerous compounds with diverse structures and (bio)activities. Thus, this review provides an overview of current and emerging tools and strategies for the identification of (bio)active NPs in complex mixtures. Traditional approaches of bioassay-guided fractionation (BGF), followed by nuclear magnetic resonance (NMR) and mass spectrometry (MS) analysis for compound structure elucidation, continue to play an important role in the identification of active NPs. However, recent advances (2018-2024) have led to the development of novel techniques such as (bio)chemometric analysis, dereplication and combined approaches, which allow efficient prioritization for the elucidation of (bio)active compounds. For researchers involved in the search for bioactive NPs and who want to speed up their discoveries while maintaining accurate identifications, this review highlights the strengths and limitations of each technique and provides up-to-date insights into their combined use to achieve the highest level of confidence in the identification of (bio)active natural products from complex matrices.
Affiliation(s)
- Manon Meunier
- Univ. Angers, SONAS, SFR QUASAV, F-49000 Angers, France.
|
15
|
Shahneh MRZ, Strobel M, Vitale GA, Geibel C, Abiead YE, Garg N, Wagner B, Forchhammer K, Aron A, Phelan VV, Petras D, Wang M. ModiFinder: Tandem Mass Spectral Alignment Enables Structural Modification Site Localization. J Am Soc Mass Spectrom 2024; 35:2564-2578. [PMID: 38830143 PMCID: PMC11540723 DOI: 10.1021/jasms.4c00061] [Indexed: 06/05/2024]
Abstract
Untargeted tandem mass spectrometry (MS/MS) has become a high-throughput method to measure small molecules in complex samples. One key goal is the transformation of these MS/MS spectra into chemical structures. Computational techniques such as MS/MS library search have enabled the reidentification of known compounds. Analog library search and molecular networking extend this identification to unknown compounds. While there have been advancements in metrics for the similarity of MS/MS spectra of structurally similar compounds, there is still a lack of automated methods to provide site specific information about structural modifications. Here we introduce ModiFinder which leverages the alignment of peaks in MS/MS spectra between structurally related known and unknown small molecules. Specifically, ModiFinder focuses on shifted MS/MS fragment peaks in the MS/MS alignment. These shifted peaks putatively represent substructures of the known molecule that contain the site of the modification. ModiFinder synthesizes this information together and scores the likelihood for each atom in the known molecule to be the modification site. We demonstrate in this manuscript how ModiFinder can effectively localize modifications which extends the capabilities of MS/MS analog searching and molecular networking to accelerate the discovery of novel compounds.
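The idea of a shifted fragment peak can be made concrete with a small sketch: a fragment of the unknown counts as shifted when its offset from a fragment of the known compound equals the precursor mass difference. This is not ModiFinder's implementation; the function name `find_shifted_peaks`, its arguments, and the 0.01 Da tolerance are hypothetical, and the per-atom likelihood scoring step is not shown.

```python
def find_shifted_peaks(known_mzs, unknown_mzs, precursor_known,
                       precursor_unknown, tolerance=0.01):
    """Pair fragments whose m/z difference matches the precursor mass difference.

    Illustrative sketch only: all names and the tolerance are assumptions.
    Returns the precursor delta and a list of (known_mz, unknown_mz) pairs.
    """
    # Mass of the putative modification, inferred from the two precursors.
    delta = precursor_unknown - precursor_known
    shifted = []
    for mz_k in known_mzs:
        for mz_u in unknown_mzs:
            # A shifted pair differs by the modification mass, within tolerance.
            if abs((mz_u - mz_k) - delta) <= tolerance:
                shifted.append((mz_k, mz_u))
    return delta, shifted
```

Fragments that appear at the same m/z in both spectra are unshifted and, by the same reasoning, putatively exclude the modification site.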
Affiliation(s)
- Mohammad Reza Zare Shahneh
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
| | - Michael Strobel
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
| | - Giovanni Andrea Vitale
- Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Auf der Morgenstelle 24, Tuebingen 72076, Germany
| | - Christian Geibel
- Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Auf der Morgenstelle 24, Tuebingen 72076, Germany
| | - Yasin El Abiead
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Dr., San Diego, California 92093, United States
| | - Neha Garg
- School of Chemistry and Biochemistry, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, Georgia 30332, United States
| | - Berenike Wagner
- Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Auf der Morgenstelle 28, Tuebingen 72076, Germany
| | - Karl Forchhammer
- Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, Auf der Morgenstelle 28, Tuebingen 72076, Germany
| | - Allegra Aron
- Department of Chemistry and Biochemistry, University of Denver, 2101 East Wesley Ave, Denver, Colorado 80210, United States
| | - Vanessa V Phelan
- Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Anschutz Medical Campus, 12850 E Montview Blvd, Aurora, Colorado 80045, United States
| | - Daniel Petras
- Department of Biochemistry, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
| | - Mingxun Wang
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
|
16
|
Zhang H, Yang Q, Xie T, Wang Y, Zhang Z, Lu H. MSBERT: Embedding Tandem Mass Spectra into Chemically Rational Space by Mask Learning and Contrastive Learning. Anal Chem 2024; 96:16599-16608. [PMID: 39397717 DOI: 10.1021/acs.analchem.4c02426] [Indexed: 10/15/2024]
Abstract
Tandem mass spectrometry (MS/MS) is a powerful technique for chemical analysis in many areas of science. The vast MS/MS spectral data generated in liquid chromatography-mass spectrometry (LC-MS) experiments require efficient analysis and interpretation methods for subsequent compound identification. In this study, we propose MSBERT, which uses self-supervised learning strategies to embed MS/MS spectra into a chemically rational embedding space for efficient compound identification. It adopts the transformer encoder as the backbone for mask learning and uses the same spectra with different masks for contrastive learning. MSBERT is trained on the GNPS data set and tested on the GNPS data set, the MoNA data set, and the MTBLS1572 data set. It exhibits enhanced library matching and analogous compound searching capabilities compared to existing methods. The recalls at 1, 5, and 10 on a GNPS test subset with structures not in the training set are 0.7871, 0.8950, and 0.9080, respectively. The results are better than those of Spec2Vec with 0.6898, 0.8276, and 0.8620, and DreaMS with 0.7158, 0.8327, and 0.8635. The rationality of embeddings is demonstrated by t-SNE visualization, structural similarity, spectra clustering, compound identification, and analogous compound searching. A user-friendly web server is provided for efficient spectral analysis, and the source code for MSBERT is available at https://github.com/zhanghailiangcsu/MSBERT.
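The recall-at-k figures quoted above follow the usual retrieval definition: the fraction of query spectra whose correct compound appears among the top k ranked library candidates. A minimal sketch of that metric is given below; the function name and the list-based inputs are assumptions, and the authors' exact evaluation protocol may differ.

```python
def recall_at_k(ranked_candidates, true_labels, k):
    """Fraction of queries whose true label is among the top-k ranked candidates.

    ranked_candidates: one list per query, best match first (hypothetical input shape).
    true_labels: the correct label for each query, in the same order.
    """
    if len(ranked_candidates) != len(true_labels):
        raise ValueError("ranked_candidates and true_labels must have equal length")
    if not true_labels:
        raise ValueError("at least one query is required")
    hits = sum(
        1 for ranked, truth in zip(ranked_candidates, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)
```

Because the top-k list only grows with k, recall at 10 can never be lower than recall at 5 or recall at 1, which matches the ordering of the reported values.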
Affiliation(s)
- Hailiang Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| | - Qiong Yang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| | - Ting Xie
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| | - Yue Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| | - Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
| | - Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
|
17
|
Elhamraoui Z, Borràs E, Wilhelm M, Sabidó E. Theoretical Assessment of Indistinguishable Peptides in Mass Spectrometry-Based Proteomics. Anal Chem 2024; 96:15829-15833. [PMID: 39322219 PMCID: PMC11465223 DOI: 10.1021/acs.analchem.4c02803] [Received: 05/30/2024] [Revised: 08/28/2024] [Accepted: 08/28/2024] [Indexed: 09/27/2024]
Abstract
Mass-spectrometry-based proteomics has advanced with the integration of experimental and predicted spectral libraries, which have significantly improved peptide identification in complex search spaces. However, challenges persist in distinguishing some peptides with close retention times and nearly identical fragmentation patterns. In this study, we conducted a theoretical assessment to quantify the prevalence of indistinguishable peptides within the human canonical proteome and immunopeptidome using state-of-the-art retention time and spectrum prediction models. By quantifying the proportion of peptides posing challenges to unequivocal identification, we set the theoretical nonaccessible portion within a given proteome, and underscore the effectiveness of contemporary analytical methodologies in resolving the complexity of the human proteome and immunopeptidome via mass spectrometry.
Affiliation(s)
- Zahra Elhamraoui
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, Barcelona 08003, Spain
| | - Eva Borràs
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, Barcelona 08003, Spain
| | - Mathias Wilhelm
- Computational Mass Spectrometry, Technical University of Munich, D-85354 Freising, Germany
- Munich Data Science Institute (MDSI), Technical University of Munich, D-85748 Garching, Germany
| | - Eduard Sabidó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Dr. Aiguader 88, Barcelona 08003, Spain
|
18
|
Wang X, Strobel M, Aron AT, Phelan VV, Acharya DD, Brown CJ, Clevenger K, Hu J, Kretsch A, Mahood EH, Menegatti C, Xiong Q, Wang M. Network Topology Evaluation and Transitive Alignments for Molecular Networking. J Am Soc Mass Spectrom 2024; 35:2165-2175. [PMID: 39133821 PMCID: PMC11516331 DOI: 10.1021/jasms.4c00208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Untargeted tandem mass spectrometry (MS/MS) is an essential technique in modern analytical chemistry, providing a comprehensive snapshot of chemical entities in complex samples and identifying unknowns through their fragmentation patterns. This high-throughput approach generates large data sets that can be challenging to interpret. Molecular Networks (MNs) have been developed as a computational tool to aid in the organization and visualization of complex chemical space in untargeted mass spectrometry data, thereby supporting comprehensive data analysis and interpretation. MNs group related compounds with potentially similar structures from MS/MS data by calculating all pairwise MS/MS similarities and filtering these connections to produce a MN. Such networks are instrumental in metabolomics for identifying novel metabolites, elucidating metabolic pathways, and even discovering biomarkers for disease. While MS/MS similarity metrics have been explored in the literature, the influence of network topology approaches on MN construction remains unexplored. This manuscript introduces metrics for evaluating MN construction, benchmarks state-of-the-art approaches, and proposes the Transitive Alignments approach to improve MN construction. The Transitive Alignments technique leverages the MN topology to realign MS/MS spectra of related compounds that differ by multiple structural modifications. Combining this Transitive Alignments approach with pseudoclique finding, a method for identifying highly connected groups of nodes in a network, resulted in more complete and higher-quality molecular families. Finally, we also introduce a targeted network construction technique called induced transitive alignments, whose effectiveness we demonstrate in a real-world natural product discovery application. We release this transitive alignment technique as a high-throughput workflow that can be used by the wider research community.
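The baseline construction the abstract describes (score all pairs of MS/MS spectra, then filter the connections) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: plain binned cosine similarity stands in for whatever similarity metric a real pipeline would use, and the function names, the `min_score` threshold, the `top_k` mutual-neighbor filter, and the bin width are all hypothetical choices made for the example. The Transitive Alignments method itself is not implemented here.

```python
from itertools import combinations
from math import sqrt


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a, b):
    """Cosine similarity between two binned spectra stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_network(spectra, min_score=0.7, top_k=10):
    """Return {(u, v): score} for edges that survive both filters.

    spectra: dict mapping a spectrum name to a list of (m/z, intensity) peaks.
    min_score and top_k are illustrative defaults, not values from the paper.
    """
    binned = {name: bin_spectrum(peaks) for name, peaks in spectra.items()}

    # Step 1: all pairwise similarities, keeping only pairs above the threshold.
    scores = {}
    for u, v in combinations(sorted(binned), 2):
        score = cosine(binned[u], binned[v])
        if score >= min_score:
            scores[(u, v)] = score

    # Step 2: keep an edge only if each endpoint ranks the other in its top k.
    neighbors = {name: [] for name in binned}
    for (u, v), score in scores.items():
        neighbors[u].append((score, v))
        neighbors[v].append((score, u))
    top = {
        name: {other for _, other in sorted(found, reverse=True)[:top_k]}
        for name, found in neighbors.items()
    }
    return {(u, v): s for (u, v), s in scores.items() if v in top[u] and u in top[v]}
```

Calling `build_network` on a dict of named peak lists returns the surviving edges with their scores. Connected components of that edge set would then correspond to the molecular families mentioned in the abstract.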
Affiliation(s)
- Xianghu Wang
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
| | - Michael Strobel
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
| | - Allegra T Aron
- Department of Chemistry and Biochemistry, University of Denver, 2101 East Wesley Ave, Denver, Colorado 80210, United States
| | - Vanessa V Phelan
- Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Anschutz Medical Campus, 12850 E Montview Blvd, Aurora, Colorado 80045, United States
| | - Deepa D Acharya
- Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Christopher J Brown
- Regulatory Science, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Ken Clevenger
- Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Jie Hu
- Data Science, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Ashley Kretsch
- Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Elizabeth H Mahood
- Data Science, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Carla Menegatti
- Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Quanbo Xiong
- Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
| | - Mingxun Wang
- Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
|
19
|
Coler EA, Melnik A, Lotfi A, Moradi D, Ahiadu B, Gomes PWP, Patan A, Dorrestein PC, Barnes S, Boginski V, Semenov A, Aksenov AA. Ordering molecular diversity in untargeted metabolomics via molecular community networking. bioRxiv 2024:2024.08.02.606356. [PMID: 39131284 PMCID: PMC11312580 DOI: 10.1101/2024.08.02.606356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Nature's molecular diversity is not random but displays intricate organization stemming from biological necessity. Molecular networking connects metabolites with structural similarity, enabling molecular discoveries from mass spectrometry data using arbitrary similarity thresholds that can fracture natural metabolite families. We present molecular community networking (MCN), which optimizes connectivity for each metabolite, rescuing lost relationships and capturing otherwise "hidden" metabolite connections. Using MCN, we demonstrate the discovery of novel dipeptide-conjugated bile acids.
Affiliation(s)
| | - Alexey Melnik
- Department of Chemistry, University of Connecticut, Storrs, CT, USA
- Arome Science Inc., Farmington, CT, USA
| | - Ali Lotfi
- Department of Chemistry, University of Connecticut, Storrs, CT, USA
| | - Dana Moradi
- Department of Chemistry, University of Connecticut, Storrs, CT, USA
| | | | - Paulo Wender Portal Gomes
- Collaborative Mass Spectrometry innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, San Diego, CA, USA
| | - Abubaker Patan
- Collaborative Mass Spectrometry innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, San Diego, CA, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, San Diego, CA, USA
| | - Stephen Barnes
- Department of Pharmacology and Toxicology, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
| | - Vladimir Boginski
- Department of Industrial Engineering & Management Systems, University of Central Florida, Orlando, FL, USA
| | - Alexander Semenov
- Department of Industrial & Systems Engineering, University of Florida, Gainesville, FL, USA
| | - Alexander A Aksenov
- Department of Chemistry, University of Connecticut, Storrs, CT, USA
- Arome Science Inc., Farmington, CT, USA
|
20
|
Bui-Thi D, Liu Y, Lippens JL, Laukens K, De Vijlder T. TransExION: a transformer based explainable similarity metric for comparing IONS in tandem mass spectrometry. J Cheminform 2024; 16:61. [PMID: 38807166 PMCID: PMC11134763 DOI: 10.1186/s13321-024-00858-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 05/12/2024] [Indexed: 05/30/2024]
Abstract
Small molecule identification is a crucial task in analytical chemistry and life sciences. One of the most commonly used technologies to elucidate small molecule structures is mass spectrometry. Spectral library search of product ion spectra (MS/MS) is a popular strategy to identify or find structural analogues. This approach relies on the assumption that spectral similarity and structural similarity are correlated. However, popular spectral similarity measures, usually calculated based on identical fragment matches between the MS/MS spectra, do not always accurately reflect the structural similarity. In this study, we propose TransExION, a Transformer-based Explainable similarity metric for IONS. TransExION detects related fragments between MS/MS spectra through their mass difference and uses these to estimate spectral similarity. These related fragments can be nearly identical, but can also share a substructure. TransExION also provides a post-hoc explanation of its estimation, which can be used to support scientists in evaluating the spectral library search results and thus in structure elucidation of unknown molecules. Our model has a Transformer-based architecture and is trained on data derived from GNPS MS/MS libraries. The experimental results show that it improves on existing spectral similarity measures in searching and interpreting structural analogues as well as in molecular networking. SCIENTIFIC CONTRIBUTION: We propose a transformer-based spectral similarity metric that improves the comparison of small molecule tandem mass spectra. We provide a post hoc explanation that can serve as a good starting point for unknown spectra annotation based on database spectra.
Affiliation(s)
- Danh Bui-Thi
- Computer Science Department, University of Antwerp, Middelheimlaan 1, 2020, Antwerp, Belgium
| | - Youzhong Liu
- Therapeutic Development and Supply, Janssen Pharmaceutica N.V., Turnhoutseweg 30, 2340, Beerse, Belgium
| | - Jennifer L Lippens
- Therapeutic Development and Supply, Janssen Pharmaceutica N.V., Turnhoutseweg 30, 2340, Beerse, Belgium
| | - Kris Laukens
- Computer Science Department, University of Antwerp, Middelheimlaan 1, 2020, Antwerp, Belgium
| | - Thomas De Vijlder
- Therapeutic Development and Supply, Janssen Pharmaceutica N.V., Turnhoutseweg 30, 2340, Beerse, Belgium
|
21
|
Perez de Souza L, Fernie AR. Computational methods for processing and interpreting mass spectrometry-based metabolomics. Essays Biochem 2024; 68:5-13. [PMID: 37999335 PMCID: PMC11065554 DOI: 10.1042/ebc20230019] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/15/2023] [Revised: 11/10/2023] [Accepted: 11/15/2023] [Indexed: 11/25/2023]
Abstract
Metabolomics has emerged as an indispensable tool for exploring complex biological questions, providing the ability to investigate a substantial portion of the metabolome. However, the vast complexity and structural diversity intrinsic to metabolites impose a great challenge for data analysis and interpretation. Liquid chromatography mass spectrometry (LC-MS) stands out as a versatile technique offering extensive metabolite coverage. In this mini-review, we address some of the hurdles posed by the complex nature of LC-MS data, providing a brief overview of computational tools designed to help tackle these challenges. Our focus centers on two major steps that are essential to most metabolomics investigations: the translation of raw data into quantifiable features, and the extraction of structural insights from mass spectra to facilitate metabolite identification. By exploring current computational solutions, we aim to provide a critical overview of the capabilities and constraints of mass spectrometry-based metabolomics, while introducing some of the most recent trends in data processing and analysis within the field.
Affiliation(s)
- Leonardo Perez de Souza
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Alisdair R Fernie
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
- Center for Plant Systems Biology and Biotechnology, 4000 Plovdiv, Bulgaria
|
22
|
Ellin NR, Guo Y, Miranda-Quintana RA, Prentice BM. Extended similarity methods for efficient data mining in imaging mass spectrometry. Digital Discovery 2024; 3:805-817. [PMID: 38638647 PMCID: PMC11022984 DOI: 10.1039/d3dd00165b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 03/19/2024] [Indexed: 04/20/2024]
Abstract
Imaging mass spectrometry is a label-free imaging modality that allows for the spatial mapping of many compounds directly in tissues. In an imaging mass spectrometry experiment, a raster of the tissue surface produces a mass spectrum at each sampled x, y position, resulting in thousands of individual mass spectra, each comprising a pixel in the resulting ion images. However, efficient analysis of imaging mass spectrometry datasets can be challenging due to the hyperspectral characteristics of the data. Each spectrum contains several thousand unique compounds at discrete m/z values that result in unique ion images, which demands robust and efficient algorithms for searching, statistical analysis, and visualization. Some traditional post-processing techniques are fundamentally ill-equipped to dissect these types of data. For example, while principal component analysis (PCA) has long served as a useful tool for mining imaging mass spectrometry datasets to identify correlated analytes and biological regions of interest, the interpretation of the PCA scores and loadings can be non-trivial. The loadings often contain negative peaks in the PCA-derived pseudo-spectra, which are difficult to ascribe to underlying tissue biology. Herein, we have utilized extended similarity indices to streamline the interpretation of imaging mass spectrometry data. This novel workflow uses PCA as a pixel-selection method to parse out the most and least correlated pixels, which are then compared using the extended similarity indices. The extended similarity indices complement PCA by removing all non-physical artifacts and streamlining the interpretation of large volumes of imaging mass spectrometry spectra simultaneously. The linear complexity, O(N), of these indices suggests that large imaging mass spectrometry datasets can be analyzed in a 1:1 scale of time and space with respect to the size of the input data. The extended similarity indices algorithmic workflow is exemplified here by identifying discrete biological regions of mouse brain tissue.
Affiliation(s)
- Nicholas R Ellin
- Department of Chemistry, University of Florida Gainesville FL 32611-7200 USA
| | - Yingchan Guo
- Department of Chemistry, University of Florida Gainesville FL 32611-7200 USA
| | - Ramón Alain Miranda-Quintana
- Department of Chemistry, University of Florida Gainesville FL 32611-7200 USA
- Quantum Theory Project, University of Florida Gainesville FL 32611-7200 USA
| | - Boone M Prentice
- Department of Chemistry, University of Florida Gainesville FL 32611-7200 USA
|
23
|
Mohanty I, Mannochio-Russo H, Schweer JV, El Abiead Y, Bittremieux W, Xing S, Schmid R, Zuffa S, Vasquez F, Muti VB, Zemlin J, Tovar-Herrera OE, Moraïs S, Desai D, Amin S, Koo I, Turck CW, Mizrahi I, Kris-Etherton PM, Petersen KS, Fleming JA, Huan T, Patterson AD, Siegel D, Hagey LR, Wang M, Aron AT, Dorrestein PC. The underappreciated diversity of bile acid modifications. Cell 2024; 187:1801-1818.e20. [PMID: 38471500 DOI: 10.1016/j.cell.2024.02.019] [Citation(s) in RCA: 58] [Impact Index Per Article: 58.0] [Received: 04/26/2023] [Revised: 11/30/2023] [Accepted: 02/15/2024] [Indexed: 03/14/2024]
Abstract
The repertoire of modifications to bile acids and related steroidal lipids by host and microbial metabolism remains incompletely characterized. To address this knowledge gap, we created a reusable resource of tandem mass spectrometry (MS/MS) spectra by filtering 1.2 billion publicly available MS/MS spectra for bile-acid-selective ion patterns. Thousands of modifications are distributed throughout animal and human bodies as well as microbial cultures. We employed this MS/MS library to identify polyamine bile amidates, prevalent in carnivores. They are present in humans, and their levels alter with a diet change from a Mediterranean to a typical American diet. This work highlights the existence of many more bile acid modifications than previously recognized and the value of leveraging public large-scale untargeted metabolomics data to discover metabolites. The availability of a modification-centric bile acid MS/MS library will inform future studies investigating bile acid roles in health and disease.
Affiliation(s)
- Ipsita Mohanty
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Helena Mannochio-Russo
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Joshua V Schweer
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Department of Chemistry and Biochemistry, University of California, San Diego, San Diego, CA, USA
| | - Yasin El Abiead
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerpen, Belgium
| | - Shipei Xing
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver Campus, Vancouver, BC, Canada
| | - Robin Schmid
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Simone Zuffa
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Felipe Vasquez
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Valentina B Muti
- Department of Computer Science and Engineering, University of California, Riverside, Riverside, CA, USA; Department of Chemistry and Biochemistry, University of Denver, Denver, CO 80210, USA
| | - Jasmine Zemlin
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA 92093, USA
| | - Omar E Tovar-Herrera
- Department of Life Sciences, Ben-Gurion University of the Negev, Be'er Sheva, Israel; Goldman Sonnenfeldt School of Sustainability and Climate Change, Ben-Gurion University of the Negev, Be'er Sheva 84105, Israel
| | - Sarah Moraïs
- Department of Life Sciences, Ben-Gurion University of the Negev, Be'er Sheva, Israel; Goldman Sonnenfeldt School of Sustainability and Climate Change, Ben-Gurion University of the Negev, Be'er Sheva 84105, Israel
| | - Dhimant Desai
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA, USA
| | - Shantu Amin
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA, USA
| | - Imhoi Koo
- Center for Molecular Toxicology and Carcinogenesis, Department of Veterinary and Biomedical Sciences, Pennsylvania State University, University Park, PA, USA
| | - Christoph W Turck
- Max Planck Institute of Psychiatry, Proteomics and Biomarkers, Kraepelinstrasse 2-10, Munich 80804, Germany; Key Laboratory of Animal Models and Human Disease Mechanisms of Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
| | - Itzhak Mizrahi
- Department of Life Sciences, Ben-Gurion University of the Negev, Be'er Sheva, Israel; Goldman Sonnenfeldt School of Sustainability and Climate Change, Ben-Gurion University of the Negev, Be'er Sheva 84105, Israel
| | - Penny M Kris-Etherton
- Department of Nutritional Sciences, The Pennsylvania State University, University Park, PA, USA
| | - Kristina S Petersen
- Department of Nutritional Sciences, The Pennsylvania State University, University Park, PA, USA
| | - Jennifer A Fleming
- Department of Nutritional Sciences, The Pennsylvania State University, University Park, PA, USA
| | - Tao Huan
- Department of Chemistry, Faculty of Science, University of British Columbia, Vancouver Campus, Vancouver, BC, Canada
| | - Andrew D Patterson
- Center for Molecular Toxicology and Carcinogenesis, Department of Veterinary and Biomedical Sciences, Pennsylvania State University, University Park, PA, USA
| | - Dionicio Siegel
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Lee R Hagey
- Department of Medicine, University of California, San Diego, San Diego, CA, USA
| | - Mingxun Wang
- Department of Computer Science and Engineering, University of California, Riverside, Riverside, CA, USA
| | - Allegra T Aron
- Department of Chemistry and Biochemistry, University of Denver, Denver, CO 80210, USA
| | - Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA; Department of Pharmacology, University of California, San Diego, La Jolla, CA 92093, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA 92093, USA.
|
24
|
Engler Hart C, Kind T, Dorrestein PC, Healey D, Domingo-Fernández D. Weighting Low-Intensity MS/MS Ions and m/z Frequency for Spectral Library Annotation. J Am Soc Mass Spectrom 2024; 35:266-274. [PMID: 38271611 PMCID: PMC10854760 DOI: 10.1021/jasms.3c00353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 12/29/2023] [Accepted: 01/05/2024] [Indexed: 01/27/2024]
Abstract
Calculating spectral similarity is a fundamental step in MS/MS data analysis in untargeted metabolomics experiments, as it facilitates the identification of related spectra and the annotation of compounds. To improve matching accuracy when querying an experimental mass spectrum against a spectral library, previous approaches have proposed increasing peak intensities for high m/z ranges. These high m/z values tend to be smaller in magnitude, yet they offer more crucial information for identifying the chemical structure. Here, we evaluate the impact of using these weights for identifying structurally related compounds and mass spectral library searches. Additionally, we propose a weighting approach that (i) takes into account the frequency of the m/z values within a spectral library in order to assign higher importance to the most common peaks and (ii) increases the intensity of lower peaks, similar to previous approaches. To demonstrate our approach, we applied weighting preprocessing to modified cosine, entropy, and fidelity distance metrics and benchmarked it against previously reported weights. Our results demonstrate how weighting-based preprocessing can assist in annotating the structure of unknown spectra as well as identifying structurally similar compounds. Finally, we examined scenarios in which the utilization of weights resulted in diminished performance, pinpointing spectral features where the application of weights might be detrimental.
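The two weighting ideas named in the abstract, boosting low-intensity peaks and scaling by how often an m/z value occurs in a spectral library, can be sketched as a preprocessing step applied before any similarity metric. The abstract does not give the functional form, so everything below is an assumption made for illustration: the power-law intensity transform, the frequency defined as the fraction of library spectra with a peak in the same m/z bin, the exponents, the bin width, and all function and parameter names.

```python
from collections import Counter


def mz_bin(mz, bin_width=1.0):
    """Map an m/z value to an integer bin index."""
    return int(mz // bin_width)


def library_frequency(library, bin_width=1.0):
    """Fraction of library spectra that contain at least one peak in each bin.

    library: list of spectra, each a list of (m/z, intensity) peaks.
    """
    counts = Counter()
    for peaks in library:
        counts.update({mz_bin(mz, bin_width) for mz, _ in peaks})
    total = len(library)
    return {bin_index: count / total for bin_index, count in counts.items()}


def weight_spectrum(peaks, freq, intensity_power=0.5, freq_power=1.0,
                    bin_width=1.0, unseen_freq=0.0):
    """Return (m/z, weighted intensity) pairs for one spectrum.

    intensity_power < 1 raises low relative intensities toward 1, so minor
    peaks count for more. freq_power > 0 gives more weight to m/z bins that
    are common in the library. Bins absent from the library fall back to
    unseen_freq (0.0 here, which silences them; a hypothetical default).
    """
    if not peaks:
        return []
    top = max(intensity for _, intensity in peaks)
    weighted = []
    for mz, intensity in peaks:
        relative = intensity / top
        frequency = freq.get(mz_bin(mz, bin_width), unseen_freq)
        weighted.append((mz, (relative ** intensity_power) * (frequency ** freq_power)))
    return weighted
```

The weighted peak lists can then be passed to a similarity function in place of the raw spectra. Setting `freq_power=0.0` switches off the frequency term, which makes it easy to compare the two components separately.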
Affiliation(s)
- Chloe Engler Hart
- Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
| | - Tobias Kind
- Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
| | - Pieter C. Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
| | - David Healey
- Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
|
25
|
Bittremieux W, Avalon NE, Thomas SP, Kakhkhorov SA, Aksenov AA, Gomes PWP, Aceves CM, Caraballo-Rodríguez AM, Gauglitz JM, Gerwick WH, Huan T, Jarmusch AK, Kaddurah-Daouk RF, Kang KB, Kim HW, Kondić T, Mannochio-Russo H, Meehan MJ, Melnik AV, Nothias LF, O'Donovan C, Panitchpakdi M, Petras D, Schmid R, Schymanski EL, van der Hooft JJJ, Weldon KC, Yang H, Xing S, Zemlin J, Wang M, Dorrestein PC. Open access repository-scale propagated nearest neighbor suspect spectral library for untargeted metabolomics. Nat Commun 2023; 14:8488. [PMID: 38123557 PMCID: PMC10733301 DOI: 10.1038/s41467-023-44035-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/05/2023] [Accepted: 11/28/2023] [Indexed: 12/23/2023]
Abstract
Despite the increasing availability of tandem mass spectrometry (MS/MS) community spectral libraries for untargeted metabolomics over the past decade, the majority of acquired MS/MS spectra remain uninterpreted. To further aid in interpreting unannotated spectra, we created a nearest neighbor suspect spectral library, consisting of 87,916 annotated MS/MS spectra derived from hundreds of millions of MS/MS spectra originating from published untargeted metabolomics experiments. Entries in this library, or "suspects," were derived from unannotated spectra that could be linked in a molecular network to an annotated spectrum. Annotations were propagated to unknowns based on structural relationships to reference molecules using MS/MS-based spectrum alignment. We demonstrate the broad relevance of the nearest neighbor suspect spectral library through representative examples of propagation-based annotation of acylcarnitines, bacterial and plant natural products, and drug metabolism. Our results also highlight how the library can help to better understand an Alzheimer's brain phenotype. The nearest neighbor suspect spectral library is openly available for download or for data analysis through the GNPS platform to help investigators hypothesize candidate structures for unknown MS/MS spectra in untargeted metabolomics data.
Affiliation(s)
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020, Antwerpen, Belgium
| | - Nicole E Avalon
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, 92093, USA
| | - Sydney P Thomas
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Sarvar A Kakhkhorov
- Laboratory of Physical and Chemical Methods of Research, Center for Advanced Technologies, Tashkent, 100174, Uzbekistan
- Department of Food Science, Faculty of Science, University of Copenhagen, Rolighedsvej 26, 1958, Frederiksberg C, Denmark
| | - Alexander A Aksenov
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Chemistry, University of Connecticut, Storrs, CT, 06269, USA
- Arome Science inc., Farmington, CT, 06032, USA
| | - Paulo Wender P Gomes
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Christine M Aceves
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, 92037, USA
| | - Andrés Mauricio Caraballo-Rodríguez
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Julia M Gauglitz
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - William H Gerwick
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
| | - Tao Huan
- Department of Chemistry, University of British Columbia, Vancouver, BC, V6T 1Z1, Canada
| | - Alan K Jarmusch
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Immunity, Inflammation, and Disease Laboratory, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, 27709, USA
| | - Rima F Kaddurah-Daouk
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, 27701, USA
- Department of Medicine, Duke University, Durham, NC, 27710, USA
- Duke Institute of Brain Sciences, Duke University, Durham, NC, 27710, USA
| | - Kyo Bin Kang
- College of Pharmacy and Research Institute of Pharmaceutical Sciences, Sookmyung Women's University, Seoul, 04310, Korea
| | - Hyun Woo Kim
- College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University, Goyang, 10326, Korea
| | - Todor Kondić
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4367, Belvaux, Luxembourg
| | - Helena Mannochio-Russo
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Biochemistry and Organic Chemistry, Institute of Chemistry, São Paulo State University, Araraquara, 14800-901, Brazil
| | - Michael J Meehan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Alexey V Melnik
- Department of Chemistry, University of Connecticut, Storrs, CT, 06269, USA
- Arome Science inc., Farmington, CT, 06032, USA
| | - Louis-Felix Nothias
- Université Côte d'Azur, CNRS, ICN, Nice, France
- Interdisciplinary Institute for Artificial Intelligence (3iA) Côte d'Azur, Nice, France
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Morgan Panitchpakdi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Daniel Petras
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Interfaculty Institute of Microbiology and Infection Medicine, University of Tuebingen, 72076, Tuebingen, Germany
- Department of Biochemistry, University of California Riverside, Riverside, CA, 92507, USA
| | - Robin Schmid
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4367, Belvaux, Luxembourg
| | - Justin J J van der Hooft
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Bioinformatics Group, Wageningen University & Research, 6708 PB, Wageningen, The Netherlands
| | - Kelly C Weldon
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Heejung Yang
- Laboratory of Natural Products Chemistry, College of Pharmacy, Kangwon National University, Chuncheon, 24341, Korea
| | - Shipei Xing
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Chemistry, University of British Columbia, Vancouver, BC, V6T 1Z1, Canada
| | - Jasmine Zemlin
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
| | - Mingxun Wang
- Department of Computer Science and Engineering, University of California Riverside, Riverside, CA, 92507, USA
| | - Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA.
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA.
|
26
|
Jung YH, Kim JH. Feature-Based Molecular Networking Combined with Multivariate Analysis for the Characterization of Glutathione Adducts as a Smoking Gun of Bioactivation. Anal Chem 2023; 95:17450-17457. [PMID: 37976220 DOI: 10.1021/acs.analchem.3c01094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2023]
Abstract
Feature-based molecular networking (FBMN) is a powerful analytical tool for mass spectrometry (MS)-based untargeted metabolomics data analysis. FBMN plays an important role in drug metabolism studies, enabling the visualization of complex metabolomics data to achieve metabolite characterization. In this study, we propose a strategy for the characterization of glutathione (GSH) adducts formed via in vitro metabolic activation using FBMN assisted by multivariate analysis (MVA). Acetaminophen was used as a model substrate for method development, and the practical potential of the method was investigated by its application to 2-aminophenol (2-AP) and 2,4-dinitrochlorobenzene (DNCB). Two 2-AP GSH adducts and one DNCB GSH adduct were successfully characterized by forming networks with GSH even though the mass spectral information obtained for the parent compound was deficient. False positives were effectively filtered out by the variable influence on projection cutoff criteria obtained from orthogonal partial least-squares-discriminant analysis. The GSH adducts formed by enzymatic or nonenzymatic reactions were intuitively distinguished by the pie chart of FBMN results. In summary, our approach effectively characterizes GSH adducts, which serve as compelling evidence of bioactivation. It can be widely utilized to enhance risk assessment in the context of drug metabolism.
Affiliation(s)
- Young-Heun Jung
- College of Pharmacy, Yeungnam University, Gyeongsan 38541, Republic of Korea
| | - Ju-Hyun Kim
- College of Pharmacy, Yeungnam University, Gyeongsan 38541, Republic of Korea
|
27
|
Ni X, Murray NB, Archer-Hartmann S, Pepi LE, Helm RF, Azadi P, Hong P. Toward Automatic Inference of Glycan Linkages Using MSn and Machine Learning: Proof of Concept Using Sialic Acid Linkages. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2023; 34:2127-2135. [PMID: 37621000 PMCID: PMC10557947 DOI: 10.1021/jasms.3c00132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/07/2023] [Revised: 08/10/2023] [Accepted: 08/15/2023] [Indexed: 08/26/2023]
Abstract
Glycosidic linkages in oligosaccharides play essential roles in determining their chemical properties and biological activities. MSn has been widely used to infer glycosidic linkages but requires a substantial amount of starting material, which limits its application. In addition, there is a lack of rigorous research on what MSn protocols are proper for characterizing glycosidic linkages. In this work, to deliver high-quality experimental data and analysis results, we propose a machine learning-based framework to establish appropriate MSn protocols and build effective data analysis methods. We demonstrate the proof-of-principle by applying our approach to elucidate sialic acid linkages (α2'-3' and α2'-6') in a set of sialyllactose standards and NIST sialic acid-containing N-glycans as well as identify several protocol configurations for producing high-quality experimental data. Our companion data analysis method achieves nearly 100% accuracy in classifying α2'-3' vs α2'-6' using MS5, MS4, MS3, or even MS2 spectra alone. The ability to determine glycosidic linkages using MS2 or MS3 is significant as it requires substantially less sample, enabling linkage analysis for quantity-limited natural glycans and synthesized materials, as well as shortens the overall experimental time. MS2 is also more amenable than MS3/4/5 to automation when coupled to direct infusion or LC-MS. Additionally, our method can predict the ratio of α2'-3' and α2'-6' in a mixture with 8.6% RMSE (root-mean-square error) across data sets using MS5 spectra. We anticipate that our framework will be generally applicable to analysis of other glycosidic linkages.
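As a small worked illustration of the error metric quoted above, the following Python sketch computes a root-mean-square error between predicted and true mixture fractions. The function name `rmse` and the fraction values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical fractions of one linkage isomer in five mixtures (not study data).
true_fraction = [0.00, 0.25, 0.50, 0.75, 1.00]
predicted_fraction = [0.05, 0.20, 0.55, 0.70, 1.00]

error = rmse(predicted_fraction, true_fraction)
print(f"RMSE = {error:.1%}")
```

Expressed as a percentage of the 0 to 1 fraction scale, an RMSE of this kind is directly comparable to a figure such as the 8.6% reported in the abstract.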
Affiliation(s)
- Xinyi Ni
- Computer Science, Brandeis University, Waltham, Massachusetts 02453, United States
| | - Nathan B. Murray
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, United States
| | | | - Lauren E. Pepi
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, United States
| | - Richard F. Helm
- Department of Biochemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Parastoo Azadi
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, United States
| | - Pengyu Hong
- Computer Science, Brandeis University, Waltham, Massachusetts 02453, United States
|
28
|
Miller A, York EM, Stopka SA, Martínez-François JR, Hossain MA, Baquer G, Regan MS, Agar NYR, Yellen G. Spatially resolved metabolomics and isotope tracing reveal dynamic metabolic responses of dentate granule neurons with acute stimulation. Nat Metab 2023; 5:1820-1835. [PMID: 37798473 PMCID: PMC10626993 DOI: 10.1038/s42255-023-00890-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/15/2022] [Accepted: 08/09/2023] [Indexed: 10/07/2023]
Abstract
Neuronal activity creates an intense energy demand that must be met by rapid metabolic responses. To investigate metabolic adaptations in the neuron-enriched dentate granule cell (DGC) layer within its native tissue environment, we employed murine acute hippocampal brain slices, coupled with fast metabolite preservation and followed by mass spectrometry (MS) imaging, to generate spatially resolved metabolomics and isotope-tracing data. Here we show that membrane depolarization induces broad metabolic changes, including increased glycolytic activity in DGCs. Increased glucose metabolism in response to stimulation is accompanied by mobilization of endogenous inosine into pentose phosphates via the action of purine nucleotide phosphorylase (PNP). The PNP reaction is an integral part of the neuronal response to stimulation, because inhibition of PNP leaves DGCs energetically impaired during recovery from strong activation. Performing MS imaging on brain slices bridges the gap between live-cell physiology and the deep chemical analysis enabled by MS.
Affiliation(s)
- Anne Miller
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Center for Pathobiochemistry and Genetics, Medical University of Vienna, Vienna, Austria
| | - Elisa M York
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
| | - Sylwia A Stopka
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | | | - Md Amin Hossain
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Gerard Baquer
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Michael S Regan
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Nathalie Y R Agar
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.
| | - Gary Yellen
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
|
29
|
Li Y, Fiehn O. Flash entropy search to query all mass spectral libraries in real time. Nat Methods 2023; 20:1475-1478. [PMID: 37735567 PMCID: PMC11511675 DOI: 10.1038/s41592-023-02012-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2023] [Accepted: 08/15/2023] [Indexed: 09/23/2023]
Abstract
Public repositories of metabolomics mass spectra encompass more than 1 billion entries. With open search, dot product or entropy similarity, comparisons of a single tandem mass spectrometry spectrum take more than 8 h. Flash entropy search speeds up calculations more than 10,000 times to query 1 billion spectra in less than 2 s, without loss in accuracy. It benefits from using multiple threads and GPU calculations. This algorithm can fully exploit large spectral libraries with little memory overhead for any mass spectrometry laboratory.
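To make the entropy similarity mentioned above concrete, here is a minimal Python sketch of one commonly used unweighted formulation, 1 − (2·S_AB − S_A − S_B) / ln 4, where S is the Shannon entropy of the normalized intensities and AB is the 1:1 merged spectrum. This is not the Flash entropy search algorithm, whose indexing scheme the abstract does not describe. The function names, the representation of a spectrum as a dict from pre-binned m/z to intensity, and the toy spectra are all assumptions made for illustration.

```python
import math


def normalize(spectrum):
    """Scale intensities so they sum to 1 (spectrum: binned m/z -> intensity)."""
    total = sum(spectrum.values())
    if total <= 0:
        raise ValueError("spectrum must contain positive intensity")
    return {mz: i / total for mz, i in spectrum.items() if i > 0}


def shannon_entropy(probabilities):
    """Shannon entropy (natural log) of a collection of probabilities."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def entropy_similarity(spec_a, spec_b):
    """Unweighted entropy similarity; assumes peaks are already binned to shared m/z keys."""
    a, b = normalize(spec_a), normalize(spec_b)
    # 1:1 mixture of the two normalized spectra.
    merged = {mz: 0.5 * (a.get(mz, 0.0) + b.get(mz, 0.0)) for mz in set(a) | set(b)}
    s_a = shannon_entropy(a.values())
    s_b = shannon_entropy(b.values())
    s_ab = shannon_entropy(merged.values())
    return 1.0 - (2.0 * s_ab - s_a - s_b) / math.log(4)


# Hypothetical toy spectra (binned m/z -> intensity).
spectrum_a = {100.0: 50.0, 150.0: 30.0, 200.0: 20.0}
spectrum_b = {100.0: 40.0, 150.0: 40.0, 250.0: 20.0}
spectrum_c = {300.0: 10.0, 350.0: 90.0}

print(entropy_similarity(spectrum_a, spectrum_a))  # identical spectra
print(entropy_similarity(spectrum_a, spectrum_b))  # partially overlapping
print(entropy_similarity(spectrum_a, spectrum_c))  # no shared peaks
```

In this formulation, identical spectra score 1 and spectra with no shared peaks score 0, up to floating-point rounding. A real library search would additionally need m/z tolerance matching and, in the published method, intensity weighting for low-entropy spectra, both of which this sketch omits.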
Affiliation(s)
- Yuanyue Li
- West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA, USA
| | - Oliver Fiehn
- West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, CA, USA.
|
30
|
Carvalho ARV, Reis JDE, Gomes PWP, Ferraz AC, Mardegan HA, Menegatto MBDS, Souza Lima RL, de Sarges MRV, Pamplona SDGSR, Jeunon Gontijo KS, de Magalhães JC, da Silva MN, Magalhães CLDB, Silva CYYE. Untargeted-based metabolomics analysis and in vitro/in silico antiviral activity of extracts from Phyllanthus brasiliensis (Aubl.) Poir. PHYTOCHEMICAL ANALYSIS : PCA 2023; 34:869-883. [PMID: 37403427 DOI: 10.1002/pca.3259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/03/2023] [Revised: 06/05/2023] [Accepted: 06/14/2023] [Indexed: 07/06/2023]
Abstract
INTRODUCTION: This study describes the molecular profile and the potential antiviral activity of extracts from Phyllanthus brasiliensis, a plant widely found in the Brazilian Amazon. The research aims to shed light on the potential use of this species as a natural antiviral agent. METHODS: The extracts were analysed using a liquid chromatography-mass spectrometry (LC-MS) system, a potent analytical technique to discover drug candidates. In parallel, in vitro antiviral assays were performed against Mayaro, Oropouche, Chikungunya, and Zika viruses. In addition, the antiviral activity of annotated compounds was predicted by in silico methods. RESULTS: Overall, 44 compounds were annotated in this study. The results revealed that P. brasiliensis has a high content of fatty acids, flavones, flavan-3-ols, and lignans. Furthermore, in vitro assays revealed potent antiviral activity against different arboviruses, especially lignan-rich extracts against Zika virus (ZIKV), as follows: methanolic extract from bark (MEB) [effective concentration for 50% of the cells (EC50) = 0.80 μg/mL, selectivity index (SI) = 377.59], methanolic extract from the leaf (MEL) (EC50 = 0.84 μg/mL, SI = 297.62), and hydroalcoholic extract from the leaf (HEL) (EC50 = 1.36 μg/mL, SI = 735.29). These results were supported by in silico prediction, where tuberculatin (a lignan) showed a high antiviral activity score. CONCLUSIONS: Phyllanthus brasiliensis extracts contain metabolites that could be a new kick-off point for the discovery of candidates for antiviral drug development, with lignans becoming a promising trend for further virology research.
Affiliation(s)
- Alice Rhelly V Carvalho
- Laboratory of Liquid Chromatography, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
- Faculty of Pharmacy, Institute of Health Sciences, Federal University of Pará, Belém, Brazil
| | - José Diogo E Reis
- Laboratory of Liquid Chromatography, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
- Chemistry Post-Graduation Programme, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
| | - Paulo Wender P Gomes
- Collaborative Mass Spectrometry Innovation Centre, University of California San Diego, La Jolla, California, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, USA
| | - Ariane Coelho Ferraz
- Programa de Pós-Graduação em Ciências Biológicas, Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, Brazil
| | - Horrana A Mardegan
- Laboratory of Liquid Chromatography, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
- Pharmaceutical Sciences Post-Graduation Programme, Institute of Health Sciences, Federal University of Pará, Belém, Brazil
| | - Marília Bueno da Silva Menegatto
- Programa de Pós-Graduação em Ciências Biológicas, Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, Brazil
| | - Rafaela Lameira Souza Lima
- Programa de Pós-Graduação em Ciências Biológicas, Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, Brazil
| | - Maria Rosilda V de Sarges
- Laboratory of Liquid Chromatography, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
- Pharmaceutical Sciences Post-Graduation Programme, Institute of Health Sciences, Federal University of Pará, Belém, Brazil
| | - Sônia das G S R Pamplona
- Laboratory of Liquid Chromatography, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
- Chemistry Post-Graduation Programme, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
| | | | - José Carlos de Magalhães
- Programa de Pós-Graduação em Biotecnologia, Universidade Federal de São João del-Rei, São João del Rei, Brazil
| | - Milton N da Silva
- Laboratory of Liquid Chromatography, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
- Chemistry Post-Graduation Programme, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
| | - Cintia Lopes de Brito Magalhães
- Programa de Pós-Graduação em Ciências Biológicas, Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, Brazil
- Programa de Pós-Graduação em Biotecnologia, Universidade Federal de São João del-Rei, São João del Rei, Brazil
- Programa de Pós-Graduação em Biotecnologia, Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, Brazil
| | - Consuelo Yumiko Yoshioka E Silva
- Laboratory of Liquid Chromatography, Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Brazil
- Faculty of Pharmacy, Institute of Health Sciences, Federal University of Pará, Belém, Brazil
- Pharmaceutical Sciences Post-Graduation Programme, Institute of Health Sciences, Federal University of Pará, Belém, Brazil
|
31
|
Moorthy AS, Erisman EP, Kearsley AJ, Liang Y, Sisco E, Wallace WE. On the challenge of unambiguous identification of fentanyl analogs: Exploring measurement diversity using standard reference mass spectral libraries. J Forensic Sci 2023; 68:1494-1503. [PMID: 37431311 PMCID: PMC10517722 DOI: 10.1111/1556-4029.15322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2023] [Revised: 06/13/2023] [Accepted: 06/22/2023] [Indexed: 07/12/2023]
Abstract
Fentanyl analogs are a class of designer drugs that are particularly challenging to unambiguously identify due to the mass spectral and retention time similarities of unique compounds. In this paper, we use agglomerative hierarchical clustering to explore the measurement diversity of fentanyl analogs and better understand the challenge of unambiguous identifications using analytical techniques traditionally available to drug chemists. We consider four measurements in particular: gas chromatography retention indices, electron ionization mass spectra, electrospray ionization tandem mass spectra, and direct analysis in real time mass spectra. Our analysis demonstrates how simultaneously considering data from multiple measurement techniques increases the observable measurement diversity of fentanyl analogs, which can reduce identification ambiguity. This paper further supports the use of multiple analytical techniques to identify fentanyl analogs (among other substances), as is recommended by the Scientific Working Group for the Analysis of Seized Drugs (SWGDRUG).
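The clustering approach named above can be sketched in a few lines of Python. This is a minimal average-linkage agglomerative clustering over feature vectors, assuming Euclidean distance. The linkage rule, the distance metric, the function names, and the toy feature vectors are all assumptions for illustration, since the abstract does not specify them.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerative_clusters(points, n_clusters):
    """Merge the closest pair of clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    while len(clusters) > n_clusters:
        best = None
        for (i, ci), (j, cj) in combinations(enumerate(clusters), 2):
            # Average pairwise distance between members of the two clusters.
            total = sum(euclidean(points[p], points[q]) for p in ci for q in cj)
            distance = total / (len(ci) * len(cj))
            if best is None or distance < best[0]:
                best = (distance, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j keeps i valid
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical feature vectors, e.g. (scaled retention index, scaled spectral feature).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(agglomerative_clusters(points, 2))
```

Concatenating features from several measurement techniques into one vector per compound, as in the toy tuples here, is one simple way to let the clustering reflect more than one technique at once; how the study combined its measurements is not stated in the abstract.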
Affiliation(s)
- Arun S Moorthy
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
| | - Edward P Erisman
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
| | - Anthony J Kearsley
- Mathematical Analysis and Modeling Group, Applied and Computational Mathematics Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
| | - Yuxue Liang
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
| | - Edward Sisco
- Surface and Trace Chemical Analysis Group, Materials Measurement Science Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
| | - William E Wallace
- Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
|
32
|
Ellin NR, Miranda-Quintana RA, Prentice BM. Extended Similarity Methods for Efficient Data Mining in Imaging Mass Spectrometry. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.27.550838. [PMID: 37546817 PMCID: PMC10402165 DOI: 10.1101/2023.07.27.550838] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/08/2023]
Abstract
Imaging mass spectrometry is a label-free imaging modality that allows for the spatial mapping of many compounds directly in tissues. In an imaging mass spectrometry experiment, a raster of the tissue surface produces a mass spectrum at each sampled x, y position, resulting in thousands of individual mass spectra, each comprising a pixel in the resulting ion images. However, efficient analysis of imaging mass spectrometry datasets can be challenging due to the hyperspectral characteristics of the data. Each spectrum contains several thousand unique compounds at discrete m/z values that result in unique ion images, which demands robust and efficient algorithms for searching, statistical analysis, and visualization. Some traditional post-processing techniques are fundamentally ill-equipped to dissect these types of data. For example, while principal component analysis (PCA) has long served as a useful tool for mining imaging mass spectrometry datasets to identify correlated analytes and biological regions of interest, the interpretation of the PCA scores and loadings can be non-trivial. The loadings often contain negative peaks in the PCA-derived pseudo-spectra, which are difficult to ascribe to underlying tissue biology. Herein, we have utilized extended similarity indices to streamline the interpretation of imaging mass spectrometry data. This novel workflow uses PCA as a pixel-selection method to parse out the most and least correlated pixels, which are then compared using the extended similarity indices. The extended similarity indices complement PCA by removing all non-physical artifacts and streamlining the interpretation of large volumes of IMS spectra simultaneously. The linear complexity, O(N), of these indices suggests that large imaging mass spectrometry datasets can be analyzed in a 1:1 scale of time and space with respect to the size of the input data. The extended similarity indices algorithmic workflow is exemplified here by identifying discrete biological regions of mouse brain tissue.
Affiliation(s)
- Nicholas R Ellin
- Department of Chemistry, University of Florida, Gainesville, FL, 32611-7200; USA
| | - Ramón Alain Miranda-Quintana
- Department of Chemistry, University of Florida, Gainesville, FL, 32611-7200; USA
- Quantum Theory Project, University of Florida, Gainesville, FL, 32611-7200; USA
| | - Boone M Prentice
- Department of Chemistry, University of Florida, Gainesville, FL, 32611-7200; USA
|
33
|
Miller A, York E, Stopka S, Martínez-François J, Hossain MA, Baquer G, Regan M, Agar N, Yellen G. Spatially resolved metabolomics and isotope tracing reveal dynamic metabolic responses of dentate granule neurons with acute stimulation. RESEARCH SQUARE 2023:rs.3.rs-2276903. [PMID: 37546759 PMCID: PMC10402263 DOI: 10.21203/rs.3.rs-2276903/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/08/2023]
Abstract
Neuronal activity creates an intense energy demand that must be met by rapid metabolic responses. To investigate metabolic adaptations in the neuron-enriched dentate granule cell (DGC) layer within its native tissue environment, we employed murine acute hippocampal brain slices coupled with fast metabolite preservation, followed by mass spectrometry imaging (MALDI-MSI) to generate spatially resolved metabolomics and isotope tracing data. Here we show that membrane depolarization induces broad metabolic changes, including increased glycolytic activity in DGCs. Increased glucose metabolism in response to stimulation is accompanied by mobilization of endogenous inosine into pentose phosphates, via the action of purine nucleotide phosphorylase (PNP). The PNP reaction is an integral part of the neuronal response to stimulation, as inhibiting PNP leaves DGCs energetically impaired during recovery from strong activation. Performing MSI on brain slices bridges the gap between live cell physiology and the deep chemical analysis enabled by mass spectrometry.
|
34
|
Mehnert S, Davidson JT, Adeoye A, Lowe BD, Ruiz EA, King JR, Jackson GP. Expert Algorithm for Substance Identification Using Mass Spectrometry: Application to the Identification of Cocaine on Different Instruments Using Binary Classification Models. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2023; 34:1235-1247. [PMID: 37254938 PMCID: PMC10326919 DOI: 10.1021/jasms.3c00090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/16/2023] [Revised: 05/09/2023] [Accepted: 05/15/2023] [Indexed: 06/01/2023]
Abstract
This is the second of two manuscripts describing how general linear modeling (GLM) of a selection of the most abundant normalized fragment ion abundances of replicate mass spectra from one laboratory can be used in conjunction with binary classifiers to enable specific and selective identifications with reportable error rates of spectra from other laboratories. Here, the proof-of-concept uses a training set of 128 replicate cocaine spectra from one crime laboratory as the basis of GLM modeling. GLM models for the 20 most abundant fragments of cocaine were then applied to 175 additional test/validation cocaine spectra collected in more than a dozen crime laboratories and 716 known negative spectra, which included 10 spectra of three diastereomers of cocaine. Spectral similarity and dissimilarity between the measured and predicted abundances were assessed using a variety of conventional measures, including the mean absolute residual and NIST's spectral similarity score. For each spectral measure, GLM predictions were compared to the traditional exemplar approach, which used the average of the cocaine training set as the consensus spectrum for comparisons. In unsupervised models, EASI provided better than a 95% true positive rate for cocaine with a 0% false positive rate. A supervised binary logistic regression model provided 100% accuracy and no errors using EASI-predicted abundances of only four peaks at m/z 152, 198, 272, and 303. Regardless of the measure of spectral similarity, error rates for identifications using EASI were superior to the traditional exemplar/consensus approach. As a supervised binary classifier, EASI was more reliable than using Mahalanobis distances.
Affiliation(s)
- Samantha A. Mehnert
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
| | - J. Tyler Davidson
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
| | - Alexandra Adeoye
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
| | - Brandon D. Lowe
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
| | - Emily A. Ruiz
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
| | - Jacob R. King
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
| | - Glen P. Jackson
- Department of Forensic and Investigative Science, West Virginia University, Morgantown, West Virginia 26506, United States
- C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, United States
|
35
|
Cai Y, Zhou Z, Zhu ZJ. Advanced analytical and informatic strategies for metabolite annotation in untargeted metabolomics. Trends Analyt Chem 2022. [DOI: 10.1016/j.trac.2022.116903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
|
36
|
Nogueira-Lima SHC, Gomes PWP, Navegantes-Lima KC, Reis JDE, Carvalho ARV, Pamplona SDGSR, Muribeca ADJB, da Silva MN, Monteiro MC, e Silva CYY. The Roots of Deguelia nitidula as a Natural Antibacterial Source against Staphylococcus aureus Strains. Metabolites 2022; 12:1083. [PMID: 36355166 PMCID: PMC9696647 DOI: 10.3390/metabo12111083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2022] [Revised: 10/08/2022] [Accepted: 10/13/2022] [Indexed: 09/10/2024] Open
Abstract
Deguelia nitidula (Benth.) A.M.G.Azevedo & R.A.Camargo (Fabaceae) is an herbaceous plant distributed in the Brazilian Amazon, and it is called "raiz do sol" (sun roots). On Marajó Island, quilombola communities use its prepared roots to treat skin diseases commonly caused by fungi, viruses, and bacteria. Thus, in this study, the extract and its fractions from D. nitidula roots were used to perform in vitro cytotoxic and antibacterial assays against Staphylococcus aureus strains. Thereafter, liquid chromatography-mass spectrometry (LC-MS) was used for the metabolite annotation process. The ethanolic extract of D. nitidula roots shows significant bactericidal activity against S. aureus, with an IC50 of 82 μg.mL-1 and a selectivity index (SI) of 21.35. Furthermore, the SREFr2 and SREFr3 fractions show potent bactericidal activity, i.e., an MIC of 46.8 μg.mL-1 for both, and MBCs of 375 and 93.7 μg.mL-1, respectively. SREFr3 shows safe and effective antibacterial activity, mainly with respect to its excellent selectivity index (SI = 82.06). On the other hand, SREFr2 shows low selectivity (SI = 6.8), which characterizes it as not safe for therapeutic use. However, because of the limited number of reference MS2 spectra in public libraries, a complete metabolite annotation has not yet been possible. Despite that, our antibacterial results for SREFr3 and correlated substructures of amino acid derivatives show that the roots of D. nitidula are a natural source of specialized metabolites, which can be isolated in the future and then used as a support for further bio-guided research, as well as natural drug development.
Affiliation(s)
| | - Paulo Wender P. Gomes
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, San Diego, CA 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA 92093, USA
| | - Kely C. Navegantes-Lima
- Institute of Health Sciences, Postgraduate Program in Neuroscience and Molecular Biology, Federal University of Pará, Belém 66075-110, Brazil
| | - José Diogo E. Reis
- Institute of Exact and Natural Sciences, Postgraduate Program in Chemistry, Federal University of Pará, Belém 66075-110, Brazil
| | - Alice Rhelly Veloso Carvalho
- Institute of Health Sciences, Faculty of Pharmaceutical Sciences, Federal University of Pará, Belém 66075-110, Brazil
| | | | - Abraão de Jesus B. Muribeca
- Institute of Exact and Natural Sciences, Postgraduate Program in Chemistry, Federal University of Pará, Belém 66075-110, Brazil
| | - Milton N. da Silva
- Institute of Exact and Natural Sciences, Postgraduate Program in Chemistry, Federal University of Pará, Belém 66075-110, Brazil
| | - Marta C. Monteiro
- Institute of Health Sciences, Postgraduate Program in Pharmaceutical Sciences, Federal University of Pará, Belém 66075-110, Brazil
- Institute of Health Sciences, Postgraduate Program in Neuroscience and Molecular Biology, Federal University of Pará, Belém 66075-110, Brazil
| | - Consuelo Yumiko Yoshioka e Silva
- Institute of Health Sciences, Postgraduate Program in Pharmaceutical Sciences, Federal University of Pará, Belém 66075-110, Brazil
|