1. van Herwerden D, O’Brien JW, Lege S, Pirok BWJ, Thomas KV, Samanipour S. Cumulative Neutral Loss Model for Fragment Deconvolution in Electrospray Ionization High-Resolution Mass Spectrometry Data. Anal Chem 2023; 95:12247-12255. [PMID: 37549176 PMCID: PMC10448439 DOI: 10.1021/acs.analchem.3c00896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 07/03/2023] [Indexed: 08/09/2023]
Abstract
Clean high-resolution mass spectra (HRMS) are essential to a successful structural elucidation of an unknown feature during nontarget analysis (NTA) workflows. This is a crucial step, particularly for the spectra generated during data-independent acquisition or during direct infusion experiments. The most commonly available tools only take advantage of the time domain for spectral cleanup. Here, we present an algorithm that combines the time domain and mass domain information to perform spectral deconvolution. The algorithm employs a probability-based cumulative neutral loss (CNL) model for fragment deconvolution. The optimized model, with a mass tolerance of 0.005 Da and a scoreCNL threshold of 0.00, was able to achieve a true positive rate (TPr) of 95.0%, a false discovery rate (FDr) of 20.6%, and a reduction rate of 35.4%. Additionally, the CNL model was extensively tested on real samples containing predominantly pesticides at different concentration levels and with matrix effects. Overall, the model was able to obtain a TPr above 88.8% with FD rates between 33 and 79% and reduction rates between 9 and 45%. Finally, the CNL model was compared with the retention time difference method and peak shape correlation analysis, showing that a combination of correlation analysis and the CNL model was the most effective for fragment deconvolution, obtaining a TPr of 84.7%, an FDr of 54.4%, and a reduction rate of 51.0%.
Affiliation(s)
- Denice van Herwerden
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Jake W. O’Brien
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Sascha Lege
  - Agilent Technologies Deutschland GmbH, Waldbronn 76337, Germany
- Bob W. J. Pirok
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
- Kevin V. Thomas
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
- Saer Samanipour
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1012 WX, The Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Brisbane 4102, Australia
  - UvA Data Science Center, University of Amsterdam, Amsterdam 1012 WP, The Netherlands
2. Du X, Dastmalchi F, Ye H, Garrett TJ, Diller MA, Liu M, Hogan WR, Brochhausen M, Lemas DJ. Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software. Metabolomics 2023; 19:11. [PMID: 36745241 DOI: 10.1007/s11306-023-01974-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/08/2022] [Accepted: 01/20/2023] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles (Findability, Accessibility, Interoperability, and Reusability) were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR principles were extended to include Research Software (FAIR4RS). AIM OF REVIEW: This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software. KEY SCIENTIFIC CONCEPTS OF REVIEW: We evaluated 124 LC-HRMS metabolomics data processing software tools obtained from a systematic review and selected 61 for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature along with internal discussions. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfillment of software were 21.6%, 47.7%, and 71.8%. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria that covered multiple FAIR4RS categories but had a low percentage of fulfillment: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered to Zenodo and received DOIs; (3) only 14.5% of selected software had official software containerization or a virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. Based on the results, we discussed improvement strategies and future directions.
Affiliation(s)
- Xinsong Du
  - Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
- Farhad Dastmalchi
  - Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
- Hao Ye
  - Health Science Center Libraries, University of Florida, Florida, USA
- Timothy J Garrett
  - Department of Pathology, Immunology and Laboratory Medicine, College of Medicine, University of Florida, Florida, USA
- Matthew A Diller
  - Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
- Mei Liu
  - Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
- William R Hogan
  - Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
- Mathias Brochhausen
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, USA
- Dominick J Lemas
  - Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
  - Department of Obstetrics and Gynecology, University of Florida College of Medicine, Florida, Gainesville, United States
  - Center for Perinatal Outcomes Research, University of Florida College of Medicine, Gainesville, United States
3. Tian Z, Liu F, Li D, Fernie AR, Chen W. Strategies for structure elucidation of small molecules based on LC–MS/MS data from complex biological samples. Comput Struct Biotechnol J 2022; 20:5085-5097. [PMID: 36187931 PMCID: PMC9489805 DOI: 10.1016/j.csbj.2022.09.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/18/2022] [Revised: 09/03/2022] [Accepted: 09/03/2022] [Indexed: 11/06/2022] Open
Abstract
LC–MS/MS is a major analytical platform for metabolomics, which has become a recent hotspot in the research fields of life and environmental sciences. However, structure elucidation of small molecules based on LC–MS/MS data remains a major challenge in the chemical and biological interpretation of untargeted metabolomics datasets. In recent years, several strategies for structure elucidation using LC–MS/MS data from complex biological samples have been proposed. These strategies can be broadly categorized into two types: one based on structure annotation of mass spectra and the other on retention time prediction. These strategies have helped many scientists conduct research in metabolite-related fields and are indispensable for the development of future tools. Here, we summarize the characteristics of the current tools and strategies for structure elucidation of small molecules based on LC–MS/MS data, and further discuss the directions and perspectives to improve the power of these tools and strategies.
4. Sementé L, Baquer G, García-Altares M, Correig-Blanchar X, Ràfols P. rMSIannotation: A peak annotation tool for mass spectrometry imaging based on the analysis of isotopic intensity ratios. Anal Chim Acta 2021; 1171:338669. [PMID: 34112434 DOI: 10.1016/j.aca.2021.338669] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/22/2021] [Revised: 05/12/2021] [Accepted: 05/20/2021] [Indexed: 11/15/2022]
Abstract
Mass spectrometry imaging (MSI) datasets consist of spatially located spectra with thousands of peaks. Only a fraction of these peaks corresponds to unique monoisotopic peaks, as mass spectra include isotopes, adducts and fragments of compounds. Current peak annotation solutions depend on matching MS features to compound libraries. We present rMSIannotation, a peak annotation algorithm to annotate carbon isotopes and adducts in metabolomics and lipidomics imaging mass spectrometry datasets without using supporting libraries. rMSIannotation measures and evaluates the intensity ratio between carbon isotopic peaks and models their distribution across the m/z axis of the compounds in the Human Metabolome Database. Monoisotopic peak selection is based on the isotopic likelihood score (ILS) made of three components: image morphology correlation, validation of isotopic intensity ratios, and peak centroid mass deviation. rMSIannotation proposes pairs of peaks that can be adducts based on three scores: isotopic pattern coherence, image correlation and mass error. We validated rMSIannotation with three MALDI-MSI datasets which were manually annotated by experts, and compared the annotations obtained with rMSIannotation and with the METASPACE annotation platform. rMSIannotation replicated more than 90% of the manual annotation reported in FT-ICR datasets and expanded the list of annotated compounds with additional monoisotopic peaks and neutral masses. Finally, we evaluated isotopic peak annotation as a data reduction method for MSI by comparing the results of PCA and k-means segmentation before and after removing non-monoisotopic peaks. The results show that monoisotopic peaks retain most of the biological variance in the dataset.
Affiliation(s)
- Lluc Sementé
  - University Rovira I Virgili, Department of Electronic Engineering, Tarragona, Spain
- Gerard Baquer
  - University Rovira I Virgili, Department of Electronic Engineering, Tarragona, Spain
- María García-Altares
  - University Rovira I Virgili, Department of Electronic Engineering, Tarragona, Spain
  - Spanish Biomedical Research Centre in Diabetes and Associated Metabolic Disorders (CIBERDEM), 28029, Madrid, Spain
- Xavier Correig-Blanchar
  - University Rovira I Virgili, Department of Electronic Engineering, Tarragona, Spain
  - Spanish Biomedical Research Centre in Diabetes and Associated Metabolic Disorders (CIBERDEM), 28029, Madrid, Spain
  - Institut D'Investigació Sanitària Pere Virgili, Tarragona, Spain
- Pere Ràfols
  - University Rovira I Virgili, Department of Electronic Engineering, Tarragona, Spain
  - Spanish Biomedical Research Centre in Diabetes and Associated Metabolic Disorders (CIBERDEM), 28029, Madrid, Spain
  - Institut D'Investigació Sanitària Pere Virgili, Tarragona, Spain
5. Li D, Gaquerel E. Next-Generation Mass Spectrometry Metabolomics Revives the Functional Analysis of Plant Metabolic Diversity. Annu Rev Plant Biol 2021; 72:867-891. [PMID: 33781077 DOI: 10.1146/annurev-arplant-071720-114836] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Indexed: 06/12/2023]
Abstract
The remarkable diversity of specialized metabolites produced by plants has inspired several decades of research and nucleated a long list of theories to guide empirical ecological studies. However, analytical constraints and the lack of untargeted processing workflows have long precluded comprehensive metabolite profiling and, consequently, the collection of the critical currencies to test theory predictions for the ecological functions of plant metabolic diversity. Developments in mass spectrometry (MS) metabolomics have revolutionized the large-scale inventory and annotation of chemicals from biospecimens. Hence, the next generation of MS metabolomics propelled by new bioinformatics developments provides a long-awaited framework to revisit metabolism-centered ecological questions, much like the advances in next-generation sequencing of the last two decades impacted all research horizons in genomics. Here, we review advances in plant (computational) metabolomics to foster hypothesis formulation from complex metabolome data. Additionally, we reflect on how next-generation metabolomics could reinvigorate the testing of long-standing theories on plant metabolic diversity.
Affiliation(s)
- Dapeng Li
  - Department of Molecular Ecology, Max Planck Institute for Chemical Ecology, 07745 Jena, Germany
- Emmanuel Gaquerel
  - Institut de Biologie Moléculaire des Plantes du CNRS, Université de Strasbourg, 67084 Strasbourg, France
6. Kouřil Š, de Sousa J, Václavík J, Friedecký D, Adam T. CROP: correlation-based reduction of feature multiplicities in untargeted metabolomic data. Bioinformatics 2020; 36:2941-2942. [PMID: 31930393 DOI: 10.1093/bioinformatics/btaa012] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 08/05/2019] [Revised: 01/02/2020] [Accepted: 01/10/2020] [Indexed: 11/12/2022] Open
Abstract
SUMMARY Untargeted liquid chromatography-high-resolution mass spectrometry analysis produces a large number of features which correspond to the potential compounds in the sample that is analyzed. During the data processing, it is necessary to merge features associated with one compound to prevent multiplicities in the data and possible misidentification. The processing tools that are currently employed use complex algorithms to detect abundances, such as adducts or isotopes. However, most of them are not able to deal with unpredictable adducts and in-source fragments. We introduce a simple open-source R-script CROP based on Pearson pairwise correlations and retention time together with a graphical representation of the correlation network to remove these redundant features. AVAILABILITY AND IMPLEMENTATION The CROP R-script is available online at www.github.com/rendju/CROP under GNU GPL. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Štěpán Kouřil
  - Laboratory of Metabolomics, Institute of Molecular and Translational Medicine, Palacký University Olomouc, Olomouc 779 00, Czech Republic
  - Department of Clinical Biochemistry, University Hospital Olomouc, Olomouc 779 00, Czech Republic
- Julie de Sousa
  - Laboratory of Metabolomics, Institute of Molecular and Translational Medicine, Palacký University Olomouc, Olomouc 779 00, Czech Republic
  - Department of Mathematical Analysis and Applications of Mathematics, Faculty of Science, Palacký University Olomouc, Olomouc 779 00, Czech Republic
- Jan Václavík
  - Laboratory of Metabolomics, Institute of Molecular and Translational Medicine, Palacký University Olomouc, Olomouc 779 00, Czech Republic
- David Friedecký
  - Laboratory of Metabolomics, Institute of Molecular and Translational Medicine, Palacký University Olomouc, Olomouc 779 00, Czech Republic
  - Department of Clinical Biochemistry, University Hospital Olomouc, Olomouc 779 00, Czech Republic
- Tomáš Adam
  - Laboratory of Metabolomics, Institute of Molecular and Translational Medicine, Palacký University Olomouc, Olomouc 779 00, Czech Republic
  - Department of Clinical Biochemistry, University Hospital Olomouc, Olomouc 779 00, Czech Republic
7. Kachman M, Habra H, Duren W, Wigginton J, Sajjakulnukit P, Michailidis G, Burant C, Karnovsky A. Deep annotation of untargeted LC-MS metabolomics data with Binner. Bioinformatics 2020; 36:1801-1806. [PMID: 31642507 DOI: 10.1093/bioinformatics/btz798] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 03/31/2019] [Revised: 06/20/2019] [Accepted: 10/22/2019] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION When metabolites are analyzed by electrospray ionization (ESI)-mass spectrometry, they are usually detected as multiple ion species due to the presence of isotopes, adducts and in-source fragments. The signals generated by these degenerate features (along with contaminants and other chemical noise) obscure meaningful patterns in MS data, complicating both compound identification and downstream statistical analysis. To address this problem, we developed Binner, a new tool for the discovery and elimination of many degenerate feature signals typically present in untargeted ESI-LC-MS metabolomics data. RESULTS Binner generates feature annotations and provides tools to help users visualize informative feature relationships that can further elucidate the underlying structure of the data. To demonstrate the utility of Binner and to evaluate its performance, we analyzed data from reversed phase LC-MS and hydrophilic interaction chromatography (HILIC) platforms and demonstrated the accuracy of selected annotations using MS/MS. When we compared Binner annotations of 75 compounds previously identified in human plasma samples with annotations generated by three similar tools, we found that Binner achieves superior performance in the number and accuracy of annotations while simultaneously minimizing the number of incorrectly annotated principal ions. Data reduction and pattern exploration with Binner have allowed us to catalog a number of previously unrecognized complex adducts and neutral losses generated during the ionization of molecules in LC-MS. In summary, Binner allows users to explore patterns in their data and to efficiently and accurately eliminate a significant number of the degenerate features typically found in various LC-MS modalities. AVAILABILITY AND IMPLEMENTATION Binner is written in Java and is freely available from http://binner.med.umich.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Maureen Kachman
  - Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Hani Habra
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- William Duren
  - Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Janis Wigginton
  - Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Peter Sajjakulnukit
  - Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- George Michailidis
  - Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Statistics, University of Florida, Gainesville, FL 32611, USA
- Charles Burant
  - Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Alla Karnovsky
  - Michigan Regional Comprehensive Metabolomics Resource Core, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
8. Lu W, Xing X, Wang L, Chen L, Zhang S, McReynolds MR, Rabinowitz JD. Improved Annotation of Untargeted Metabolomics Data through Buffer Modifications That Shift Adduct Mass and Intensity. Anal Chem 2020; 92:11573-11581. [PMID: 32614575 PMCID: PMC7484094 DOI: 10.1021/acs.analchem.0c00985] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/31/2022]
Abstract
Annotation of untargeted high-resolution full-scan LC-MS metabolomics data remains challenging due to individual metabolites generating multiple LC-MS peaks arising from isotopes, adducts, and fragments. Adduct annotation is a particular challenge, as the same mass difference between peaks can arise from adduct formation, fragmentation, or different biological species. To address this, here we describe a buffer modification workflow (BMW) in which the same sample is run by LC-MS in both liquid chromatography solvent with 14NH3-acetate buffer and in solvent with the buffer modified with 15NH3-formate. Buffer switching results in characteristic mass and signal intensity changes for adduct peaks, facilitating their annotation. This relatively simple and convenient chromatography modification annotated yeast metabolomics data with similar effectiveness to growing the yeast in isotope-labeled media. Application to mouse liver data annotated both known metabolite and known adduct peaks with 95% accuracy. Overall, it identified 26% of ∼27 000 liver LC-MS features as putative metabolites, of which ∼2600 showed HMDB or KEGG database formula match. This workflow is well suited to biological samples that cannot be readily isotope labeled, including plants, mammalian tissues, and tumors.
Affiliation(s)
- Wenyun Lu
  - Lewis Sigler Institute for Integrative Genomics and Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Xi Xing
  - Lewis Sigler Institute for Integrative Genomics and Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Lin Wang
  - Lewis Sigler Institute for Integrative Genomics and Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Li Chen
  - Lewis Sigler Institute for Integrative Genomics and Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Sisi Zhang
  - Lewis Sigler Institute for Integrative Genomics and Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Melanie R McReynolds
  - Lewis Sigler Institute for Integrative Genomics and Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Joshua D Rabinowitz
  - Lewis Sigler Institute for Integrative Genomics and Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
9. Senan O, Aguilar-Mogas A, Navarro M, Capellades J, Noon L, Burks D, Yanes O, Guimerà R, Sales-Pardo M. CliqueMS: a computational tool for annotating in-source metabolite ions from LC-MS untargeted metabolomics data based on a coelution similarity network. Bioinformatics 2020; 35:4089-4097. [PMID: 30903689 PMCID: PMC6792096 DOI: 10.1093/bioinformatics/btz207] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 04/26/2018] [Revised: 01/30/2019] [Accepted: 03/21/2019] [Indexed: 11/26/2022] Open
Abstract
Motivation The analysis of biological samples in untargeted metabolomic studies using LC-MS yields tens of thousands of ion signals. Annotating these features is of the utmost importance for answering questions as fundamental as, e.g. how many metabolites are there in a given sample. Results Here, we introduce CliqueMS, a new algorithm for annotating in-source LC-MS1 data. CliqueMS is based on the similarity between coelution profiles and therefore, as opposed to most methods, allows for the annotation of a single spectrum. Furthermore, CliqueMS improves upon the state of the art in several dimensions: (i) it uses a more discriminatory feature similarity metric; (ii) it treats the similarities between features in a transparent way by means of a simple generative model; (iii) it uses a well-grounded maximum likelihood inference approach to group features; (iv) it uses empirical adduct frequencies to identify the parental mass and (v) it deals more flexibly with the identification of the parental mass by proposing and ranking alternative annotations. We validate our approach with simple mixtures of standards and with real complex biological samples. CliqueMS reduces the thousands of features typically obtained in complex samples to hundreds of metabolites, and it is able to correctly annotate more metabolites and adducts from a single spectrum than available tools. Availability and implementation https://CRAN.R-project.org/package=cliqueMS and https://github.com/osenan/cliqueMS. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Oriol Senan
  - Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona, Spain
- Antoni Aguilar-Mogas
  - Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona, Spain
- Miriam Navarro
  - Department of Electronic Engineering, Metabolomics Platform, IISPV, Universitat Rovira i Virgili, Tarragona, Spain
  - CIBER of Diabetes and Associated Metabolic Diseases (CIBERDEM), Madrid, Spain
- Jordi Capellades
  - Department of Electronic Engineering, Metabolomics Platform, IISPV, Universitat Rovira i Virgili, Tarragona, Spain
  - CIBER of Diabetes and Associated Metabolic Diseases (CIBERDEM), Madrid, Spain
- Luke Noon
  - CIBER of Diabetes and Associated Metabolic Diseases (CIBERDEM), Madrid, Spain
  - Centro de Investigación Príncipe Felipe, Valencia, Spain
- Deborah Burks
  - CIBER of Diabetes and Associated Metabolic Diseases (CIBERDEM), Madrid, Spain
  - Centro de Investigación Príncipe Felipe, Valencia, Spain
- Oscar Yanes
  - Department of Electronic Engineering, Metabolomics Platform, IISPV, Universitat Rovira i Virgili, Tarragona, Spain
  - CIBER of Diabetes and Associated Metabolic Diseases (CIBERDEM), Madrid, Spain
- Roger Guimerà
  - Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona, Spain
  - ICREA, Barcelona, Spain
- Marta Sales-Pardo
  - Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona, Spain
10.
Abstract
Untargeted metabolomics aims to quantify the complete set of metabolites within a biological system, most commonly by liquid chromatography/mass spectrometry (LC/MS). Since nearly the inception of the field, compound identification has been widely recognized as the rate-limiting step of the experimental workflow. In spite of exponential increases in the size of metabolomic databases, which now contain experimental MS/MS spectra for over half a million reference compounds, chemical structures still cannot be confidently assigned to many signals in a typical LC/MS dataset. The purpose of this Perspective is to consider why identification rates continue to be low in untargeted metabolomics. One rationalization is that many naturally occurring metabolites detected by LC/MS are true "novel" compounds that have yet to be incorporated into metabolomic databases. An alternative possibility, however, is that research data do not provide database matches because of informatic artifacts, chemical contaminants, and signal redundancies. Increasing evidence suggests that, for at least some sample types, many unidentifiable signals in untargeted metabolomics result from the latter rather than from new compounds originating from the specimen being measured. The implications of these observations for chemical discovery in untargeted metabolomics are discussed.
Affiliation(s)
- Miriam Sindelar
  - Department of Chemistry, Washington University in St. Louis, St. Louis, MO, USA
  - Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Gary J. Patti
  - Department of Chemistry, Washington University in St. Louis, St. Louis, MO, USA
  - Department of Medicine, Washington University in St. Louis, St. Louis, MO, USA
  - Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
11. Naake T, Gaquerel E, Fernie AR. Annotation of Specialized Metabolites from High-Throughput and High-Resolution Mass Spectrometry Metabolomics. Methods Mol Biol 2020; 2104:209-225. [PMID: 31953820 DOI: 10.1007/978-1-0716-0239-3_12] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/22/2023]
Abstract
High-throughput mass spectrometry (MS) metabolomics profiling of highly complex samples allows the comprehensive detection of hundreds to thousands of metabolites under a given condition and point in time and produces information-rich data sets on known and unknown metabolites. One of the main challenges is the identification and annotation of metabolites from these complex data sets, since the number of authentic standards available for specialized metabolites is far lower than the number of mass spectral features. Previously, we reported two novel tools, MetNet and MetCirc, for putative annotation and structural prediction of unknown metabolites using known metabolites as baits. MetNet employs differences between m/z values of MS1 features, which correspond to metabolic transformations, and statistical associations, while MetCirc uses MS/MS features as input and calculates similarity scores of aligned spectra between features to guide the annotation of metabolites. Here, we showcase the use of MetNet and MetCirc to putatively annotate metabolites and provide detailed instructions on how they can be used. While our case studies are from plants, the tools find equal utility in studies on bacterial, fungal, or mammalian xenobiotic samples.
Affiliation(s)
- Thomas Naake
  - Central Metabolism, Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- Emmanuel Gaquerel
  - Institute of Plant Molecular Biology, University of Strasbourg, Strasbourg, France
  - Centre for Organismal Studies, University of Heidelberg, Heidelberg, Germany
- Alisdair R Fernie
  - Central Metabolism, Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
12. Ivanisevic J, Want EJ. From Samples to Insights into Metabolism: Uncovering Biologically Relevant Information in LC-HRMS Metabolomics Data. Metabolites 2019; 9:metabo9120308. [PMID: 31861212 PMCID: PMC6950334 DOI: 10.3390/metabo9120308] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 11/04/2019] [Revised: 12/09/2019] [Accepted: 12/12/2019] [Indexed: 12/31/2022] Open
Abstract
Untargeted metabolomics (including lipidomics) is a holistic approach to biomarker discovery and mechanistic insights into disease onset and progression, and response to intervention. Each step of the analytical and statistical pipeline is crucial for the generation of high-quality, robust data. Metabolite identification remains the bottleneck in these studies; therefore, confidence in the data produced is paramount in order to maximize the biological output. Here, we outline the key steps of the metabolomics workflow and provide details on important parameters and considerations. Studies should be designed carefully to ensure appropriate statistical power and adequate controls. Subsequent sample handling and preparation should avoid the introduction of bias, which can significantly affect downstream data interpretation. It is not possible to cover the entire metabolome with a single platform; therefore, the analytical platform should reflect the biological sample under investigation and the question(s) under consideration. The large, complex datasets produced need to be pre-processed in order to extract meaningful information. Finally, the most time-consuming steps are metabolite identification, as well as metabolic pathway and network analysis. Here we discuss some widely used tools and the pitfalls of each step of the workflow, with the ultimate aim of guiding the reader towards the most efficient pipeline for their metabolomics studies.
Affiliation(s)
- Julijana Ivanisevic
- Metabolomics Platform, Faculty of Biology and Medicine, University of Lausanne, Rue du Bugnon 19, 1005 Lausanne, Switzerland
- Correspondence: (J.I.); (E.J.W.)
| | - Elizabeth J. Want
- Section of Biomolecular Medicine, Department of Metabolism, Digestion and Reproduction, Faculty of Medicine, Imperial College London, London SW7 2AZ, UK
- Correspondence: (J.I.); (E.J.W.)
| |
|
13
|
Analytic Correlation Filtration: A New Tool to Reduce Analytical Complexity of Metabolomic Datasets. Metabolites 2019; 9:metabo9110250. [PMID: 31653057 PMCID: PMC6918187 DOI: 10.3390/metabo9110250] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/03/2019] [Revised: 10/21/2019] [Accepted: 10/22/2019] [Indexed: 11/16/2022] Open
Abstract
Metabolomics generates massive and complex data. The redundancy among analytical species and the high degree of correlation in datasets constrain the use of data mining/statistical methods and complicate interpretation. In this context, we developed a new tool to detect analytical correlation in datasets without confounding it with biological correlations. Based on several parameters, such as a similarity measure, retention time, and mass information from known isotopes, adducts, or fragments, the algorithm groups features coming from the same analyte and proposes a single representative per group. To illustrate the functionalities and added value of this tool, it was applied to published datasets and compared to one of the most commonly used free packages proposing a grouping method for metabolomics data: 'CAMERA'. This tool was developed to be included in Galaxy and will be available in Workflow4Metabolomics (http://workflow4metabolomics.org). Source code is freely available for download under the CeCILL 2.1 license at https://services.pfem.clermont.inra.fr/gitlab/grandpa/tool-acf and is implemented in Perl.
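The grouping principle in this abstract (features that co-elute and correlate strongly across samples are treated as one analyte, and one representative is kept per group) can be sketched as follows. This is an illustrative Python sketch, not the published Perl tool: the function names, the thresholds `rt_tol` and `min_corr`, and the choice of the most intense feature as representative are all assumptions, and the mass-based rules for isotopes, adducts, and fragments are left out.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / sqrt(sxx * syy)


def group_features(rt, intensities, rt_tol=0.1, min_corr=0.9):
    """Assign a group label to each feature (union-find over feature pairs
    that are close in retention time AND highly correlated across samples)."""
    n = len(rt)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            close = abs(rt[i] - rt[j]) <= rt_tol
            if close and pearson(intensities[i], intensities[j]) >= min_corr:
                parent[find(j)] = find(i)
    return [find(i) for i in range(n)]


def representatives(labels, intensities):
    """Pick the feature with the highest mean intensity in each group."""
    best = {}
    for idx, label in enumerate(labels):
        mean = sum(intensities[idx]) / len(intensities[idx])
        if label not in best or mean > best[label][0]:
            best[label] = (mean, idx)
    return sorted(idx for _, idx in best.values())
```

Requiring both retention-time proximity and correlation is what keeps co-regulated but chromatographically distinct metabolites (a biological correlation) from being merged with ions of the same analyte (an analytical correlation).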
|
14
|
Stanstrup J, Broeckling CD, Helmus R, Hoffmann N, Mathé E, Naake T, Nicolotti L, Peters K, Rainer J, Salek RM, Schulze T, Schymanski EL, Stravs MA, Thévenot EA, Treutler H, Weber RJM, Willighagen E, Witting M, Neumann S. The metaRbolomics Toolbox in Bioconductor and beyond. Metabolites 2019; 9:E200. [PMID: 31548506 PMCID: PMC6835268 DOI: 10.3390/metabo9100200] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/11/2019] [Revised: 09/16/2019] [Accepted: 09/17/2019] [Indexed: 11/17/2022] Open
Abstract
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.
Affiliation(s)
- Jan Stanstrup
- Preventive and Clinical Nutrition, University of Copenhagen, Rolighedsvej 30, 1958 Frederiksberg C, Denmark.
| | - Corey D Broeckling
- Proteomics and Metabolomics Facility, Colorado State University, Fort Collins, CO 80523, USA.
| | - Rick Helmus
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1098 XH Amsterdam, The Netherlands.
| | - Nils Hoffmann
- Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Otto-Hahn-Straße 6b, 44227 Dortmund, Germany.
| | - Ewy Mathé
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
| | - Thomas Naake
- Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
| | - Luca Nicolotti
- The Australian Wine Research Institute, Metabolomics Australia, PO Box 197, Adelaide SA 5064, Australia.
| | - Kristian Peters
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Johannes Rainer
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, 39100 Bolzano, Italy.
| | - Reza M Salek
- The International Agency for Research on Cancer, 150 cours Albert Thomas, CEDEX 08, 69372 Lyon, France.
| | - Tobias Schulze
- Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research-UFZ, Permoserstraße 15, 04318 Leipzig, Germany.
| | - Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg.
| | - Michael A Stravs
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dubendorf, Switzerland.
| | - Etienne A Thévenot
- CEA, LIST, Laboratory for Data Sciences and Decision, MetaboHUB, Gif-Sur-Yvette F-91191, France.
| | - Hendrik Treutler
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Ralf J M Weber
- Phenome Centre Birmingham and School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, The Netherlands.
| | - Michael Witting
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, 85764 Neuherberg, Germany.
- Chair of Analytical Food Chemistry, Technische Universität München, 85354 Weihenstephan, Germany.
| | - Steffen Neumann
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig Deutscher, Platz 5e, 04103 Leipzig, Germany.
| |
|
15
|
Lynn KS, Cheng ML, Yang HC, Liang YJ, Kang MJ, Chen FL, Shiao MS, Pan WH. Vegetable Signatures Derived from Human Urinary Metabolomic Data in Controlled Feeding Studies. J Proteome Res 2019; 18:159-168. [PMID: 30517004 DOI: 10.1021/acs.jproteome.8b00470] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
Abstract
Examination of changes in urinary metabolomic profiles after vegetable ingestion may lead to new methods of assessing plant food intake. In this regard, we developed a proof-of-principle methodology to identify urinary metabolomic signatures for spinach, celery, and onion. Three feeding studies were conducted. In the first study, healthy individuals were fed with spinach, celery, onion, and no vegetables in four separate experiments with pooled urinary samples for metabolite discovery. The same protocol was used to validate the findings at the individual level in the second study and when feeding all three vegetables simultaneously in the third study. An LC-MS-based metabolomics approach was adopted to search for indicative metabolites from urine samples collected during multiple time periods before and after the meal. Consequently, a total of 1, 9, and 3 nonoverlapping urinary metabolites were associated with the intake of spinach, celery, and onion, respectively. The PCA signature of these metabolites followed a similar "time cycle" pattern, which peaked at approximately 2-4 h after intake. In addition, the metabolite profiles for the same vegetable were consistent across samples, regardless of whether it was consumed individually or in combination. The developed methodology, along with the identified urinary metabolomic signatures, is a potential tool for assessing plant food intake.
Affiliation(s)
- Ke-Shiuan Lynn
- Department of Mathematics, Fu Jen Catholic University, New Taipei City 24205, Taiwan
| | - Mei-Ling Cheng
- Department of Biomedical Sciences, College of Medicine, Chang Gung University, Taoyuan 33302, Taiwan.,Metabolomics Core Laboratory, Healthy Aging Research Center, Chang Gung University, Taoyuan 33302, Taiwan.,Clinical Metabolomics Core Laboratory, Chang Gung Memorial Hospital, Taoyuan 33305, Taiwan
| | - Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan
| | - Yu-Jen Liang
- Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan
| | - Mei-Jyh Kang
- Institute of Biomedical Sciences, Academia Sinica, Taipei 11529, Taiwan
| | - Fong-Ling Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei 11529, Taiwan
| | - Ming-Shi Shiao
- Department of Biomedical Sciences, College of Medicine, Chang Gung University, Taoyuan 33302, Taiwan
| | - Wen-Harn Pan
- Institute of Biomedical Sciences, Academia Sinica, Taipei 11529, Taiwan.,Institute of Population Health Sciences, National Health Research Institutes, Miaoli 35053, Taiwan
| |
|
16
|
Tebani A, Afonso C, Bekri S. Advances in metabolome information retrieval: turning chemistry into biology. Part II: biological information recovery. J Inherit Metab Dis 2018; 41:393-406. [PMID: 28842777 PMCID: PMC5959951 DOI: 10.1007/s10545-017-0080-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/13/2017] [Revised: 07/27/2017] [Accepted: 07/28/2017] [Indexed: 12/11/2022]
Abstract
This work is the second part of a review of the state of the art in major metabolic phenotyping strategies. It deals in particular with the inherent advantages and limits of data analysis and biological information retrieval tools, along with translational challenges. This part starts by introducing the main preprocessing strategies for the different types of metabolomics data. Then, it describes the main data analysis techniques, including univariate and multivariate aspects. It also addresses the challenges related to metabolite annotation and characterization. Finally, functional analysis, including pathway and network strategies, is discussed. The last section of this review is devoted to practical considerations, current challenges, and pathways to bring metabolomics into clinical environments.
Affiliation(s)
- Abdellah Tebani
- Department of Metabolic Biochemistry, Rouen University Hospital, 76000, Rouen, France
- Normandie Université, UNIROUEN, CHU Rouen, IRIB, INSERM U1245, 76000, Rouen, France
- Normandie Université, UNIROUEN, INSA Rouen, CNRS, COBRA, 76000, Rouen, France
| | - Carlos Afonso
- Normandie Université, UNIROUEN, INSA Rouen, CNRS, COBRA, 76000, Rouen, France
| | - Soumeya Bekri
- Department of Metabolic Biochemistry, Rouen University Hospital, 76000, Rouen, France.
- Normandie Université, UNIROUEN, CHU Rouen, IRIB, INSERM U1245, 76000, Rouen, France.
| |
|
17
|
Basu S, Duren W, Evans CR, Burant CF, Michailidis G, Karnovsky A. Sparse network modeling and metscape-based visualization methods for the analysis of large-scale metabolomics data. Bioinformatics 2018; 33:1545-1553. [PMID: 28137712 DOI: 10.1093/bioinformatics/btx012] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Received: 04/26/2016] [Accepted: 01/11/2017] [Indexed: 02/01/2023] Open
Abstract
Motivation: Recent technological advances in mass spectrometry, development of richer mass spectral libraries and data processing tools have enabled large scale metabolic profiling. Biological interpretation of metabolomics studies heavily relies on knowledge-based tools that contain information about metabolic pathways. Incomplete coverage of different areas of metabolism and lack of information about non-canonical connections between metabolites limit the scope of applications of such tools. Furthermore, the presence of a large number of unknown features, which cannot be readily identified, but nonetheless can represent bona fide compounds, also considerably complicates biological interpretation of the data. Results: Leveraging recent developments in the statistical analysis of high-dimensional data, we developed a new Debiased Sparse Partial Correlation algorithm (DSPC) for estimating partial correlation networks and implemented it as a Java-based CorrelationCalculator program. We also introduce a new version of our previously developed tool Metscape that enables building and visualization of correlation networks. We demonstrate the utility of these tools by constructing biologically relevant networks and in aiding identification of unknown compounds. Availability and Implementation: http://metscape.med.umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
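To make the notion of a partial correlation network concrete, the sketch below computes ordinary partial correlations from the inverse of the sample covariance matrix. It is deliberately not the DSPC algorithm: no sparsity penalty or debiasing step is applied, so it only works when there are many more samples than variables. The function name `partial_correlations` and the simulated chain a → b → c are assumptions for illustration.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation matrix for data of shape (n_samples, n_variables).

    Uses the classical identity pcor_ij = -P_ij / sqrt(P_ii * P_jj),
    where P is the inverse of the sample covariance matrix.
    """
    cov = np.cov(np.asarray(data, dtype=float), rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


if __name__ == "__main__":
    # Simulated chain a -> b -> c: a and c correlate marginally,
    # but their partial correlation given b should be close to zero.
    rng = np.random.default_rng(0)
    a = rng.normal(size=2000)
    b = a + 0.5 * rng.normal(size=2000)
    c = b + 0.5 * rng.normal(size=2000)
    print(np.round(partial_correlations(np.column_stack([a, b, c])), 2))
```

The example shows why partial correlations are preferred for network construction: the indirect a–c association disappears once b is conditioned on, leaving only the direct edges.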
Affiliation(s)
- Sumanta Basu
- Department of Statistics, University of California, Berkeley, CA, USA.,Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - William Duren
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Charles R Evans
- Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Charles F Burant
- Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Alla Karnovsky
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
| |
|
18
|
Domingo-Almenara X, Montenegro-Burke JR, Benton HP, Siuzdak G. Annotation: A Computational Solution for Streamlining Metabolomics Analysis. Anal Chem 2018; 90:480-489. [PMID: 29039932 PMCID: PMC5750104 DOI: 10.1021/acs.analchem.7b03929] [Citation(s) in RCA: 105] [Impact Index Per Article: 17.5] [Indexed: 12/17/2022]
Abstract
Metabolite identification is still considered an imposing bottleneck in liquid chromatography mass spectrometry (LC/MS) untargeted metabolomics. The identification workflow usually begins with detecting relevant LC/MS peaks via peak-picking algorithms and retrieving putative identities based on accurate mass searching. However, accurate mass search alone provides poor evidence for metabolite identification. For this reason, computational annotation is used to reveal the underlying metabolites' monoisotopic masses, improving putative identification in addition to confirmation with tandem mass spectrometry. This review examines LC/MS data from a computational and analytical perspective, focusing on the occurrence of neutral losses and in-source fragments, to understand the challenges in computational annotation methodologies. Herein, we examine the state-of-the-art strategies for computational annotation including: (i) peak grouping or full scan (MS1) pseudo-spectra extraction, i.e., clustering all mass spectral signals stemming from each metabolite; (ii) annotation using ion adduction and mass distance among ion peaks; (iii) incorporation of biological knowledge such as biotransformations or pathways; (iv) tandem MS data; and (v) metabolite retention time calibration, usually achieved by prediction from molecular descriptors. Advantages and pitfalls of each of these strategies are discussed, as well as expected future trends in computational annotation.
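Strategy (ii), annotation from ion adduction and mass distances, can be sketched in a few lines of Python. The idea is that two co-eluting peaks are likely adducts of the same metabolite if they imply the same neutral monoisotopic mass under different adduct hypotheses. The adduct table below holds commonly quoted approximate positive-mode shifts; the function names and the 0.005 Da tolerance are assumptions, and the sketch presumes the input peaks already belong to one pseudo-spectrum.

```python
# Approximate m/z shifts for singly charged positive-mode adducts (Da).
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def neutral_mass_candidates(mz):
    """Neutral mass implied by `mz` under each adduct hypothesis."""
    return {adduct: mz - shift for adduct, shift in ADDUCT_SHIFTS.items()}


def pair_adducts(mz_values, tol=0.005):
    """Find peak pairs that agree on one neutral mass under different adducts.

    Returns tuples (i, adduct_i, j, adduct_j, neutral_mass).
    """
    hits = []
    for i in range(len(mz_values)):
        for j in range(i + 1, len(mz_values)):
            cand_i = neutral_mass_candidates(mz_values[i])
            cand_j = neutral_mass_candidates(mz_values[j])
            for adduct_i, mass_i in cand_i.items():
                for adduct_j, mass_j in cand_j.items():
                    if adduct_i != adduct_j and abs(mass_i - mass_j) <= tol:
                        neutral = round((mass_i + mass_j) / 2, 4)
                        hits.append((i, adduct_i, j, adduct_j, neutral))
    return hits


if __name__ == "__main__":
    # Peaks consistent with the [M+H]+ and [M+Na]+ ions of a hexose
    # (neutral mass about 180.0634 Da), plus one unrelated peak.
    print(pair_adducts([181.0707, 203.0526, 250.0000]))
```

A consistent pair narrows the accurate-mass search to a single neutral mass, which is the improvement over searching each m/z independently that the review describes.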
Affiliation(s)
- Xavier Domingo-Almenara
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - J Rafael Montenegro-Burke
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - H Paul Benton
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - Gary Siuzdak
- Scripps Center for Metabolomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| |
|
19
|
Godzien J, Gil de la Fuente A, Otero A, Barbas C. Metabolite Annotation and Identification. COMPREHENSIVE ANALYTICAL CHEMISTRY 2018. [DOI: 10.1016/bs.coac.2018.07.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022]
|
20
|
Bowler RP, Wendt CH, Fessler MB, Foster MW, Kelly RS, Lasky-Su J, Rogers AJ, Stringer KA, Winston BW. New Strategies and Challenges in Lung Proteomics and Metabolomics. An Official American Thoracic Society Workshop Report. Ann Am Thorac Soc 2017; 14:1721-1743. [PMID: 29192815 PMCID: PMC5946579 DOI: 10.1513/annalsats.201710-770ws] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Indexed: 12/16/2022] Open
Abstract
This document presents the proceedings from the workshop entitled, "New Strategies and Challenges in Lung Proteomics and Metabolomics" held February 4th-5th, 2016, in Denver, Colorado. It was sponsored by the National Heart Lung Blood Institute, the American Thoracic Society, the Colorado Biological Mass Spectrometry Society, and National Jewish Health. The goal of this workshop was to convene, for the first time, relevant experts in lung proteomics and metabolomics to discuss and overcome specific challenges in these fields that are unique to the lung. The main objectives of this workshop were to identify, review, and/or understand: (1) emerging technologies in metabolomics and proteomics as applied to the study of the lung; (2) the unique composition and challenges of lung-specific biological specimens for metabolomic and proteomic analysis; (3) the diverse informatics approaches and databases unique to metabolomics and proteomics, with special emphasis on the lung; (4) integrative platforms across genetic and genomic databases that can be applied to lung-related metabolomic and proteomic studies; and (5) the clinical applications of proteomics and metabolomics. The major findings and conclusions of this workshop are summarized at the end of the report, and outline the progress and challenges that face these rapidly advancing fields.
|
21
|
Guitton Y, Tremblay-Franco M, Le Corguillé G, Martin JF, Pétéra M, Roger-Mele P, Delabrière A, Goulitquer S, Monsoor M, Duperier C, Canlet C, Servien R, Tardivel P, Caron C, Giacomoni F, Thévenot EA. Create, run, share, publish, and reference your LC–MS, FIA–MS, GC–MS, and NMR data analysis workflows with the Workflow4Metabolomics 3.0 Galaxy online infrastructure for metabolomics. Int J Biochem Cell Biol 2017; 93:89-101. [DOI: 10.1016/j.biocel.2017.07.002] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 01/28/2017] [Revised: 06/14/2017] [Accepted: 07/10/2017] [Indexed: 12/11/2022]
|
22
|
Perez de Souza L, Naake T, Tohge T, Fernie AR. From chromatogram to analyte to metabolite. How to pick horses for courses from the massive web resources for mass spectral plant metabolomics. Gigascience 2017; 6:1-20. [PMID: 28520864 PMCID: PMC5499862 DOI: 10.1093/gigascience/gix037] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 02/24/2017] [Revised: 05/08/2017] [Accepted: 05/12/2017] [Indexed: 01/19/2023] Open
Abstract
The grand challenge currently facing metabolomics is the expansion of the coverage of the metabolome from a minor percentage of the metabolic complement of the cell toward the level of coverage afforded by other post-genomic technologies such as transcriptomics and proteomics. In plants, this problem is exacerbated by the sheer diversity of chemicals that constitute the metabolome, with the number of metabolites in the plant kingdom generally considered to be in excess of 200 000. In this review, we focus on web resources that can be exploited in order to improve analyte and ultimately metabolite identification and quantification. There is a wide range of available software that not only aids in this but also in the related area of peak alignment; however, for the uninitiated, choosing which program to use is a daunting task. For this reason, we provide an overview of the pros and cons of the software as well as comments regarding the level of programing skills required to effectively exploit their basic functions. In addition, the torrent of available genome and transcriptome sequences that followed the advent of next-generation sequencing has opened up further valuable resources for metabolite identification. All things considered, we posit that only via a continued communal sharing of information such as that deposited in the databases described within the article are we likely to be able to make significant headway toward improving our coverage of the plant metabolome.
Affiliation(s)
- Leonardo Perez de Souza
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Thomas Naake
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Takayuki Tohge
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| |
|
23
|
Uppal K, Walker DI, Jones DP. xMSannotator: An R Package for Network-Based Annotation of High-Resolution Metabolomics Data. Anal Chem 2017; 89:1063-1067. [PMID: 27977166 DOI: 10.1021/acs.analchem.6b01214] [Citation(s) in RCA: 207] [Impact Index Per Article: 29.6] [Indexed: 12/20/2022]
Abstract
Improved analytical technologies and data extraction algorithms enable detection of >10 000 reproducible signals by liquid chromatography-high-resolution mass spectrometry, creating a bottleneck in chemical identification. In principle, measurement of more than one million chemicals would be possible if algorithms were available to facilitate utilization of the raw mass spectrometry data, especially low-abundance metabolites. Here we describe an automated computational framework to annotate ions for possible chemical identity using a multistage clustering algorithm in which metabolic pathway associations are used along with intensity profiles, retention time characteristics, mass defect, and isotope/adduct patterns. The algorithm uses high-resolution mass spectrometry data for a series of samples with common properties and publicly available chemical, metabolic, and environmental databases to assign confidence levels to annotation results. Evaluation results show that the algorithm achieves an F1-measure of 0.8 for a data set with known targets and is more robust than previously reported results for cases when database size is much greater than the actual number of metabolites. MS/MS evaluation of a set of randomly selected 210 metabolites annotated using xMSannotator in an untargeted metabolomics human data set shows that 80% of features with high or medium confidence scores have ion dissociation patterns consistent with the xMSannotator annotation. The algorithm has been incorporated into an R package, xMSannotator, which includes utilities for querying local or online databases such as ChemSpider, KEGG, HMDB, T3DB, and LipidMaps.
Affiliation(s)
- Karan Uppal
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30308, United States
| | - Douglas I Walker
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30308, United States.,Department of Civil and Environmental Engineering, Tufts University, Medford, Massachusetts 02153, United States
| | - Dean P Jones
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30308, United States
| |
|
24
|
Prediction, Detection, and Validation of Isotope Clusters in Mass Spectrometry Data. Metabolites 2016; 6:metabo6040037. [PMID: 27775610 PMCID: PMC5192443 DOI: 10.3390/metabo6040037] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 08/31/2016] [Revised: 09/29/2016] [Accepted: 10/14/2016] [Indexed: 12/04/2022] Open
Abstract
Mass spectrometry is a key analytical platform for metabolomics. The precise quantification and identification of small molecules is a prerequisite for elucidating metabolism, and the detection, validation, and evaluation of isotope clusters in LC-MS data are important for this task. Here, we present an approach for the improved detection of isotope clusters using chemical prior knowledge and the validation of detected isotope clusters depending on the substance mass using database statistics. We find remarkable improvements regarding the number of detected isotope clusters and are able to predict the correct molecular formula in the top three ranks in 92% of the cases. We make our methodology freely available as part of the Bioconductor packages xcms version 1.50.0 and CAMERA version 1.30.0.
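As a rough illustration of what detecting an isotope cluster involves, the following Python sketch chains peaks separated by the 13C–12C mass difference divided by the charge. It is a simplified stand-in, not the xcms/CAMERA method: intensity-ratio validation and the database statistics described in the abstract are omitted, and the function name, tolerance, and `max_peaks` limit are assumptions.

```python
C13_SPACING = 1.003355  # mass difference between 13C and 12C in Da


def find_isotope_clusters(mz_values, charge=1, tol=0.003, max_peaks=4):
    """Group peaks whose m/z values follow the 13C spacing for `charge`.

    Returns a list of clusters, each a list of m/z values starting at the
    putative monoisotopic peak. Peaks without isotope partners are skipped.
    """
    mzs = sorted(mz_values)
    used = set()
    clusters = []
    for i, mono in enumerate(mzs):
        if i in used:
            continue
        cluster = [mono]
        for k in range(1, max_peaks):
            expected = mono + k * C13_SPACING / charge
            match = next(
                (j for j, mz in enumerate(mzs)
                 if j != i and j not in used and abs(mz - expected) <= tol),
                None,
            )
            if match is None:
                break  # isotope series must be contiguous
            used.add(match)
            cluster.append(mzs[match])
        if len(cluster) > 1:
            used.add(i)
            clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    # Invented peak list: one three-peak cluster and one isolated peak.
    print(find_isotope_clusters([181.0707, 182.0741, 183.0774, 250.1000]))
```

Spacing alone produces false clusters in dense spectra, which is why the approach in the abstract adds chemical prior knowledge and mass-dependent validation on top of this basic step.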
|
25
|
Uppal K, Walker DI, Liu K, Li S, Go YM, Jones DP. Computational Metabolomics: A Framework for the Million Metabolome. Chem Res Toxicol 2016; 29:1956-1975. [PMID: 27629808 DOI: 10.1021/acs.chemrestox.6b00179] [Citation(s) in RCA: 167] [Impact Index Per Article: 20.9] [Indexed: 12/14/2022]
Abstract
"Sola dosis facit venenum." These words of Paracelsus, "the dose makes the poison", can lead to a cavalier attitude concerning potential toxicities of the vast array of low abundance environmental chemicals to which humans are exposed. Exposome research teaches that 80-85% of human disease is linked to environmental exposures. The human exposome is estimated to include >400,000 environmental chemicals, most of which are uncharacterized with regard to human health. In fact, mass spectrometry measures >200,000 m/z features (ions) in microliter volumes derived from human samples; most are unidentified. This crystallizes a grand challenge for chemical research in toxicology: to develop reliable and affordable analytical methods to understand health impacts of the extensive human chemical experience. To this end, there appears to be no choice but to abandon the limitations of measuring one chemical at a time. The present review looks at progress in computational metabolomics to provide probability-based annotation linking ions to known chemicals and serve as a foundation for unambiguous designation of unidentified ions for toxicologic study. We review methods to characterize ions in terms of accurate mass m/z, chromatographic retention time, correlation of adduct, isotopic and fragment forms, association with metabolic pathways and measurement of collision-induced dissociation products, collision cross section, and chirality. Such information can support a largely unambiguous system for documenting unidentified ions in environmental surveillance and human biomonitoring. Assembly of this data would provide a resource to characterize and understand health risks of the array of low-abundance chemicals to which humans are exposed.
Affiliation(s)
- Karan Uppal
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30322, United States
| | - Douglas I Walker
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30322, United States.,Hercules Exposome Research Center, Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, United States.,Department of Civil and Environmental Engineering, Tufts University, Medford, Massachusetts 02155, United States
| | - Ken Liu
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30322, United States
| | - Shuzhao Li
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30322, United States.,Hercules Exposome Research Center, Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, United States
| | - Young-Mi Go
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30322, United States
| | - Dean P Jones
- Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, Georgia 30322, United States.,Hercules Exposome Research Center, Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, United States
| |
Collapse
|
26
|
Rinaudo P, Boudah S, Junot C, Thévenot EA. biosigner: A New Method for the Discovery of Significant Molecular Signatures from Omics Data. Front Mol Biosci 2016; 3:26. [PMID: 27446929 PMCID: PMC4914951 DOI: 10.3389/fmolb.2016.00026] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 12/21/2016] [Accepted: 06/03/2016] [Indexed: 01/02/2023] Open
Abstract
High-throughput technologies such as transcriptomics, proteomics, and metabolomics show great promise for the discovery of biomarkers for diagnosis and prognosis. Selection of the most promising candidates between the initial untargeted step and the subsequent validation phases is critical within the pipeline leading to clinical tests. Several statistical and data mining methods have been described for feature selection: in particular, wrapper approaches iteratively assess the performance of the classifier on distinct subsets of variables. Current wrappers, however, do not estimate the significance of the selected features. We therefore developed a new methodology to find the smallest feature subset which significantly contributes to the model performance, by using a combination of resampling, ranking of variable importance, significance assessment by permutation of the feature values in the test subsets, and half-interval search. We wrapped our biosigner algorithm around three reference binary classifiers (Partial Least Squares—Discriminant Analysis, Random Forest, and Support Vector Machines) which have been shown to achieve specific performances depending on the structure of the dataset. By using three real biological and clinical metabolomics and transcriptomics datasets (containing up to 7000 features), complementary signatures were obtained in a few minutes, generally providing higher prediction accuracies than the initial full model. Comparison with alternative feature selection approaches further indicated that our method provides signatures of restricted size and high stability. Finally, by using our methodology to seek metabolites discriminating type 1 from type 2 diabetic patients, several features were selected, including a fragment from the taurochenodeoxycholic bile acid. 
Our methodology, implemented in the biosigner R/Bioconductor package and Galaxy/Workflow4metabolomics module, should be of interest for both experimenters and statisticians to identify robust molecular signatures from large omics datasets in the process of developing new diagnostics.
Affiliation(s)
- Philippe Rinaudo
  - CEA, LIST, Laboratory for Data Analysis and Systems' Intelligence, MetaboHUB, Gif-sur-Yvette, France
- Samia Boudah
  - Laboratoire d'Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB, CEA-Saclay, Gif-sur-Yvette, France
- Christophe Junot
  - Laboratoire d'Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB, CEA-Saclay, Gif-sur-Yvette, France
- Etienne A Thévenot
  - CEA, LIST, Laboratory for Data Analysis and Systems' Intelligence, MetaboHUB, Gif-sur-Yvette, France
27
Vinaixa M, Schymanski EL, Neumann S, Navarro M, Salek RM, Yanes O. Mass spectral databases for LC/MS- and GC/MS-based metabolomics: State of the field and future prospects. Trends Analyt Chem 2016. [DOI: 10.1016/j.trac.2015.09.005] [Citation(s) in RCA: 325] [Impact Index Per Article: 40.6] [Indexed: 01/22/2023]
28
Trutschel D, Schmidt S, Grosse I, Neumann S. Joint Analysis of Dependent Features within Compound Spectra Can Improve Detection of Differential Features. Front Bioeng Biotechnol 2015; 3:129. [PMID: 26442246 PMCID: PMC4585098 DOI: 10.3389/fbioe.2015.00129] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/03/2015] [Accepted: 08/13/2015] [Indexed: 11/13/2022] Open
Abstract
Mass spectrometry is an important analytical technology in metabolomics. After the initial feature detection and alignment steps, the raw data processing results in a high-dimensional data matrix of mass spectral features, which is then subjected to further statistical analysis. Univariate tests like Student's t-test and analysis of variance (ANOVA) are hypothesis tests that aim to detect differences between two or more sample classes, e.g., wild type versus mutant or between different doses of treatments. In both cases, one of the underlying assumptions is the independence between metabolic features. However, in mass spectrometry, a single metabolite usually gives rise to several mass spectral features, which are observed together and show a common behavior. This paper suggests grouping the related features of metabolites into compound spectra with CAMERA, and then using a multivariate statistical method to test whether a compound spectrum (and thus the actual metabolite) is differential between two sample classes. The multivariate method is first demonstrated with an analysis between the wild type and an over-expression line of the model plant Arabidopsis thaliana. For a quantitative evaluation, data sets with a simulated known effect between two sample classes were analyzed. The spectra-wise analysis showed better detection results for all simulated effects.
Affiliation(s)
- Diana Trutschel
  - Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
  - Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany
- Stephan Schmidt
  - Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
- Ivo Grosse
  - Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Steffen Neumann
  - Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
29
Thévenot EA, Roux A, Xu Y, Ezan E, Junot C. Analysis of the Human Adult Urinary Metabolome Variations with Age, Body Mass Index, and Gender by Implementing a Comprehensive Workflow for Univariate and OPLS Statistical Analyses. J Proteome Res 2015; 14:3322-35. [PMID: 26088811 DOI: 10.1021/acs.jproteome.5b00354] [Citation(s) in RCA: 716] [Impact Index Per Article: 79.6] [Indexed: 12/29/2022]
Abstract
Urine metabolomics is widely used for biomarker research in the fields of medicine and toxicology. As a consequence, characterization of the variations of the urine metabolome under basal conditions becomes critical in order to avoid confounding effects in cohort studies. Such physiological information is however very scarce in the literature and in metabolomics databases so far. Here we studied the influence of age, body mass index (BMI), and gender on metabolite concentrations in a large cohort of 183 adults by using liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS). We implemented a comprehensive statistical workflow for univariate hypothesis testing and modeling by orthogonal partial least-squares (OPLS), which we made available to the metabolomics community within the online Workflow4Metabolomics.org resource. We found 108 urine metabolites displaying concentration variations with either age, BMI, or gender, by integrating the results from univariate p-values and multivariate variable importance in projection (VIP). Several metabolite clusters were further evidenced by correlation analysis, and they allowed stratification of the cohort. In conclusion, our study highlights the impact of gender and age on the urinary metabolome, and thus it indicates that these factors should be taken into account for the design of metabolomics studies.
Affiliation(s)
- Etienne A Thévenot
  - CEA, LIST, Laboratory for Data Analysis and Smart Systems, MetaboHUB Paris, F-91191 Gif-sur-Yvette, France
- Aurélie Roux
  - Laboratoire d'Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB Paris, CEA-Saclay, Gif-Sur-Yvette, France
- Ying Xu
  - Laboratoire d'Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB Paris, CEA-Saclay, Gif-Sur-Yvette, France
- Eric Ezan
  - Laboratoire d'Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB Paris, CEA-Saclay, Gif-Sur-Yvette, France
- Christophe Junot
  - Laboratoire d'Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB Paris, CEA-Saclay, Gif-Sur-Yvette, France
30
Uppal K, Soltow QA, Promislow DEL, Wachtman LM, Quyyumi AA, Jones DP. MetabNet: An R Package for Metabolic Association Analysis of High-Resolution Metabolomics Data. Front Bioeng Biotechnol 2015; 3:87. [PMID: 26125020 PMCID: PMC4464066 DOI: 10.3389/fbioe.2015.00087] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 01/13/2015] [Accepted: 05/27/2015] [Indexed: 01/20/2023] Open
Abstract
Liquid-chromatography high-resolution mass spectrometry provides capability to measure >40,000 ions derived from metabolites in biologic samples. This presents challenges to confirm identities of known chemicals and delineate potential metabolic pathway associations of unidentified chemicals. We provide an R package for metabolic network analysis, MetabNet, to perform targeted metabolome-wide association study of specific metabolites to facilitate detection of their related metabolic pathways and network structures.
Affiliation(s)
- Karan Uppal
  - Division of Pulmonary Medicine, Department of Medicine, Emory University, Atlanta, GA, USA
- Quinlyn A Soltow
  - Division of Pulmonary Medicine, Department of Medicine, Emory University, Atlanta, GA, USA
- Lynn M Wachtman
  - New England Primate Research Center, Harvard University, Southborough, MA, USA
- Arshed Ali Quyyumi
  - Division of Cardiology, Department of Medicine, Emory University, Atlanta, GA, USA
- Dean P Jones
  - Division of Pulmonary Medicine, Department of Medicine, Emory University, Atlanta, GA, USA
31
Alonso A, Marsal S, Julià A. Analytical methods in untargeted metabolomics: state of the art in 2015. Front Bioeng Biotechnol 2015; 3:23. [PMID: 25798438 PMCID: PMC4350445 DOI: 10.3389/fbioe.2015.00023] [Citation(s) in RCA: 388] [Impact Index Per Article: 43.1] [Received: 12/15/2014] [Accepted: 02/18/2015] [Indexed: 12/20/2022] Open
Abstract
Metabolomics comprises the methods and techniques that are used to measure the small molecule composition of biofluids and tissues, and is currently one of the most rapidly evolving research fields. The determination of the metabolomic profile (the metabolome) has multiple applications in many biological sciences, including the development of new diagnostic tools in medicine. Recent technological advances in nuclear magnetic resonance and mass spectrometry are significantly improving our capacity to obtain more data from each biological sample. Consequently, there is a need for fast and accurate statistical and bioinformatic tools that can deal with the complexity and volume of the data generated in metabolomic studies. In this review, we provide an update of the most commonly used analytical methods in metabolomics, starting from raw data processing and ending with pathway analysis and biomarker identification. Finally, the integration of metabolomic profiles with molecular data from other high-throughput biotechnologies is also reviewed.
Affiliation(s)
- Arnald Alonso
  - Rheumatology Research Group, Vall d’Hebron Research Institute, Barcelona, Spain
  - Department of Automatic Control (ESAII), Polytechnic University of Catalonia, Barcelona, Spain
- Sara Marsal
  - Rheumatology Research Group, Vall d’Hebron Research Institute, Barcelona, Spain
- Antonio Julià
  - Rheumatology Research Group, Vall d’Hebron Research Institute, Barcelona, Spain
32
Dhanasekaran AR, Pearson JL, Ganesan B, Weimer BC. Metabolome searcher: a high throughput tool for metabolite identification and metabolic pathway mapping directly from mass spectrometry and using genome restriction. BMC Bioinformatics 2015; 16:62. [PMID: 25887958 PMCID: PMC4347650 DOI: 10.1186/s12859-015-0462-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/04/2014] [Accepted: 01/13/2015] [Indexed: 01/19/2023] Open
Abstract
Background: Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism’s genome as a database restricts metabolite identification to only those compounds that the organism can produce.
Results: To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment’s MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis.
Conclusions: Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
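The kind of query described in this abstract (a compound mass plus user-set mass deviation and adducts, searched against a restricted compound list) can be illustrated with a minimal Python sketch. It is not the Metabolome Searcher implementation: the function name `search_mz`, the three-compound table, the adduct table, and the 5 ppm default are all assumptions chosen for illustration.

```python
# Hypothetical sketch: match an observed m/z against a small, restricted
# compound list, allowing for adducts and a ppm mass-deviation window.
# None of these names come from Metabolome Searcher itself.

PROTON = 1.007276  # proton mass in Da

# Adduct label -> shift added to the neutral monoisotopic mass (singly charged).
ADDUCT_SHIFTS = {
    "[M+H]+": PROTON,
    "[M+Na]+": 22.989218,
    "[M-H]-": -PROTON,
}

# Illustrative stand-in for a genome-restricted compound table
# (monoisotopic neutral masses in Da).
COMPOUNDS = {
    "glucose": 180.063388,
    "methionine": 149.051050,
    "citrate": 192.027003,
}


def ppm_error(observed, theoretical):
    """Signed mass deviation of the observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_mz(mz, adducts, tol_ppm=5.0, compounds=COMPOUNDS):
    """Return (compound, adduct, ppm error) for every candidate within tolerance."""
    hits = []
    for name, mass in compounds.items():
        for adduct in adducts:
            theoretical = mass + ADDUCT_SHIFTS[adduct]
            err = ppm_error(mz, theoretical)
            if abs(err) <= tol_ppm:
                hits.append((name, adduct, err))
    return hits


if __name__ == "__main__":
    for hit in search_mz(181.0707, ["[M+H]+", "[M+Na]+"]):
        print(hit)
```

Restricting `compounds` to what one organism can produce shortens the candidate list for each m/z, which is the idea the abstract describes; the tolerance and adduct set would be chosen to match the instrument and detection mode.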
Affiliation(s)
- A Ranjitha Dhanasekaran
  - Center for Integrated BioSystems, Computer Science Department, Utah State University, Logan, 84322-8700, USA
  - Linda Crnic Institute for Down Syndrome, Department of Pediatrics, School of Medicine, University of Colorado Denver, 12700 E 19th Avenue, Aurora, CO, 80045, USA
- Jon L Pearson
  - Center for Integrated BioSystems, Computer Science Department, Utah State University, Logan, 84322-8700, USA
  - Spillman Technologies, 4625 West Lake Park Blvd, Salt Lake City, UT, 84120, USA
- Balasubramanian Ganesan
  - Center for Integrated BioSystems, Computer Science Department, Utah State University, Logan, 84322-8700, USA
  - Western Dairy Center, Department of Nutrition, Dietetics, and Food Sciences, Utah State University, Logan, 84322-8700, USA
- Bart C Weimer
  - University of California, Davis, School of Veterinary Medicine, 1089 Veterinary Medicine Dr., VM3B, Room 4023, Davis, CA, 95616, USA
33
Lynn KS, Cheng ML, Chen YR, Hsu C, Chen A, Lih TM, Chang HY, Huang CJ, Shiao MS, Pan WH, Sung TY, Hsu WL. Metabolite Identification for Mass Spectrometry-Based Metabolomics Using Multiple Types of Correlated Ion Information. Anal Chem 2015; 87:2143-51. [DOI: 10.1021/ac503325c] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Indexed: 01/27/2023]
Affiliation(s)
- Ke-Shiuan Lynn
  - Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Mei-Ling Cheng
  - Department of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
- Yet-Ran Chen
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
- Chin Hsu
  - Department of Exercise Health Science, National Taiwan University of Physical Education and Sport, Taichung, Taiwan
- Ann Chen
  - Department of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
- T. Mamie Lih
  - Bioinformatics Program, TIGP, Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Hui-Yin Chang
  - Bioinformatics Program, TIGP, Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Ching-jang Huang
  - Department of Biochemical Science and Technology, National Taiwan University, Taipei, Taiwan
- Ming-Shi Shiao
  - Department of Biomedical Sciences, Chang Gung University, Taoyuan, Taiwan
- Wen-Harn Pan
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Ting-Yi Sung
  - Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Wen-Lian Hsu
  - Institute of Information Science, Academia Sinica, Taipei, Taiwan
34
Roux A, Thévenot EA, Seguin F, Olivier MF, Junot C. Impact of collection conditions on the metabolite content of human urine samples as analyzed by liquid chromatography coupled to mass spectrometry and nuclear magnetic resonance spectroscopy. Metabolomics 2015; 11:1095-1105. [PMID: 26366133 PMCID: PMC4559108 DOI: 10.1007/s11306-014-0764-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 08/08/2014] [Accepted: 12/13/2014] [Indexed: 12/31/2022]
Abstract
There is a lack of comprehensive studies documenting the impact of sample collection conditions on metabolic composition of human urine. To address this issue, two experiments were performed at a 3-month interval, in which midstream urine samples from healthy individuals were collected, pooled, divided into several aliquots and kept under specific conditions (room temperature, 4 °C, with or without preservative) up to 72 h before storage at -80 °C. Samples were analyzed by high-performance liquid chromatography coupled to high-resolution mass spectrometry and bacterial contamination was monitored by turbidimetry. Multivariate analyses showed that urinary metabolic fingerprints were affected by the presence of preservatives and also by storage at room temperature from 24 to 72 h, whereas no change was observed for urine samples stored at 4 °C over a 72-h period. Investigations were then focused on 280 metabolites previously identified in urine: 19 of them were impacted by the kind of sample collection protocol in both experiments, including 12 metabolites affected by bacterial contamination and 7 exhibiting poor chemical stability. Finally, our results emphasize that the use of preservative prevents bacterial overgrowth, but does not avoid metabolite instability in solution, whereas storage at 4 °C inhibits bacterial overgrowth at least over a 72-h period and slows the chemical degradation process. Consequently, and for further LC/MS analyses, human urine samples should be kept at 4 °C if their collection is performed over 24 h.
Affiliation(s)
- Aurélie Roux
  - Laboratoire d’Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB Paris, CEA - Centre d’Etude de Saclay, 91191 Gif-Sur-Yvette, France
- Etienne A. Thévenot
  - CEA, LIST, Laboratory for Data Analysis and Smart Systems, MetaboHUB Paris, 91191 Gif-Sur-Yvette, France
- François Seguin
  - INSERM U1082, Université de Poitiers, Hôpital La Milêtrie, Poitiers, France
- Marie-Françoise Olivier
  - Laboratoire d’Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB Paris, CEA - Centre d’Etude de Saclay, 91191 Gif-Sur-Yvette, France
  - Laboratoire de Recherche sur la Transcription et la Réparation des Cellules Souches, DSV/IRCM, CEA, Fontenay-Aux-Roses, 92265 France
- Christophe Junot
  - Laboratoire d’Etude du Métabolisme des Médicaments, DSV/iBiTec-S/SPI, MetaboHUB Paris, CEA - Centre d’Etude de Saclay, 91191 Gif-Sur-Yvette, France
35
Mahieu NG, Huang X, Chen YJ, Patti GJ. Credentialing features: a platform to benchmark and optimize untargeted metabolomic methods. Anal Chem 2014; 86:9583-9. [PMID: 25160088 PMCID: PMC4188275 DOI: 10.1021/ac503092d] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Indexed: 01/01/2023]
Abstract
The aim of untargeted metabolomics is to profile as many metabolites as possible, yet a major challenge is comparing experimental method performance on the basis of metabolome coverage. To date, most published approaches have compared experimental methods by counting the total number of features detected. Due to artifactual interference, however, this number is highly variable and therefore is a poor metric for comparing metabolomic methods. Here we introduce an alternative approach to benchmarking metabolome coverage which relies on mixed Escherichia coli extracts from cells cultured in regular and 13C-enriched media. After mass spectrometry-based metabolomic analysis of these extracts, we “credential” features arising from E. coli metabolites on the basis of isotope spacing and intensity. This credentialing platform enables us to accurately compare the number of nonartifactual features yielded by different experimental approaches. We highlight the value of our platform by reoptimizing a published untargeted metabolomic method for XCMS data processing. Compared to the published parameters, the new XCMS parameters decrease the total number of features by 15% (a reduction in noise features) while increasing the number of true metabolites detected and grouped by 20%. Our credentialing platform relies on easily generated E. coli samples and a simple software algorithm that is freely available on our laboratory Web site (http://pattilab.wustl.edu/software/credential/). We have validated the credentialing platform with reversed-phase and hydrophilic interaction liquid chromatography as well as Agilent, Thermo Scientific, AB SCIEX, and LECO mass spectrometers. Thus, the credentialing platform can readily be applied by any laboratory to optimize their untargeted metabolomic pipeline for metabolite extraction, chromatographic separation, mass spectrometric detection, and bioinformatic processing.
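To make "isotope spacing and intensity" concrete, the following Python sketch pairs features whose m/z difference is a whole number of 13C-12C mass shifts and whose intensity ratio falls in an accepted window. It is only an illustration under stated assumptions, not the authors' credentialing software: the function name `credential_pairs`, the 0.005 Da tolerance, the 0.5 to 2.0 ratio window, and the carbon-count plausibility check are all hypothetical choices.

```python
# Hypothetical sketch of isotope-spacing credentialing. In a mixed
# 12C/13C extract, a biological metabolite with n carbons is expected to
# appear as an unlabeled peak and a fully labeled peak n * 1.003355 Da
# apart (for charge 1). Thresholds here are illustrative assumptions.

C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C in Da


def credential_pairs(features, mz_tol=0.005, ratio_range=(0.5, 2.0), charge=1):
    """Pair (mz, intensity) features that look like 12C/13C partners.

    Returns a list of (mz_light, mz_heavy, n_carbons) tuples.
    """
    pairs = []
    feats = sorted(features)  # ascending m/z
    for i, (mz_light, int_light) in enumerate(feats):
        for mz_heavy, int_heavy in feats[i + 1:]:
            spacing = (mz_heavy - mz_light) * charge
            n_carbons = round(spacing / C13_C12_DELTA)
            # The carbon count must be positive and cannot outweigh the ion.
            if n_carbons < 1 or n_carbons * 12.0 > mz_light * charge:
                continue
            # Spacing must be a whole number of 13C-12C shifts within tolerance.
            if abs(spacing - n_carbons * C13_C12_DELTA) > mz_tol:
                continue
            # Intensity ratio must be compatible with the assumed mixing ratio.
            ratio = int_heavy / int_light
            if ratio_range[0] <= ratio <= ratio_range[1]:
                pairs.append((mz_light, mz_heavy, n_carbons))
    return pairs


if __name__ == "__main__":
    demo = [(181.0707, 1000.0), (187.0908, 900.0), (200.1234, 500.0)]
    print(credential_pairs(demo))
```

Features that find no partner under these rules would be treated as uncredentialed (likely artifactual) in this sketch; a real workflow would also require the partners to co-elute, which is omitted here for brevity.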
Affiliation(s)
- Nathaniel Guy Mahieu
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
36
Cho K, Evans BS, Wood BM, Kumar R, Erb TJ, Warlick BP, Gerlt JA, Sweedler JV. Integration of untargeted metabolomics with transcriptomics reveals active metabolic pathways. Metabolomics 2014; 2014. [PMID: 25705145 PMCID: PMC4334135 DOI: 10.1007/s11306-014-0713-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
While recent advances in metabolomic measurement technologies have been dramatic, extracting biological insight from complex metabolite profiles remains a challenge. We present an analytical strategy that uses data obtained from high resolution liquid chromatography-mass spectrometry and a bioinformatics toolset for detecting actively changing metabolic pathways upon external perturbation. We begin with untargeted metabolite profiling to nominate altered metabolites and identify pathway candidates, followed by validation of those pathways with transcriptomics. Using the model organisms Rhodospirillum rubrum and Bacillus subtilis, our results reveal metabolic pathways that are interconnected with methionine salvage. The rubrum-type methionine salvage pathway is interconnected with the active methyl cycle in which re-methylation, a key reaction for recycling methionine from homocysteine, is unexpectedly suppressed; instead, homocysteine is catabolized by the transsulfuration pathway. Notably, the non-mevalonate pathway is repressed, whereas the rubrum-type methionine salvage pathway contributes to isoprenoid biosynthesis upon 5'-methylthioadenosine feeding. In this process, glutathione functions as a coenzyme in vivo when 1-methylthio-d-xylulose 5-phosphate (MTXu 5-P) methylsulfurylase catalyzes dethiomethylation of MTXu 5-P. These results clearly show that our analytical approach enables unexpected metabolic pathways to be uncovered.
Affiliation(s)
- Kyuil Cho
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
- Bradley S. Evans
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
- B. McKay Wood
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Ritesh Kumar
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Tobias J. Erb
  - Institute for Microbiology, Swiss Federal Institute of Technology (ETH) Zurich, CH-8093 Zurich, Switzerland
- Benjamin P. Warlick
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- John A. Gerlt
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
- Jonathan V. Sweedler
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
  - Department of Chemistry, University of Illinois at Urbana-Champaign, 600 S. Mathews Ave., Urbana, IL 61801 USA
37
Fernández-Albert F, Llorach R, Andrés-Lacueva C, Perera A. An R package to analyse LC/MS metabolomic data: MAIT (Metabolite Automatic Identification Toolkit). Bioinformatics 2014; 30:1937-9. [PMID: 24642061 PMCID: PMC4071204 DOI: 10.1093/bioinformatics/btu136] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Indexed: 12/23/2022]
Abstract
Summary: Current tools for liquid chromatography and mass spectrometry for metabolomic data cover a limited number of processing steps, whereas online tools are hard to use in a programmable fashion. This article introduces the Metabolite Automatic Identification Toolkit (MAIT) package, which makes it possible for users to perform metabolomic end-to-end liquid chromatography and mass spectrometry data analysis. MAIT is focused on improving the peak annotation stage and provides essential tools to validate statistical analysis results. MAIT generates output files with the statistical results, peak annotation and metabolite identification. Availability and implementation: http://b2slab.upc.edu/software-and-downloads/metabolite-automatic-identification-toolkit/. Contact: francesc.fernandez.albert@upc.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Francesc Fernández-Albert
  - B2SLab., Department d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Pau Gargallo, 5, 08028 Barcelona, Biomarkers & Nutrimetabolomic Lab., Department of Nutrition and Food Science-XaRTA, INSA, Faculty of Pharmacy, Food and Nutrition Torribera Campus, University of Barcelona, Av. Prat de la Riba 171, 08921, Sta Coloma de Gramenet, and INGENIO-CONSOLIDER Program, FUN-C-Food CSD2007-063, Av Joan XXIII s/n 08028, Barcelona, Spain
- Rafael Llorach
  - B2SLab., Department d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Pau Gargallo, 5, 08028 Barcelona, Biomarkers & Nutrimetabolomic Lab., Department of Nutrition and Food Science-XaRTA, INSA, Faculty of Pharmacy, Food and Nutrition Torribera Campus, University of Barcelona, Av. Prat de la Riba 171, 08921, Sta Coloma de Gramenet, and INGENIO-CONSOLIDER Program, FUN-C-Food CSD2007-063, Av Joan XXIII s/n 08028, Barcelona, Spain
- Cristina Andrés-Lacueva
  - B2SLab., Department d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Pau Gargallo, 5, 08028 Barcelona, Biomarkers & Nutrimetabolomic Lab., Department of Nutrition and Food Science-XaRTA, INSA, Faculty of Pharmacy, Food and Nutrition Torribera Campus, University of Barcelona, Av. Prat de la Riba 171, 08921, Sta Coloma de Gramenet, and INGENIO-CONSOLIDER Program, FUN-C-Food CSD2007-063, Av Joan XXIII s/n 08028, Barcelona, Spain
- Alexandre Perera
  - B2SLab., Department d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Pau Gargallo, 5, 08028 Barcelona, Biomarkers & Nutrimetabolomic Lab., Department of Nutrition and Food Science-XaRTA, INSA, Faculty of Pharmacy, Food and Nutrition Torribera Campus, University of Barcelona, Av. Prat de la Riba 171, 08921, Sta Coloma de Gramenet, and INGENIO-CONSOLIDER Program, FUN-C-Food CSD2007-063, Av Joan XXIII s/n 08028, Barcelona, Spain
38
Scheubert K, Hufsky F, Böcker S. Computational mass spectrometry for small molecules. J Cheminform 2013; 5:12. [PMID: 23453222 PMCID: PMC3648359 DOI: 10.1186/1758-2946-5-12] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.8] [Received: 11/23/2012] [Accepted: 02/01/2013] [Indexed: 12/29/2022] Open
Abstract
The identification of small molecules from mass spectrometry (MS) data remains a major challenge in the interpretation of MS data. This review covers the computational aspects of identifying small molecules, from the identification of a compound searching a reference spectral library, to the structural elucidation of unknowns. In detail, we describe the basic principles and pitfalls of searching mass spectral reference libraries. Determining the molecular formula of the compound can serve as a basis for subsequent structural elucidation; consequently, we cover different methods for molecular formula identification, focussing on isotope pattern analysis. We then discuss automated methods to deal with mass spectra of compounds that are not present in spectral libraries, and provide an insight into de novo analysis of fragmentation spectra using fragmentation trees. In addition, this review shortly covers the reconstruction of metabolic networks using MS data. Finally, we list available software for different steps of the analysis pipeline.
Affiliation(s)
- Kerstin Scheubert
- Chair of Bioinformatics, Friedrich Schiller University, Ernst-Abbe-Platz 2, Jena, Germany.
|
39
|
A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data. Metabolites 2012; 2:775-95. [PMID: 24957762 PMCID: PMC3901240 DOI: 10.3390/metabo2040775] [Citation(s) in RCA: 182] [Impact Index Per Article: 15.2] [Received: 08/02/2012] [Revised: 10/02/2012] [Accepted: 10/10/2012] [Indexed: 11/17/2022] Open
Abstract
Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homoscedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples.
|
40
|
Valdés A, Simó C, Ibáñez C, Rocamora-Reverte L, Ferragut JA, García-Cañas V, Cifuentes A. Effect of dietary polyphenols on K562 leukemia cells: A Foodomics approach. Electrophoresis 2012; 33:2314-27. [DOI: 10.1002/elps.201200133] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Indexed: 12/24/2022]
Affiliation(s)
- Clara Ibáñez
- Laboratory of Foodomics; CIAL (CSIC); Madrid; Spain
- Lourdes Rocamora-Reverte
- Institute of Molecular and Cellular Biology; Miguel Hernández University; Elche, Alicante; Spain
- José Antonio Ferragut
- Institute of Molecular and Cellular Biology; Miguel Hernández University; Elche, Alicante; Spain
|
41
|
Ibáñez C, Simó C, García-Cañas V, Gómez-Martínez Á, Ferragut JA, Cifuentes A. CE/LC-MS multiplatform for broad metabolomic analysis of dietary polyphenols effect on colon cancer cells proliferation. Electrophoresis 2012; 33:2328-36. [DOI: 10.1002/elps.201200143] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Indexed: 12/15/2022]
Affiliation(s)
- Clara Ibáñez
- Laboratory of Foodomics; CIAL (CSIC); Madrid; Spain
- Ángeles Gómez-Martínez
- Institute of Molecular and Cellular Biology; Miguel Hernández University; Avda. Universidad s/n; Elche; Alicante; Spain
- José A. Ferragut
- Institute of Molecular and Cellular Biology; Miguel Hernández University; Avda. Universidad s/n; Elche; Alicante; Spain
|
42
|
Kuhl C, Tautenhahn R, Böttcher C, Larson TR, Neumann S. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. Anal Chem 2011; 84:283-9. [PMID: 22111785 DOI: 10.1021/ac202450g] [Citation(s) in RCA: 728] [Impact Index Per Article: 56.0] [Indexed: 02/08/2023]
Abstract
Liquid chromatography coupled to mass spectrometry is routinely used for metabolomics experiments. In contrast to the fairly routine and automated data acquisition steps, subsequent compound annotation and identification require extensive manual analysis and thus form a major bottleneck in data interpretation. Here we present CAMERA, a Bioconductor package integrating algorithms to extract compound spectra, annotate isotope and adduct peaks, and propose the accurate compound mass even in highly complex data. To evaluate the algorithms, we compared the annotation of CAMERA against a manually defined annotation for a mixture of known compounds spiked into a complex matrix at different concentrations. CAMERA successfully extracted accurate masses for 89.7% and 90.3% of the annotatable compounds in positive and negative ion modes, respectively. Furthermore, we present a novel annotation approach that combines spectral information of data acquired in opposite ion modes to further improve the annotation rate. We demonstrate the utility of CAMERA in two different, easily adoptable plant metabolomics experiments, where the application of CAMERA drastically reduced the amount of manual analysis.
Affiliation(s)
- Carsten Kuhl
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle (Saale), Germany.
|