1. Gaudêncio SP, Bayram E, Lukić Bilela L, Cueto M, Díaz-Marrero AR, Haznedaroglu BZ, Jimenez C, Mandalakis M, Pereira F, Reyes F, Tasdemir D. Advanced Methods for Natural Products Discovery: Bioactivity Screening, Dereplication, Metabolomics Profiling, Genomic Sequencing, Databases and Informatic Tools, and Structure Elucidation. Mar Drugs 2023; 21:md21050308. [PMID: 37233502 DOI: 10.3390/md21050308] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/23/2023] [Revised: 05/11/2023] [Accepted: 05/12/2023] [Indexed: 05/27/2023] Open
Abstract
Natural Products (NP) are essential for the discovery of novel drugs and products for numerous biotechnological applications. The NP discovery process is expensive and time-consuming, having as major hurdles dereplication (early identification of known compounds) and structure elucidation, particularly the determination of the absolute configuration of metabolites with stereogenic centers. This review comprehensively focuses on recent technological and instrumental advances, highlighting the development of methods that alleviate these obstacles, paving the way for accelerating NP discovery towards biotechnological applications. Herein, we emphasize the most innovative high-throughput tools and methods for advancing bioactivity screening, NP chemical analysis, dereplication, metabolite profiling, metabolomics, genome sequencing and/or genomics approaches, databases, bioinformatics, chemoinformatics, and three-dimensional NP structure elucidation.
Affiliation(s)
- Susana P Gaudêncio
  - Associate Laboratory i4HB-Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, 2819-516 Caparica, Portugal
  - UCIBIO-Applied Molecular Biosciences Unit, Chemistry Department, NOVA School of Science and Technology, NOVA University of Lisbon, 2819-516 Caparica, Portugal
- Engin Bayram
  - Institute of Environmental Sciences, Room HKC-202, Hisar Campus, Bogazici University, Bebek, Istanbul 34342, Turkey
- Lada Lukić Bilela
  - Department of Biology, Faculty of Science, University of Sarajevo, 71000 Sarajevo, Bosnia and Herzegovina
- Mercedes Cueto
  - Instituto de Productos Naturales y Agrobiología-CSIC, 38206 La Laguna, Spain
- Ana R Díaz-Marrero
  - Instituto de Productos Naturales y Agrobiología-CSIC, 38206 La Laguna, Spain
  - Instituto Universitario de Bio-Orgánica (IUBO), Universidad de La Laguna, 38206 La Laguna, Spain
- Berat Z Haznedaroglu
  - Institute of Environmental Sciences, Room HKC-202, Hisar Campus, Bogazici University, Bebek, Istanbul 34342, Turkey
- Carlos Jimenez
  - CICA- Centro Interdisciplinar de Química e Bioloxía, Departamento de Química, Facultade de Ciencias, Universidade da Coruña, 15071 A Coruña, Spain
- Manolis Mandalakis
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, HCMR Thalassocosmos, 71500 Gournes, Crete, Greece
- Florbela Pereira
  - LAQV, REQUIMTE, Chemistry Department, NOVA School of Science and Technology, NOVA University of Lisbon, 2819-516 Caparica, Portugal
- Fernando Reyes
  - Fundación MEDINA, Avda. del Conocimiento 34, 18016 Armilla, Spain
- Deniz Tasdemir
  - GEOMAR Centre for Marine Biotechnology (GEOMAR-Biotech), Research Unit Marine Natural Products Chemistry, GEOMAR Helmholtz Centre for Ocean Research Kiel, Am Kiel-Kanal 44, 24106 Kiel, Germany
  - Faculty of Mathematics and Natural Science, Kiel University, Christian-Albrechts-Platz 4, 24118 Kiel, Germany
2. Moreira LMG, Junker J. Sampling CASE Application for the Quality Control of Published Natural Product Structures. Molecules 2021; 26:molecules26247543. [PMID: 34946623 PMCID: PMC8708086 DOI: 10.3390/molecules26247543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2021] [Revised: 09/06/2021] [Accepted: 10/19/2021] [Indexed: 12/03/2022] Open
Abstract
Structure elucidation with NMR correlation data is dicey, as there is no way to tell how ambiguous the data set is and how reliably it will define a constitution. Many different software tools for computer-assisted structure elucidation (CASE) have become available over the past decades, all of which could ensure a better quality of the elucidation process, but their use is still not common. Since 2011, WebCocon has integrated the possibility to generate theoretical NMR correlation data, starting from an existing structural proposal, so that this theoretical data can then be used for CASE. Now, WebCocon can also read the recently presented NMReDATA format, allowing for uncomplicated access to CASE with experimental data. With these capabilities, WebCocon presents itself as an easily accessible web tool for the quality control of proposed new natural products. Results of this application to several molecules from the literature are shown and demonstrate how CASE can help improve the reliability of structure elucidation with NMR correlation data.
3. Köck M, Lindel T, Junker J. Incorporation of 4J-HMBC and NOE Data into Computer-Assisted Structure Elucidation with WebCocon. Molecules 2021; 26:molecules26164846. [PMID: 34443433 PMCID: PMC8398166 DOI: 10.3390/molecules26164846] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/22/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 01/13/2023] Open
Abstract
Over the past decades, different software programs have been developed for Computer-Assisted Structure Elucidation (CASE) with NMR data using various approaches. WebCocon is one of them and has been continuously improved over the past 20 years. Here, we present the inclusion of 4JCH correlations (4J-HMBC) in the HMBC interpretation of Cocon and of NOE data in WebCocon. The 4J-HMBC data is used during the structure generation process, while the NOE data is used in post-processing of the results. The marine natural product oxocyclostylidol was selected to demonstrate WebCocon’s enhanced HMBC data processing capabilities. A systematic study of the 4JCH correlations of oxocyclostylidol was performed. The application of NOEs in CASE is demonstrated using the NOE correlations of the diterpene pyrone asperginol A known from the literature. As a result, we obtained a conformation that corresponds very well to the existing X-ray structure.
Affiliation(s)
- Matthias Köck
  - Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, 27570 Bremerhaven, Germany
  - Correspondence: (M.K.); (J.J.)
- Thomas Lindel
  - Institute of Organic Chemistry, Technical University of Braunschweig, 38106 Braunschweig, Germany
- Jochen Junker
  - Oswaldo Cruz Foundation–CDTS, Rio de Janeiro 21040-900, Brazil
  - Correspondence: (M.K.); (J.J.)
4. Li J, Nagamochi H, Akutsu T. Enumerating Substituted Benzene Isomers of Tree-Like Chemical Graphs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:633-646. [PMID: 28113952 DOI: 10.1109/tcbb.2016.2628888] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/06/2023]
Abstract
Enumeration of chemical structures is useful for drug design, which is one of the main targets of computational biology and bioinformatics. A chemical graph with no other cycles than benzene rings is called tree-like, and becomes a tree possibly with multiple edges if we contract each benzene ring into a single virtual atom of valence 6. All tree-like chemical graphs with a given tree representation T are called the substituted benzene isomers of T. When we replace each virtual atom in T with a benzene ring to obtain a substituted benzene isomer, distinct isomers of T are caused by the difference in arrangements of atom groups around a benzene ring. In this paper, we propose an efficient algorithm that enumerates all substituted benzene isomers of a given tree representation T. Our algorithm first counts the number of all the isomers of the tree representation T by a dynamic programming method. To enumerate all the isomers, for each k, our algorithm then generates the k-th isomer by backtracking the counting phase of the dynamic programming. We also implemented our algorithm for computational experiments.
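To illustrate why different arrangements of atom groups around a single benzene ring give distinct isomers, the following minimal sketch counts the arrangements of six ring substituents that remain distinct under the rotations and reflections of a hexagon. It is a brute-force illustration only, not the dynamic programming algorithm of the paper; the function names and the one-letter substituent labels are assumptions made for this example.

```python
from itertools import permutations

def ring_symmetries(n=6):
    """Yield the 2n index maps (rotations and reflections) of an n-membered ring."""
    for shift in range(n):
        yield [(i + shift) % n for i in range(n)]  # rotation by `shift` positions
        yield [(shift - i) % n for i in range(n)]  # reflection composed with a rotation

def canonical(arrangement):
    """Return the lexicographically smallest symmetric image of an arrangement."""
    n = len(arrangement)
    return min(tuple(arrangement[j] for j in perm) for perm in ring_symmetries(n))

def count_ring_isomers(substituents):
    """Count arrangements of the substituents that differ under ring symmetry.

    Brute force over all orderings (hypothetical helper, fine for a six-ring).
    """
    return len({canonical(p) for p in set(permutations(substituents))})

# Two identical groups X on a benzene ring: ortho, meta and para.
print(count_ring_isomers("XXHHHH"))  # 3
# Two X and one Y, as in the dichlorotoluenes.
print(count_ring_isomers("XXYHHH"))  # 6
```

Brute force is adequate for one ring with six positions; the counting-then-backtracking approach described in the abstract is what avoids this kind of exhaustive enumeration for whole tree representations.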
5. Meringer M, Cleaves HJ. Computational exploration of the chemical structure space of possible reverse tricarboxylic acid cycle constituents. Sci Rep 2017; 7:17540. [PMID: 29235498 PMCID: PMC5727506 DOI: 10.1038/s41598-017-17345-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 05/10/2017] [Accepted: 11/23/2017] [Indexed: 11/16/2022] Open
Abstract
The reverse tricarboxylic acid (rTCA) cycle has been explored from various standpoints as an idealized primordial metabolic cycle. Its simplicity and apparent ubiquity in diverse organisms across the tree of life have been used to argue for its antiquity and its optimality. In 2000 it was proposed that chemoinformatics approaches support some of these views. Specifically, defined queries of the Beilstein database showed that the molecules of the rTCA are heavily represented in such compound databases. We explore here the chemical structure “space,” e.g. the set of organic compounds which possesses some minimal set of defining characteristics, of the rTCA cycle’s intermediates using an exhaustive structure generation method. The rTCA’s chemical space as defined by the original criteria and explored by our method is some six to seven times larger than originally considered. Acknowledging that each assumption in what is a defining criterion making the rTCA cycle special limits possible generative outcomes, there are many unrealized compounds which fulfill these criteria. That these compounds are unrealized could be due to evolutionary frozen accidents or optimization, though this optimization may also be for systems-level reasons, e.g., the way the pathway and its elements interface with other aspects of metabolism.
Affiliation(s)
- Markus Meringer
  - German Aerospace Center (DLR), Earth Observation Center (EOC), Münchner Straße 20, D-82234, Oberpfaffenhofen-Wessling, Germany
- H James Cleaves
  - Earth-Life Science Institute, Tokyo Institute of Technology, 2-12-IE-1 Ookayama, Meguro-ku, Tokyo, 152-8551, Japan
  - The Institute for Advanced Study, 1 Einstein Drive, Princeton, NJ, 08540, USA
  - Blue Marble Space Institute of Science, 1515 Gallatin St. NW, Washington, DC, 20011, USA
  - Center for Chemical Evolution, Georgia Institute of Technology, Atlanta, GA, 30332, Georgia
6. Allen F, Pon A, Greiner R, Wishart D. Computational Prediction of Electron Ionization Mass Spectra to Assist in GC/MS Compound Identification. Anal Chem 2016; 88:7689-97. [PMID: 27381172 DOI: 10.1021/acs.analchem.6b01622] [Citation(s) in RCA: 91] [Impact Index Per Article: 11.4] [Indexed: 11/29/2022]
Abstract
We describe a tool, competitive fragmentation modeling for electron ionization (CFM-EI) that, given a chemical structure (e.g., in SMILES or InChI format), computationally predicts an electron ionization mass spectrum (EI-MS) (i.e., the type of mass spectrum commonly generated by gas chromatography mass spectrometry). The predicted spectra produced by this tool can be used for putative compound identification, complementing measured spectra in reference databases by expanding the range of compounds able to be considered when availability of measured spectra is limited. The tool extends CFM-ESI, a recently developed method for computational prediction of electrospray tandem mass spectra (ESI-MS/MS), but unlike CFM-ESI, CFM-EI can handle odd-electron ions and isotopes and incorporates an artificial neural network. Tests on EI-MS data from the NIST database demonstrate that CFM-EI is able to model fragmentation likelihoods in low-resolution EI-MS data, producing predicted spectra whose dot product scores are significantly better than full enumeration "bar-code" spectra. CFM-EI also outperformed previously reported results for MetFrag, MOLGEN-MS, and Mass Frontier on one compound identification task. It also outperformed MetFrag in a range of other compound identification tasks involving a much larger data set, containing both derivatized and nonderivatized compounds. While replicate EI-MS measurements of chemical standards are still a more accurate point of comparison, CFM-EI's predictions provide a much-needed alternative when no reference standard is available for measurement. CFM-EI is available at https://sourceforge.net/projects/cfm-id/ for download and http://cfmid.wishartlab.com as a web service.
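The dot product scores mentioned in this abstract compare a predicted spectrum with a measured one. A minimal sketch of such a comparison is given below, assuming unit-mass bins and a plain normalized dot product (cosine similarity). The abstract does not state the exact weighting used in the paper, so the function name, the dictionary representation and the peak values are all illustrative assumptions.

```python
import math

def dot_product_score(spectrum_a, spectrum_b):
    """Normalized dot product of two low-resolution spectra.

    Each spectrum is a dict mapping an integer m/z bin to an intensity
    (a hypothetical representation chosen for this sketch).
    """
    shared_bins = set(spectrum_a) & set(spectrum_b)
    numerator = sum(spectrum_a[mz] * spectrum_b[mz] for mz in shared_bins)
    norm_a = math.sqrt(sum(v * v for v in spectrum_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spectrum_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return numerator / (norm_a * norm_b)

# Made-up peak lists, for illustration only.
measured = {41: 30.0, 43: 100.0, 57: 80.0, 71: 20.0}
predicted = {43: 90.0, 57: 100.0, 85: 10.0}
print(round(dot_product_score(measured, predicted), 3))
```

A score of 1 means the two spectra have identical relative intensities in every bin, and 0 means they share no peaks.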
Affiliation(s)
- Felicity Allen
  - Department of Computing Science, University of Alberta, Edmonton T6G 2E8, Canada
- Allison Pon
  - Department of Computing Science, University of Alberta, Edmonton T6G 2E8, Canada
- Russ Greiner
  - Department of Computing Science, University of Alberta, Edmonton T6G 2E8, Canada
- David Wishart
  - Department of Computing Science, University of Alberta, Edmonton T6G 2E8, Canada
7. Ring system-based chemical graph generation for de novo molecular design. J Comput Aided Mol Des 2016; 30:425-46. [PMID: 27299746 DOI: 10.1007/s10822-016-9916-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 03/22/2016] [Accepted: 05/31/2016] [Indexed: 10/21/2022]
Abstract
Generating chemical graphs in silico by combining building blocks is important and fundamental in virtual combinatorial chemistry. A premise in this area is that generated structures should be irredundant as well as exhaustive. In this study, we develop structure generation algorithms regarding combining ring systems as well as atom fragments. The proposed algorithms consist of three parts. First, chemical structures are generated through a canonical construction path. During structure generation, ring systems can be treated as reduced graphs having fewer vertices than those in the original ones. Second, diversified structures are generated by a simple rule-based generation algorithm. Third, the number of structures to be generated can be estimated with adequate accuracy without actual exhaustive generation. The proposed algorithms were implemented in structure generator Molgilla. As a practical application, Molgilla generated chemical structures mimicking rosiglitazone in terms of a two dimensional pharmacophore pattern. The strength of the algorithms lies in simplicity and flexibility. Therefore, they may be applied to various computer programs regarding structure generation by combining building blocks.
8. Vaniya A, Fiehn O. Using fragmentation trees and mass spectral trees for identifying unknown compounds in metabolomics. Trends Analyt Chem 2015; 69:52-61. [PMID: 26213431 PMCID: PMC4509603 DOI: 10.1016/j.trac.2015.04.002] [Citation(s) in RCA: 97] [Impact Index Per Article: 10.8] [Indexed: 01/18/2023]
Abstract
Identification of unknown metabolites is the bottleneck in advancing metabolomics, leaving interpretation of metabolomics results ambiguous. The chemical diversity of metabolism is vast, making structure identification arduous and time consuming. Currently, comprehensive analysis of mass spectra in metabolomics is limited to library matching, but tandem mass spectral libraries are small compared to the large number of compounds found in the biosphere, including xenobiotics. Resolving this bottleneck requires richer data acquisition and better computational tools. Multi-stage mass spectrometry (MSn) trees show promise to aid in this regard. Fragmentation trees explore the fragmentation process, generate fragmentation rules and aid in sub-structure identification, while mass spectral trees delineate the dependencies in multi-stage MS of collision-induced dissociations. This review covers advancements over the past 10 years as a tool for metabolite identification, including algorithms, software and databases used to build and to implement fragmentation trees and mass spectral annotations.
Affiliation(s)
- Arpana Vaniya
  - University of California Davis, Department of Chemistry, One Shields Avenue, Davis, CA 95616, USA
  - University of California Davis, West Coast Metabolomics Center, Genome Center, 451 Health Sciences Drive, Davis, CA 95616, USA
- Oliver Fiehn
  - University of California Davis, West Coast Metabolomics Center, Genome Center, 451 Health Sciences Drive, Davis, CA 95616, USA
  - King Abdulaziz University, Biochemistry Department, Jeddah, Saudi Arabia
9. Huntscha S, Hofstetter TB, Schymanski EL, Spahr S, Hollender J. Biotransformation of benzotriazoles: insights from transformation product identification and compound-specific isotope analysis. Environmental Science & Technology 2014; 48:4435-4443. [PMID: 24621328 DOI: 10.1021/es405694z] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Indexed: 06/03/2023]
Abstract
Benzotriazoles are widely used domestic and industrial corrosion inhibitors and have become omnipresent organic micropollutants in the aquatic environment. Here, the range of aerobic biological degradation mechanisms of benzotriazoles in activated sludge was investigated. Degradation pathways were elucidated by identifying transient and persistent transformation products in batch experiments using liquid chromatography-high-resolution tandem mass spectrometry (LC-HR-MS/MS). In addition, initial reactions were studied using compound-specific isotope analysis (CSIA). Biodegradation half-lives of 1.0 days for 1H-benzotriazole, 8.5 days for 4-methyl-1H-benzotriazole, and 0.9 days for 5-methyl-1H-benzotriazole with activated sludge confirmed their known partial persistence in conventional wastewater treatment. Major transformation products were identified as 4- and 5-hydroxy-1H-benzotriazole for the degradation of 1H-benzotriazole, and 1H-benzotriazole-5-carboxylic acid for the degradation of 5-methyl-1H-benzotriazole. These transformation products were found in wastewater effluents, showing their environmental relevance. Many other candidate transformation products, tentatively identified by interpretation of HR-MS/MS spectra, showed the broad range of possible reaction pathways including oxidation, alkylation, hydroxylation and indicate the significance of cometabolic processes for micropollutant degradation in biological wastewater treatment in general. The combination of evidence from product analysis with the significant carbon and nitrogen isotope fractionation suggests that aromatic monohydroxylation is the predominant step during the biotransformation of 1H-benzotriazole.
Affiliation(s)
- Sebastian Huntscha
  - Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland
10. Meringer M, Schymanski EL. Small Molecule Identification with MOLGEN and Mass Spectrometry. Metabolites 2013; 3:440-62. [PMID: 24958000 PMCID: PMC3901272 DOI: 10.3390/metabo3020440] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 03/28/2013] [Revised: 05/16/2013] [Accepted: 05/17/2013] [Indexed: 12/21/2022] Open
Abstract
This paper details the MOLGEN entries for the 2012 CASMI contest for small molecule identification to demonstrate structure elucidation using structure generation approaches. Different MOLGEN programs were used for different categories, including MOLGEN-MS/MS for Category 1, MOLGEN 3.5 and 5.0 for Category 2 and MOLGEN-MS for Categories 3 and 4. A greater focus is given to Categories 1 and 2, as most CASMI participants entered these categories. The settings used and the reasons behind them are described in detail, while various evaluations are used to put these results into perspective. As one author was also an organiser of CASMI, these submissions were not part of the official CASMI competition, but this paper provides an insight into how unknown identification could be performed using structure generation approaches. The approaches are semi-automated (category dependent) and benefit greatly from user experience. Thus, the results presented and discussed here may be better than those an inexperienced user could obtain with MOLGEN programs.
Affiliation(s)
- Markus Meringer
  - DLR: German Aerospace Center, Earth Observation Center (EOC), Münchner Strasse 20, D-82234 Oberpfaffenhofen-Wessling, Germany.
- Emma L Schymanski
  - Eawag: Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, CH-8600 Dübendorf, Switzerland.
11. CASMI: And the Winner is . . . Metabolites 2013; 3:412-39. [PMID: 24957999 PMCID: PMC3901266 DOI: 10.3390/metabo3020412] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 04/07/2013] [Revised: 05/14/2013] [Accepted: 05/16/2013] [Indexed: 11/25/2022] Open
Abstract
The Critical Assessment of Small Molecule Identification (CASMI) Contest was founded in 2012 to provide scientists with a common open dataset to evaluate their identification methods. In this review, we summarize the submissions, evaluate procedures and discuss the results. We received five submissions (three external, two internal) for LC–MS Category 1 (best molecular formula) and six submissions (three external, three internal) for LC–MS Category 2 (best molecular structure). No external submissions were received for the GC–MS Categories 3 and 4. The team of Dunn et al. from Birmingham had the most answers in the 1st place for Category 1, while Category 2 was won by H. Oberacher. Despite the low number of participants, the external and internal submissions cover a broad range of identification strategies, including expert knowledge, database searching, automated methods and structure generation. The results of Category 1 show that complementing automated strategies with (manual) expert knowledge was the most successful approach, while no automated method could compete with the power of spectral searching for Category 2—if the challenge was present in a spectral library. Every participant topped at least one challenge, showing that different approaches are still necessary for interpretation diversity.
12. Ulrich N, Schüürmann G, Brack W. Prediction of gas chromatographic retention indices as classifier in non-target analysis of environmental samples. J Chromatogr A 2013; 1285:139-47. [DOI: 10.1016/j.chroma.2013.02.037] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 11/14/2012] [Revised: 02/11/2013] [Accepted: 02/12/2013] [Indexed: 10/27/2022]
13. Jeon J, Kurth D, Hollender J. Biotransformation Pathways of Biocides and Pharmaceuticals in Freshwater Crustaceans Based on Structure Elucidation of Metabolites Using High Resolution Mass Spectrometry. Chem Res Toxicol 2013; 26:313-24. [DOI: 10.1021/tx300457f] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Indexed: 12/30/2022]
Affiliation(s)
- Junho Jeon
  - Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland
- Denise Kurth
  - Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland
  - Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, CH-8092, Zürich, Switzerland
- Juliane Hollender
  - Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland
  - Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, CH-8092, Zürich, Switzerland
14. Menikarachchi LC, Cawley S, Hill DW, Hall LM, Hall L, Lai S, Wilder J, Grant DF. MolFind: a software package enabling HPLC/MS-based identification of unknown chemical structures. Anal Chem 2012; 84:9388-94. [PMID: 23039714 PMCID: PMC3523192 DOI: 10.1021/ac302048x] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Indexed: 12/30/2022]
Abstract
In this paper, we present MolFind, a highly multithreaded pipeline type software package for use as an aid in identifying chemical structures in complex biofluids and mixtures. MolFind is specifically designed for high-performance liquid chromatography/mass spectrometry (HPLC/MS) data inputs typical of metabolomics studies where structure identification is the ultimate goal. MolFind enables compound identification by matching HPLC/MS-based experimental data obtained for an unknown compound with computationally derived HPLC/MS values for candidate compounds downloaded from chemical databases such as PubChem. The downloaded "bins" consist of all compounds matching the monoisotopic molecular weight of the unknown. The computational HPLC/MS values predicted include retention index (RI), ECOM(50) (energy required to fragment 50% of a selected precursor ion), drift time, and collision induced dissociation (CID) spectrum. RI, ECOM(50), and drift-time models are used for filtering compounds downloaded from PubChem. The remaining candidates are then ranked based on CID spectra matching. Current RI and ECOM(50) models allow for the removal of about 28% of compounds from PubChem bins. Our estimates suggest that this could be improved to as much as 87% with additional chemical structures included in the computational models. Quantitative structure property relationship-based modeling of drift times showed a better correlation with experimentally determined drift times than did Mobcal cross-sectional areas. In 23 of 35 example cases, filtering PubChem bins with RI and ECOM(50) predictive models resulted in improved ranking of the unknown compounds compared to previous studies using CID spectra matching alone. In 19 of 35 examples, the correct candidate was ranked within the top 20 compounds in bins containing an average of 1635 compounds.
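The filter-then-rank workflow described in this abstract can be sketched as follows: candidates from a mass-matched bin are first filtered by predicted retention index and ECOM(50), and the survivors are ranked by a CID spectrum match score. This is a minimal sketch and not MolFind's actual code or API; the class, field names, tolerance windows and numbers are hypothetical, and the drift-time filter is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    # All fields are hypothetical stand-ins for values a predictive model would supply.
    name: str
    predicted_ri: float       # predicted retention index
    predicted_ecom50: float   # predicted ECOM(50)
    cid_match_score: float    # similarity of predicted vs. measured CID spectrum

def filter_and_rank(candidates, measured_ri, measured_ecom50, ri_window, ecom50_window):
    """Drop candidates outside both tolerance windows, then rank the rest by CID score."""
    kept = [
        c for c in candidates
        if abs(c.predicted_ri - measured_ri) <= ri_window
        and abs(c.predicted_ecom50 - measured_ecom50) <= ecom50_window
    ]
    return sorted(kept, key=lambda c: c.cid_match_score, reverse=True)

# Made-up bin of candidates sharing the unknown's monoisotopic mass.
bin_candidates = [
    Candidate("cand_A", 510.0, 3.1, 0.62),
    Candidate("cand_B", 780.0, 3.0, 0.91),  # fails the RI window
    Candidate("cand_C", 525.0, 3.4, 0.80),
    Candidate("cand_D", 505.0, 6.0, 0.95),  # fails the ECOM(50) window
]
ranked = filter_and_rank(bin_candidates, measured_ri=515.0, measured_ecom50=3.2,
                         ri_window=50.0, ecom50_window=0.5)
print([c.name for c in ranked])  # ['cand_C', 'cand_A']
```

Filtering before ranking means a candidate with a high CID score but an implausible retention index never reaches the ranked list, which is the effect the abstract reports as improved ranking of the unknowns.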
Affiliation(s)
- Lochana C. Menikarachchi
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut, United States
- Shannon Cawley
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut, United States
- Dennis W. Hill
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut, United States
- L. Mark Hall
  - Hall Associates Consulting, Quincy, Massachusetts, United States
- Lowell Hall
  - Department of Chemistry, Eastern Nazarene College, Quincy, Massachusetts, United States
- Steven Lai
  - Waters Corporation, Beverly, Massachusetts, United States
- Janine Wilder
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut, United States
- David F. Grant
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut, United States
15. Linear solvation energy relationships as classifiers in non-target analysis – A gas chromatographic approach. J Chromatogr A 2012; 1264:95-103. [DOI: 10.1016/j.chroma.2012.09.051] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/28/2012] [Revised: 08/29/2012] [Accepted: 09/15/2012] [Indexed: 11/20/2022]
16. Meinert C, Schymanski E, Küster E, Kühne R, Schüürmann G, Brack W. Application of preparative capillary gas chromatography (pcGC), automated structure generation and mutagenicity prediction to improve effect-directed analysis of genotoxicants in a contaminated groundwater. Environmental Science and Pollution Research International 2010; 17:885-897. [PMID: 20119663 DOI: 10.1007/s11356-009-0286-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 05/01/2009] [Accepted: 12/21/2009] [Indexed: 05/28/2023]
Abstract
BACKGROUND, AIM AND SCOPE: The importance of groundwater for human life cannot be overemphasised. Besides fulfilling essential ecological functions, it is a major source of drinking water. However, in the industrial area of Bitterfeld, it is contaminated with a multitude of harmful chemicals, including genotoxicants. Therefore, recently developed methodologies including preparative capillary gas chromatography (pcGC), MOLGEN-MS structure generation and mutagenicity prediction were applied within effect-directed analysis (EDA) to reduce sample complexity and to identify candidate mutagens in the samples. A major focus was put on the added value of these tools compared to conventional EDA combining reversed-phase liquid chromatography (RP-LC) followed by GC/MS analysis and MS library search. MATERIALS AND METHODS: We combined genotoxicity testing with umuC and RP-LC with pcGC fractionation to isolate genotoxic compounds from a contaminated groundwater sample. Spectral library information from the NIST05 database was combined with a computer-based structure generation tool called MOLGEN-MS for structure elucidation of unknowns. Finally, we applied a computer model for mutagenicity prediction (ChemProp) to identify candidate mutagens and genotoxicants. RESULTS AND DISCUSSION: A total of 62 components were tentatively identified in genotoxic fractions. Ten of these components were predicted to be potentially mutagenic, whilst 2,4,6-trichlorophenol, 2,4-dichloro-6-methylphenol and 4-chlorobenzoic acid were confirmed as genotoxicants. CONCLUSIONS AND PERSPECTIVES: The results suggest pcGC as a high-resolution fractionation tool and MOLGEN-MS to improve structure elucidation, whilst mutagenicity prediction failed in our study to predict identified genotoxicants. Genotoxicity, mutagenicity and carcinogenicity caused by chemicals are complex processes, and prediction from chemical structure still appears to be quite difficult. Progress in this field would significantly support EDA and risk assessment of environmental mixtures.
Affiliation(s)
- Cornelia Meinert
  - Department of Effect-Directed Analysis, UFZ, Helmholtz Centre for Environmental Research, Permoserstrasse 15, 04318, Leipzig, Germany.
17. Integrated analytical and computer tools for structure elucidation in effect-directed analysis. Trends Analyt Chem 2009. [DOI: 10.1016/j.trac.2009.03.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Indexed: 11/22/2022]
|
18
|
Schymanski EL, Meringer M, Brack W. Matching Structures to Mass Spectra Using Fragmentation Patterns: Are the Results As Good As They Look? Anal Chem 2009; 81:3608-17. [DOI: 10.1021/ac802715e] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Affiliation(s)
- Emma L. Schymanski
- UFZ, Helmholtz Centre for Environmental Research-UFZ, Department of Effect-Directed Analysis, Permoserstrasse 15, D-04318 Leipzig, Germany, and DLR, German Aerospace Centre-DLR, Remote Sensing Technology Institute, Münchener Strasse 20, D-82234 Oberpfaffenhofen-Wessling, Germany
- Markus Meringer
- UFZ, Helmholtz Centre for Environmental Research-UFZ, Department of Effect-Directed Analysis, Permoserstrasse 15, D-04318 Leipzig, Germany, and DLR, German Aerospace Centre-DLR, Remote Sensing Technology Institute, Münchener Strasse 20, D-82234 Oberpfaffenhofen-Wessling, Germany
- Werner Brack
- UFZ, Helmholtz Centre for Environmental Research-UFZ, Department of Effect-Directed Analysis, Permoserstrasse 15, D-04318 Leipzig, Germany, and DLR, German Aerospace Centre-DLR, Remote Sensing Technology Institute, Münchener Strasse 20, D-82234 Oberpfaffenhofen-Wessling, Germany
|
19
|
Schymanski EL, Meinert C, Meringer M, Brack W. The use of MS classifiers and structure generation to assist in the identification of unknowns in effect-directed analysis. Anal Chim Acta 2008; 615:136-47. [PMID: 18442519 DOI: 10.1016/j.aca.2008.03.060] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Received: 02/19/2008] [Revised: 03/26/2008] [Accepted: 03/29/2008] [Indexed: 11/29/2022]
Abstract
Structure generation and mass spectral classifiers have been incorporated into a new method to gain further information from low-resolution GC-MS spectra and subsequently assist in the identification of toxic compounds isolated using effect-directed fractionation. The method has been developed for the case where little analytical information other than the mass spectrum is available, as is common in effect-directed analysis (EDA), where further interpretation of the mass spectra is necessary to gain additional information about unknown peaks in the chromatogram. Structure generation from a molecular formula alone rapidly leads to enormous numbers of structures; hence reduction of these numbers is necessary to focus identification or confirmation efforts. The mass spectral classifiers and structure generation procedure in the program MOLGEN-MS were enhanced by including additional classifier information available from the NIST05 database and by incorporating post-generation 'filtering criteria'. The presented method can reduce the number of possible structures matching a spectrum by several orders of magnitude, creating much more manageable data sets and increasing the chance of identification. Examples are presented to show how the method can be used to provide 'lines of evidence' for the identity of an unknown compound. This method is an alternative to library search of mass spectra and is especially valuable for unknowns where no clear library match is available.
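The post-generation filtering step described in this abstract can be illustrated with a minimal Python sketch. It is not MOLGEN-MS code: the candidate names, the substructure labels and the `filter_candidates` helper are all hypothetical, and real mass spectral classifiers return probabilities rather than the hard present/absent sets assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A generated structure, reduced to the substructures it contains (toy model)."""
    name: str
    substructures: frozenset


def filter_candidates(candidates, required, forbidden):
    """Keep candidates that contain every required and no forbidden substructure."""
    required, forbidden = frozenset(required), frozenset(forbidden)
    return [
        c for c in candidates
        if required <= c.substructures and not (forbidden & c.substructures)
    ]


# Hypothetical candidates sharing one molecular formula (illustrative only).
CANDIDATES = [
    Candidate("cand_A", frozenset({"aromatic_ring", "hydroxyl", "chlorine"})),
    Candidate("cand_B", frozenset({"aromatic_ring", "ether", "chlorine"})),
    Candidate("cand_C", frozenset({"hydroxyl", "chlorine", "alkene"})),
    Candidate("cand_D", frozenset({"aromatic_ring", "hydroxyl", "chlorine", "alkene"})),
]

# Assumed classifier output: substructures judged present or absent.
kept = filter_candidates(
    CANDIDATES,
    required={"aromatic_ring", "hydroxyl"},
    forbidden={"alkene"},
)
print([c.name for c in kept])
```

In this toy set only `cand_A` survives both criteria. With a real structure generator the same kind of filter would be applied to a far larger candidate list.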
Affiliation(s)
- E L Schymanski
- UFZ, Helmholtz Centre for Environmental Research - UFZ, Department of Effect-Directed Analysis, Permoserstr. 15, D-04318 Leipzig, Germany.
|
20
|
How to confirm identified toxicants in effect-directed analysis. Anal Bioanal Chem 2008; 390:1959-73. [DOI: 10.1007/s00216-007-1808-8] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Received: 10/17/2007] [Revised: 12/10/2007] [Accepted: 12/12/2007] [Indexed: 10/22/2022]
|
21
|
Presentation, Interpretation and Validation of Analytical Results. Anal Chem 2007. [DOI: 10.1007/978-3-540-35990-6_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
22
|
Braun J, Gugisch R, Kerber A, Laue R, Meringer M, Rücker C. MOLGEN-CID: A canonizer for molecules and graphs accessible through the Internet. J Chem Inf Comput Sci 2004; 44:542-8. [PMID: 15032534 DOI: 10.1021/ci030404l] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
The MOLGEN Chemical Identifier MOLGEN-CID is a software module freely accessible via the Internet. For a molecule or graph entered in molfile format (2D) it produces, by a canonical renumbering procedure, a canonical molfile and a unique character string that is easily compared by computer to a similar string. The mode of operation of MOLGEN-CID is detailed and visualized with examples.
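The idea of canonical renumbering can be illustrated with a brute-force sketch in Python. This is not the MOLGEN-CID procedure, which the abstract does not specify. It simply tries every atom ordering of a small labelled graph and keeps the lexicographically smallest description, so that two differently numbered inputs of the same molecule yield the same string. The function name `canonical_string` and the string layout are assumptions made for illustration, bond orders and hydrogens are ignored, and the factorial cost limits the approach to very small graphs.

```python
from itertools import permutations


def canonical_string(elements, bonds):
    """Return a numbering-independent string for a small labelled graph.

    elements: list of atom symbols, index = atom number.
    bonds: iterable of (i, j) atom-index pairs (undirected, bond order ignored).
    """
    n = len(elements)
    bonds = list(bonds)
    best = None
    for perm in permutations(range(n)):  # perm[old_index] = new_index
        new_elements = [None] * n
        for old, new in enumerate(perm):
            new_elements[new] = elements[old]
        new_bonds = sorted(tuple(sorted((perm[i], perm[j]))) for i, j in bonds)
        key = (tuple(new_elements), tuple(new_bonds))
        if best is None or key < best:
            best = key  # keep the smallest (atoms, bonds) description
    atoms, edges = best
    return ".".join(atoms) + "|" + ",".join(f"{i}-{j}" for i, j in edges)


# Ethanol heavy atoms (C-C-O) entered with two different atom numberings.
a = canonical_string(["C", "C", "O"], [(0, 1), (1, 2)])
b = canonical_string(["O", "C", "C"], [(0, 1), (1, 2)])
# Dimethyl ether (C-O-C): same formula, different connectivity.
c = canonical_string(["C", "O", "C"], [(0, 1), (1, 2)])
print(a == b, a == c)
```

The two ethanol inputs give identical strings, while the dimethyl ether string differs, which is the property that makes such identifiers easy to compare by computer.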
Affiliation(s)
- Joachim Braun
- Department of Mathematics, Universität Bayreuth, D-95440 Bayreuth, Germany
|
23
|
Rücker C, Gugisch R, Kerber A. Manual Construction and Mathematics- and Computer-Aided Counting of Stereoisomers. The Example of Oligoinositols. J Chem Inf Comput Sci 2004; 44:1654-65. [PMID: 15446823 DOI: 10.1021/ci040102z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
Abstract
Two methods to obtain the numbers of stereoisomers and of achiral stereoisomers of a given molecular structure are detailed using the example of di- and triinositols. The first method is the manual, exhaustive and redundancy-free construction of all stereoisomers, which is rendered feasible by symmetry considerations despite the large number of isomeric triinositols (82176). The second method is counting without constructing, made possible by use of a mathematical tool, the Cauchy-Frobenius lemma, which is in effect a formalized way of considering symmetry. The results are compared to those obtained by computer-aided stereoisomer generation using the program MOLGEN 3.5. All three methods are shown to agree in their results.
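As a worked illustration of counting without constructing, the Python sketch below applies the Cauchy-Frobenius lemma to monomeric inositol (cyclohexane-1,2,3,4,5,6-hexol) rather than to the di- and triinositols treated in the paper. The model is an assumption made for illustration: each of the six ring positions carries its hydroxyl group either up or down, the twelve proper rotations of the ring act on these up/down patterns, and adding the twelve improper operations merges each pair of enantiomers into one class. The function names are hypothetical.

```python
from itertools import product

N = 6  # ring positions of cyclohexanehexol


def symmetry_operations(include_improper):
    """Return (permutation, flip) pairs; flip=True swaps up and down."""
    ops = []
    for k in range(N):
        rot = tuple((i + k) % N for i in range(N))
        ref = tuple((k - i) % N for i in range(N))
        ops.append((rot, False))  # rotation about the ring axis
        ops.append((ref, True))   # C2 rotation about an in-plane axis
        if include_improper:
            ops.append((rot, True))   # sigma_h, inversion and improper rotations
            ops.append((ref, False))  # vertical mirror planes
    return ops


def apply(op, config):
    """Move each substituent to its image position, inverting it if flip is set."""
    perm, flip = op
    image = [0] * N
    for i, p in enumerate(perm):
        image[p] = config[i] ^ flip
    return tuple(image)


def count_orbits(ops):
    """Cauchy-Frobenius lemma: orbits = average number of fixed configurations."""
    fixed = sum(
        sum(1 for c in product((0, 1), repeat=N) if apply(op, c) == c)
        for op in ops
    )
    assert fixed % len(ops) == 0
    return fixed // len(ops)


stereoisomers = count_orbits(symmetry_operations(False))  # enantiomers counted separately
classes = count_orbits(symmetry_operations(True))         # enantiomer pairs merged
chiral_pairs = stereoisomers - classes
achiral = classes - chiral_pairs
print(stereoisomers, achiral, chiral_pairs)
```

Under these assumptions the script prints `9 7 1`, matching the nine known inositol stereoisomers: seven achiral forms and one enantiomeric pair. No stereoisomer is constructed explicitly; only the configurations fixed by each symmetry operation are counted.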
Affiliation(s)
- Christoph Rücker
- Department of Mathematics, Universität Bayreuth, D-95440 Bayreuth, Germany.
|