1
Farmer C, Medina H. IMPACT-4CCS: Integrated Modeling and Prediction Using Ab Initio and Trained Potentials for Collision Cross Sections. J Comput Chem 2025; 46:e70106. [PMID: 40251873 PMCID: PMC12008713 DOI: 10.1002/jcc.70106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2025] [Revised: 03/17/2025] [Accepted: 04/01/2025] [Indexed: 04/21/2025]
Abstract
Collision cross section (CCS) values can enhance the identification and classification of molecular contaminants such as per- and polyfluoroalkyl substances (PFAS). However, the computational burden required for large molecules, combined with the increasing number of potential PFAS candidates, can render existing methods incapable of providing sufficiently accurate results in a timely manner. Furthermore, machine learning methods struggle to generalize when the (de)protonated structure undergoes structural changes that are not common in the training dataset. In this study, we introduce IMPACT-4CCS (Integrated Modeling and Prediction using Ab initio and Trained potentials for Collision Cross Section), a novel computational workflow ensemble that combines ab initio and machine learning tasks to accelerate accurate prediction of CCS for PFAS molecules. IMPACT-4CCS achieves accuracy comparable to current machine learning approaches, as validated using a test set of 100 molecules. Furthermore, IMPACT-4CCS exhibits better accuracy when applied to some specific emerging PFAS subclasses, such as the nH-perfluoroalkyl carboxylic acids (nH-PFCA) family, for which other methods overestimate CCS values. As far as the authors know, IMPACT-4CCS is the only existing method capable of capturing structural dynamics (i.e., hydrogen bridging) present in some large and flexible PFAS molecules. Our work demonstrates that the careful use of machine learning to accelerate traditional methods is likely to be more accurate than relying purely on machine learning on molecular graphs. Future (or recommended) work includes assessing the usefulness of IMPACT-4CCS for extending nontarget analysis to larger PFAS datasets such as the OECD (Organization for Economic Co-operation and Development) PFAS list in PubChem, which could contain more than 7 million molecules with diverse chemistry.
Affiliation(s)
- Carson Farmer
  - School of Engineering, Liberty University, Lynchburg, Virginia, USA
- Hector Medina
  - School of Engineering, Liberty University, Lynchburg, Virginia, USA
2
Welters K, Thoben C, Raddatz CR, Schlottmann F, Zimmermann S, Belder D. Coupling Capillary Electrophoresis With a Shifted Inlet Potential High-Resolution Ion Mobility Spectrometer. Electrophoresis 2025. [PMID: 40292850 DOI: 10.1002/elps.8147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Revised: 02/11/2025] [Accepted: 04/08/2025] [Indexed: 04/30/2025]
Abstract
We present the coupling of capillary electrophoresis to a custom-built high-resolution ion mobility spectrometer (IMS). This system integrates a shifted inlet potential IMS configuration with a customised nanoflow ESI sheath interface. It enables the rapid analysis of quaternary ammonium compounds (QACs) and their impurities in real-world samples. It allowed the detection of six non-chromophoric compounds in about 3 min. The assignment of the IMS signals to compounds was supported by matching experimentally determined collision cross-section (CCS) values with predicted values. The system achieved a detection limit in the single-digit picogram range with IMS resolutions of over 80.
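The abstract does not say how experimental and predicted CCS values were matched. As a rough illustration only, the sketch below assigns each measured signal to the candidate compounds whose predicted CCS lies within a relative tolerance. The function name `match_ccs`, the 3% tolerance, and all compound names and CCS values are hypothetical and not taken from the paper.

```python
def match_ccs(measured, predicted, rel_tol=0.03):
    """For each measured signal, list candidates whose predicted CCS
    deviates by at most rel_tol (relative), closest match first.

    measured:  dict of signal label -> experimental CCS (in square angstroms)
    predicted: dict of compound name -> predicted CCS (in square angstroms)
    rel_tol:   assumed tolerance; 0.03 (3%) is an illustrative choice
    """
    matches = {}
    for signal, ccs_exp in measured.items():
        hits = []
        for name, ccs_pred in predicted.items():
            rel_dev = abs(ccs_exp - ccs_pred) / ccs_pred
            if rel_dev <= rel_tol:
                hits.append((name, rel_dev))
        # Sort candidates by increasing relative deviation.
        hits.sort(key=lambda hit: hit[1])
        matches[signal] = [name for name, _ in hits]
    return matches


# Hypothetical example values, not data from the study.
predicted = {"compound_A": 150.0, "compound_B": 180.0}
measured = {"peak_1": 152.0, "peak_2": 181.0, "peak_3": 210.0}
result = match_ccs(measured, predicted)
```

A signal with an empty candidate list (here `peak_3`) would need another line of evidence, such as migration time or a reference standard.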
Affiliation(s)
- Klaus Welters
  - Institute of Analytical Chemistry, Leipzig University, Leipzig, Germany
- Christian Thoben
  - Department of Sensors and Measurement Technology, Institute of Electrical Engineering and Measurement Technology, Leibniz University Hannover, Hannover, Germany
- Christian-Robert Raddatz
  - Department of Sensors and Measurement Technology, Institute of Electrical Engineering and Measurement Technology, Leibniz University Hannover, Hannover, Germany
- Florian Schlottmann
  - Department of Sensors and Measurement Technology, Institute of Electrical Engineering and Measurement Technology, Leibniz University Hannover, Hannover, Germany
- Stefan Zimmermann
  - Department of Sensors and Measurement Technology, Institute of Electrical Engineering and Measurement Technology, Leibniz University Hannover, Hannover, Germany
- Detlev Belder
  - Institute of Analytical Chemistry, Leipzig University, Leipzig, Germany
3
Bell BA, Anderson JM, Rajski SR, Bugni TS. Ion Mobility-Coupled Mass Spectrometry for Metallophore Detection. JOURNAL OF NATURAL PRODUCTS 2025; 88:306-313. [PMID: 39929196 DOI: 10.1021/acs.jnatprod.4c00911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2025]
Abstract
Metal chelating small molecules (metallophores) play significant roles in microbial interactions and bacterial survival; however, current methods to identify metallophores are limited by low sensitivity, a lack of metal selectivity, and/or complicated data analysis. To overcome these limitations, we developed a novel approach for detecting metallophores in natural product extracts using ion mobility-coupled mass spectrometry (IM-MS). As a proof of concept, marine bacterial extracts containing known metallophores were analyzed by IM-MS with and without added metals, and the data were compared between conditions to identify metal-binding metabolites. Ions with changes in both mass and mobility were specific to metallophores, enabling their identification within these complex extracts. Additionally, we compared the use of direct infusion (DI) and liquid chromatography (LC) separation with IM-MS. For most samples, DI outperformed LC by minimizing the time required for data collection and simplifying analysis. However, for some samples, LC improved the detection of metallophores likely by reducing ion suppression. IM-MS was then used to identify 10 metallophores in an extract from a marine Micromonospora sp. Overall, incorporating IM-MS facilitated the rapid detection of metal-binding natural products in complex bacterial extracts through the comparison of mass and mobility data in the presence and absence of metals.
Affiliation(s)
- Bailey A Bell
  - Pharmaceutical Sciences Division, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Josephine M Anderson
  - Pharmaceutical Sciences Division, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Scott R Rajski
  - Pharmaceutical Sciences Division, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Tim S Bugni
  - Pharmaceutical Sciences Division, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
  - Small Molecule Screening Facility, UW Carbone Cancer Center, Madison, Wisconsin 53792, United States
  - Lachman Institute for Pharmaceutical Development, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
4
Elapavalore A, Ross DH, Grouès V, Aurich D, Krinsky AM, Kim S, Thiessen PA, Zhang J, Dodds JN, Baker ES, Bolton EE, Xu L, Schymanski EL. PubChemLite Plus Collision Cross Section (CCS) Values for Enhanced Interpretation of Nontarget Environmental Data. ENVIRONMENTAL SCIENCE & TECHNOLOGY LETTERS 2025; 12:166-174. [PMID: 39957787 PMCID: PMC11823450 DOI: 10.1021/acs.estlett.4c01003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Revised: 12/31/2024] [Accepted: 01/02/2025] [Indexed: 02/18/2025]
Abstract
Finding relevant chemicals in the vast (known) chemical space is a major challenge for environmental and exposomics studies leveraging nontarget high resolution mass spectrometry (NT-HRMS) methods. Chemical databases now contain hundreds of millions of chemicals, yet many are not relevant. This article details an extensive collaborative, open science effort to provide a dynamic collection of chemicals for environmental, metabolomics, and exposomics research, along with supporting information about their relevance to assist researchers in the interpretation of candidate hits. The PubChemLite for Exposomics collection is compiled from ten annotation categories within PubChem, enhanced with patent, literature and annotation counts, predicted partition coefficient (logP) values, as well as predicted collision cross section (CCS) values using CCSbase. Monthly versions are archived on Zenodo under a CC-BY license, supporting reproducible research, and a new interface has been developed, including historical trends of patent and literature data, for researchers to browse the collection. This article details how PubChemLite can support researchers in environmental and exposomics studies, describes efforts to increase the availability of experimental CCS values, and explores known limitations and potential for future developments. The data and code behind these efforts are openly available. PubChemLite can be browsed at https://pubchemlite.lcsb.uni.lu.
Affiliation(s)
- Anjana Elapavalore
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Dylan H. Ross
  - Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States
  - Current Address: Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Valentin Grouès
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Dagny Aurich
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
- Allison M. Krinsky
  - Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States
- Sunghwan Kim
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States
- Paul A. Thiessen
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States
- Jian Zhang
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States
- James N. Dodds
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, United States
- Erin S. Baker
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, United States
- Evan E. Bolton
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, United States
- Libin Xu
  - Department of Medicinal Chemistry, University of Washington, Seattle, Washington 98195, United States
- Emma L. Schymanski
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
5
Engler Hart C, Preto AJ, Chanana S, Healey D, Kind T, Domingo-Fernández D. Evaluating the generalizability of graph neural networks for predicting collision cross section. J Cheminform 2024; 16:105. [PMID: 39210378 PMCID: PMC11363525 DOI: 10.1186/s13321-024-00899-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 08/19/2024] [Indexed: 09/04/2024] Open
Abstract
Ion Mobility coupled with Mass Spectrometry (IM-MS) is a promising analytical technique that enhances molecular characterization by measuring collision cross-section (CCS) values, which are indicative of the molecular size and shape. However, the effective application of CCS values in structural analysis is still constrained by the limited availability of experimental data, necessitating the development of accurate machine learning (ML) models for in silico predictions. In this study, we evaluated state-of-the-art Graph Neural Networks (GNNs), trained to predict CCS values using the largest publicly available dataset to date. Although our results confirm the high accuracy of these models within chemical spaces similar to their training environments, their performance significantly declines when applied to structurally novel regions. This discrepancy raises concerns about the reliability of in silico CCS predictions and underscores the need for releasing further publicly available CCS datasets. To mitigate this, we introduce Mol2CCS, which demonstrates how generalization can be partially improved by extending models to account for additional features such as molecular fingerprints, descriptors, and the molecule types. Lastly, we also show how confidence models can help by enhancing the reliability of the CCS estimates.
Scientific contribution: We have benchmarked state-of-the-art graph neural networks for predicting collision cross section. Our work highlights the accuracy of these models when trained and evaluated in similar chemical spaces, but also how their accuracy drops when evaluated in structurally novel regions. Lastly, we conclude by presenting potential approaches to mitigate this issue.
Affiliation(s)
- Chloe Engler Hart
  - Enveda Biosciences, Inc., 5700 Flatiron Pkwy, Boulder, CO, 80301, USA
- Shaurya Chanana
  - Enveda Biosciences, Inc., 5700 Flatiron Pkwy, Boulder, CO, 80301, USA
- David Healey
  - Enveda Biosciences, Inc., 5700 Flatiron Pkwy, Boulder, CO, 80301, USA
- Tobias Kind
  - Enveda Biosciences, Inc., 5700 Flatiron Pkwy, Boulder, CO, 80301, USA
6
Wang C, Yuan C, Wang Y, Shi Y, Zhang T, Patti GJ. Predicting Collision Cross-Section Values for Small Molecules through Chemical Class-Based Multimodal Graph Attention Network. J Chem Inf Model 2024; 64:6305-6315. [PMID: 38959055 DOI: 10.1021/acs.jcim.3c01934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
Libraries of collision cross-section (CCS) values have the potential to facilitate compound identification in metabolomics. Although computational methods provide an opportunity to increase library size rapidly, accurate prediction of CCS values remains challenging due to the structural diversity of small molecules. Here, we developed a machine learning (ML) model that integrates graph attention networks and multimodal molecular representations to predict CCS values on the basis of chemical class. Our approach, referred to as MGAT-CCS, had superior performance in comparison to other ML models in CCS prediction. MGAT-CCS achieved a median relative error of 0.47%/1.14% (positive/negative mode) and 1.40%/1.63% (positive/negative mode) for lipids and metabolites, respectively. When MGAT-CCS was applied to real-world metabolomics data, it reduced the number of false metabolite candidates by roughly 25% across multiple sample types ranging from plasma and urine to cells. To facilitate its application, we developed a user-friendly stand-alone web server for MGAT-CCS that is freely available at https://mgat-ccs-web.onrender.com. This work represents a step forward in predicting CCS values and can potentially facilitate the identification of small molecules when using ion mobility spectrometry coupled with mass spectrometry.
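The headline metric in this abstract, median relative error, can be sketched as follows. This is a minimal illustration of the usual definition (absolute deviation divided by the measured value, in percent, then the median), assuming that is the definition the authors used. The function name and the numbers are hypothetical.

```python
from statistics import median


def median_relative_error(predicted, measured):
    """Median of |predicted - measured| / measured, expressed in percent."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]
    return median(errors)


# Hypothetical CCS values (square angstroms), not data from the paper.
measured = [100.0, 200.0, 250.0]
predicted = [101.0, 198.0, 255.0]
mre = median_relative_error(predicted, measured)  # errors: 1%, 1%, 2%
```

The median is less sensitive than the mean to a few badly predicted molecules, which is one plausible reason it is commonly reported for CCS predictors.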
Affiliation(s)
- Cheng Wang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250012, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250000, China
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Chuang Yuan
  - School of Life Sciences, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing 100871, China
  - Department of Biochemistry and Biophysics, School of Basic Medical Sciences, Peking University, Beijing 100191, China
- Yahui Wang
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Yuying Shi
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250012, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250000, China
- Tao Zhang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250012, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250000, China
- Gary J Patti
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Department of Medicine, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Siteman Cancer Center, Washington University in St. Louis, St. Louis, Missouri 63130, United States
  - Center for Metabolomics and Isotope Tracing, Washington University in St. Louis, St. Louis, Missouri 63130, United States
7
Mubas-Sirah F, Gandhi VD, Latif M, Hua L, Tootchi A, Larriba-Andaluz C. Ion mobility calculations of flexible all-atom systems at arbitrary fields using two-temperature theory. Phys Chem Chem Phys 2024; 26:4118-4124. [PMID: 38226667 DOI: 10.1039/d3cp05415b] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/17/2024]
Abstract
Ion mobility spectrometry (IMS) separates and analyzes ions based on their mobility in a gas under an electric field. When the field is increased, the mobility varies in a complex way that depends on the relative velocity between gas and ion, their electrostatic potential interactions, and the effects from direct impingement. Recently, the two-temperature theory, primarily developed for monoatomic ions in monoatomic gases, has been extended to study mobilities at arbitrary fields using polyatomic ions in polyatomic gases, with some success. However, this extension poses challenges, such as inelastic collisions between gas and ion and structural modifications of ions as they heat up. These challenges become significant when working with diatomic gases and flexible molecules. In a previous study, experimental mobilities of tetraalkylammonium salts were obtained using a FAIMS instrument, showing satisfactory agreement with numerical two-temperature theory predictions. However, deviations occurred at fields greater than 100 Td. To address this issue, this paper introduces a modified high-field calculation method that accounts for the structural changes in ions due to field heating. The study focuses on tetraheptylammonium (THA+), tetradecylammonium (TDA+), and tetradodecylammonium (TDDA+) salts. Molecular structures were generated at various temperatures using the MM2 force field. The mobility was calculated using IMoS 1.13 with two-temperature trajectory method calculations up to the fourth approximation. Multiple effective temperatures were considered, and a linear weighting system was used to create mobility vs. reduced field strength plots. The results suggest that the structural enlargement due to ion heating plays a significant role in mobility at high fields, aligning better with experimental data. FAIMS dispersion plots also show improved agreement with experimental results. However, the contribution of inelastic collisions and energy transfer to rotational degrees of freedom in gas molecules remains a complex and challenging aspect.
Affiliation(s)
- Farah Mubas-Sirah
  - Department of Mechanical Engineering, Indiana University - Purdue University Indianapolis, Indianapolis, IN, USA
- Viraj D Gandhi
  - Department of Mechanical Engineering, Indiana University - Purdue University Indianapolis, Indianapolis, IN, USA
  - Department of Mechanical Engineering, Purdue University, West Lafayette, IN, USA
- Mohsen Latif
  - Department of Mechanical Engineering, Indiana University - Purdue University Indianapolis, Indianapolis, IN, USA
- Leyan Hua
  - Department of Mechanical Engineering, Indiana University - Purdue University Indianapolis, Indianapolis, IN, USA
- Amirreza Tootchi
  - Department of Mechanical Engineering, Indiana University - Purdue University Indianapolis, Indianapolis, IN, USA
- Carlos Larriba-Andaluz
  - Department of Mechanical Engineering, Indiana University - Purdue University Indianapolis, Indianapolis, IN, USA
8
Wang Y, Wei W, Du W, Cai J, Liao Y, Lu H, Kong B, Zhang Z. Deep-Learning-Based Mixture Identification for Nuclear Magnetic Resonance Spectroscopy Applied to Plant Flavors. Molecules 2023; 28:7380. [PMID: 37959799 PMCID: PMC10648966 DOI: 10.3390/molecules28217380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 10/25/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open
Abstract
Nuclear magnetic resonance (NMR) is a crucial technique for analyzing mixtures consisting of small molecules, providing non-destructive, fast, reproducible, and unbiased benefits. However, it is challenging to perform mixture identification because of the offset of chemical shifts and peak overlaps that often exist in mixtures such as plant flavors. Here, we propose a deep-learning-based mixture identification method (DeepMID) that can be used to identify plant flavors (mixtures) in a formulated flavor (mixture consisting of several plant flavors) without the need to know the specific components in the plant flavors. A pseudo-Siamese convolutional neural network (pSCNN) and a spatial pyramid pooling (SPP) layer were used to solve the problems due to their high accuracy and robustness. The DeepMID model is trained, validated, and tested on an augmented data set containing 50,000 pairs of formulated and plant flavors. We demonstrate that DeepMID can achieve excellent prediction results in the augmented test set: ACC = 99.58%, TPR = 99.48%, FPR = 0.32%; and two experimentally obtained data sets: one shows ACC = 97.60%, TPR = 92.81%, FPR = 0.78% and the other shows ACC = 92.31%, TPR = 80.00%, FPR = 0.00%. In conclusion, DeepMID is a reliable method for identifying plant flavors in formulated flavors based on NMR spectroscopy, which can assist researchers in accelerating the design of flavor formulations.
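The ACC, TPR, and FPR figures quoted above follow the standard confusion-matrix definitions. The sketch below shows how they are computed from raw counts. The function name and the counts are hypothetical and do not reproduce the paper's data.

```python
def classification_rates(tp, fp, tn, fn):
    """Return (accuracy, true positive rate, false positive rate)
    from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (fp + tn) == 0:
        raise ValueError("need at least one positive and one negative sample")
    acc = (tp + tn) / total   # fraction of all pairs classified correctly
    tpr = tp / (tp + fn)      # fraction of true positives that were detected
    fpr = fp / (fp + tn)      # fraction of true negatives wrongly flagged
    return acc, tpr, fpr


# Hypothetical counts: 100 positive pairs and 100 negative pairs.
acc, tpr, fpr = classification_rates(tp=80, fp=5, tn=95, fn=20)
```

Reporting TPR and FPR alongside accuracy matters when negatives far outnumber positives, because accuracy alone can stay high even when many true components are missed.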
Affiliation(s)
- Yufei Wang
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Weiwei Wei
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China
- Wen Du
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China
- Jiaxiao Cai
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China
- Yuxuan Liao
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Hongmei Lu
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China
- Bo Kong
  - Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China
- Zhimin Zhang
  - College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China