1
Charest N, Sinclair G, Eytcheson SA, Chang DT, Martin TM, Lowe CN, Paul Friedman K, Williams AJ. Combined In Vitro and In Silico Workflow to Deliver Robust, Transparent, and Contextually Rigorous Models of Bioactivity. J Chem Inf Model 2025; 65:4426-4441. [PMID: 40273369 DOI: 10.1021/acs.jcim.5c00713] [Indexed: 04/26/2025]
Abstract
New approach methodologies (NAMs) are an increasing priority in the field of toxicology to fill data gaps and reduce time and resources in chemical safety assessment. We describe an NAMs workflow that integrates an in vitro high-throughput bioassay with an in silico computational model. In defining this workflow, we propose, as a crucial step of in silico development, the identification of explicit "purpose contexts": a priori definitions of the scope and intent of an in silico solution, which provide natural targets for the mechanistic interpretation, validation, and output design of the model. By inspecting data from an in vitro assay measuring the displacement of fluorescent probe 8-anilino-1-naphthalenesulfonic acid (ANSA) from the serum transport protein transthyretin (TTR) as a proxy for potential disruption of thyroxine (T4) binding, in collaboration with the experimenters, we developed three relevant purpose contexts for this in silico modeling effort: (1) examination and confirmation of the in vitro assay principle via orthogonal information, (2) immediate integration with the in vitro experimental cycle to reduce costs and enhance hit rates, and (3) ultimate replacement of the use of single-concentration screening as a prioritization strategy for bioactivity testing of bulk chemical libraries. From these purpose contexts, we derived the foundations of a robust and transparent quantitative structure-activity relationship (QSAR) model that is constructively fit for purpose, characterized by first-principles mechanistic analysis, strict data quality evaluation, contextually rigorous performance testing and, finally, delivery of a quantitative recommendation schedule to simultaneously improve in vitro hit rates and in silico model learning potential.
Affiliation(s)
- Nathaniel Charest
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
- Gabriel Sinclair
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
- Stephanie A Eytcheson
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
- Daniel T Chang
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
- Todd M Martin
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
- Charles N Lowe
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
- Katie Paul Friedman
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
- Antony J Williams
  - Office of Research and Development, Center for Computational Toxicology and Exposure, United States Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle Park, North Carolina 27711, United States
2
Turkina V, Gringhuis JT, Boot S, Petrignani A, Corthals G, Praetorius A, O’Brien JW, Samanipour S. Prioritization of Unknown LC-HRMS Features Based on Predicted Toxicity Categories. Environmental Science & Technology 2025; 59:8004-8015. [PMID: 40254881 PMCID: PMC12044687 DOI: 10.1021/acs.est.4c13026] [Received: 11/24/2024] [Revised: 03/14/2025] [Accepted: 04/03/2025] [Indexed: 04/22/2025]
Abstract
Complex environmental samples contain a diverse array of known and unknown constituents. While liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) nontargeted analysis (NTA) has emerged as an essential tool for the comprehensive study of such samples, the identification of individual constituents remains a significant challenge, primarily due to the vast number of detected features in each sample. To address this, prioritization strategies are frequently employed to narrow the focus to the most relevant features for further analysis. In this study, we developed a novel prioritization strategy that directly links fragmentation and chromatographic data to aquatic toxicity categories, bypassing the need for identification of individual compounds. Given that features are not always well-characterized through fragmentation, we created two models: (1) a Random Forest Classification (RFC) model, which classifies fish toxicity categories based on MS1, retention, and fragmentation data, expressed as cumulative neutral losses (CNLs), when fragmentation information is available, and (2) a Kernel Density Estimation (KDE) model that relies solely on retention time and MS1 data when fragmentation is absent. Both models demonstrated accuracy comparable to that of structure-based prediction methods. We further tested the models on a pesticide mixture in a tea extract measured by LC-HRMS, where the CNL-based RFC model achieved 0.76 accuracy and the KDE model reached 0.61, showcasing their robust performance in real-world applications.
Affiliation(s)
- Viktoriia Turkina
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands
- Jelle T. Gringhuis
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands
- Sanne Boot
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands
- Annemieke Petrignani
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands
- Garry Corthals
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands
- Antonia Praetorius
  - Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam, 1090 GE, Amsterdam, Netherlands
- Jake W. O’Brien
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, Queensland 4102, Australia
- Saer Samanipour
  - Van ’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam 1090 GD, Netherlands
  - Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, 20 Cornwall Street, Woolloongabba, Queensland 4102, Australia
  - UvA Data Science Center, University of Amsterdam, Amsterdam 1000 GG, Netherlands
3
Teri D, Aly NA, Dodds JN, Zhang J, Thiessen PA, Bolton EE, Joseph KM, Williams AJ, Schymanski EL, Rusyn I, Baker ES. Reference Library for Suspect Non-targeted Screening of Environmental Toxicants Using Ion Mobility Spectrometry-Mass Spectrometry. bioRxiv: The Preprint Server for Biology 2025:2025.02.22.639656. [PMID: 40060593 PMCID: PMC11888245 DOI: 10.1101/2025.02.22.639656] [Indexed: 03/20/2025]
Abstract
As our health is affected by the xenobiotic chemicals we are exposed to, it is important to rapidly assess these molecules both in the environment and our bodies. Targeted analytical methods coupling either gas or liquid chromatography with mass spectrometry (GC-MS or LC-MS) are commonly utilized in current exposure assessments. While these methods are accepted as the gold standard for exposure analyses, they often require multiple sample preparation steps and more than 30 minutes per sample. This throughput limitation is a critical gap for exposure assessments and has resulted in an evolving interest in using ion mobility spectrometry and MS (IMS-MS) for non-targeted studies. IMS-MS is a unique technique due to its rapid analytical capabilities (millisecond scanning) and detection of a wide range of chemicals based on unique collision cross section (CCS) and mass-to-charge (m/z) values. To increase the availability of IMS-MS information for exposure studies, here we utilized drift tube IMS-MS to evaluate 4,685 xenobiotic chemical standards from the Environmental Protection Agency Toxicity Forecaster (ToxCast) program including pesticides, industrial chemicals, pharmaceuticals, consumer products, and per- and polyfluoroalkyl substances (PFAS). In the analyses, 3,993 [M+H]+, [M+Na]+, [M-H]- and [M+]+ ion types were observed with high confidence and reproducibility (≤1% error intra-laboratory and ≤2% inter-laboratory) from 2,140 unique chemicals. These values were then assembled into an openly available multidimensional database and uploaded to PubChem to enable rapid IMS-MS suspect screening for a wide range of environmental contaminants, faster response time in environmental exposure assessments, and assessments of xenobiotic-disease connections.
Affiliation(s)
- Devin Teri
  - Department of Veterinary Physiology and Pharmacology, Texas A&M University, College Station, Texas 77843, USA
- Noor A Aly
  - Department of Veterinary Physiology and Pharmacology, Texas A&M University, College Station, Texas 77843, USA
- James N Dodds
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Jian Zhang
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Paul A Thiessen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Evan E Bolton
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Kara M Joseph
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, USA
- Antony J Williams
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
- Emma L Schymanski
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- Ivan Rusyn
  - Department of Veterinary Physiology and Pharmacology, Texas A&M University, College Station, Texas 77843, USA
- Erin S Baker
  - Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599, USA
4
Newton SR, Bowden JA, Charest N, Jackson SR, Koelmel JP, Liberatore HK, Lin AM, Lowe CN, Nieto S, Godri Pollitt KJ, Robuck AR, Rostkowski P, Townsend TG, Wallace MAG, Williams AJ. Filling the Gaps in PFAS Detection: Integrating GC-MS Non-Targeted Analysis for Comprehensive Environmental Monitoring and Exposure Assessment. Environmental Science & Technology Letters 2025; 12:1-9. [PMID: 40206203 PMCID: PMC11977685 DOI: 10.1021/acs.estlett.4c00930] [Indexed: 04/11/2025]
Abstract
Per- and polyfluoroalkyl substances (PFAS) have garnered increasing attention in recent years, and non-targeted analysis (NTA) has become essential for elucidating novel PFAS structures. NTA and PFAS research have been dominated by liquid chromatography-mass spectrometry (LC-MS), with gas chromatography-mass spectrometry (GC-MS) used less often, as evidenced by bibliometrics. However, the performance of GC-MS in NTA studies (GC-NTA) rivals that of LC-ESI-MS, and GC-MS is shown to cover a complementary chemical space. An LC-ESI-MS amenability model applied to a list of approximately 12,000 PFAS revealed that less than 10% of known PFAS chemistry is predicted to be amenable to typical LC-MS analysis. Therefore, there is strong potential for applying GC-MS methods to more fully assess the PFAS environmental contamination landscape, uniquely shedding light on both known and novel PFAS, especially within the chemical space realm of volatile and semi-volatile PFAS. Waste streams from fluorochemical manufacturing facilities have been heavily studied using LC-MS and targeted GC-MS; however, GC-NTA is needed to discover novel PFAS that are not amenable to LC-MS emitted from facilities. Studies on the incineration of PFAS-containing materials, such as aqueous film forming foam, have focused on the destruction of parent compounds, and little is known about the transformation products formed during such processes. GC-NTA holds the potential to elucidate transformation products formed when PFAS are incinerated. Wastewater treatment plants and landfills are known sources of PFAS to the environment, yet GC-NTA is needed to understand air emissions of PFAS and PFAS transformation products from these sources. Consumer products are known to lead to indoor exposures to PFAS via emissions to air and dust, but research in this area has either used LC-MS or targeted GC-MS. Despite the challenges with advancing GC-NTA, we call on NTA researchers, grantors, managers, and other stakeholders to recognize the potential and necessity of GC-NTA in PFAS research so that we may face these challenges together.
Affiliation(s)
- Seth R. Newton
  - Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, North Carolina, 27709, USA
- John A. Bowden
  - Department of Environmental Engineering Sciences, University of Florida, Gainesville, Florida, 32608, USA
- Nathaniel Charest
  - Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, North Carolina, 27709, USA
- Stephen R. Jackson
  - Center for Environmental Measurement and Modeling, US Environmental Protection Agency, Research Triangle Park, North Carolina, 27709, USA
- Jeremy P. Koelmel
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut, 06510, USA
- Hannah K. Liberatore
  - Center for Environmental Measurement and Modeling, US Environmental Protection Agency, Research Triangle Park, North Carolina, 27709, USA
- Ashley M. Lin
  - Department of Environmental Engineering Sciences, University of Florida, Gainesville, Florida, 32608, USA
- Charles N. Lowe
  - Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, North Carolina, 27709, USA
- Sofia Nieto
  - Agilent Technologies, Inc, Santa Clara, California, 95051, USA
- Krystal J. Godri Pollitt
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut, 06510, USA
- Anna R. Robuck
  - Center for Environmental Measurement and Modeling, US Environmental Protection Agency, Narragansett, Rhode Island, 02882, USA
- Timothy G. Townsend
  - Department of Environmental Engineering Sciences, University of Florida, Gainesville, Florida, 32608, USA
- M. Ariel Geer Wallace
  - Center for Environmental Measurement and Modeling, US Environmental Protection Agency, Research Triangle Park, North Carolina, 27709, USA
- Antony John Williams
  - Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, North Carolina, 27709, USA
5
Richard AM, Tao D, LeClair CA, Leister W, Tretyakov KV, White E, Lewis KC, Sefler A, Shinn P, Collins BJ, Nguyen DT, Ye L, Zhao T, Xu T, Williams AJ, Waidyanatha S, Thomas RS, Tice R, Simeonov A, Huang R. Analytical Quality Evaluation of the Tox21 Compound Library. Chem Res Toxicol 2025; 38:15-41. [PMID: 39829241 PMCID: PMC11752516 DOI: 10.1021/acs.chemrestox.4c00330] [Received: 08/19/2024] [Revised: 12/09/2024] [Accepted: 12/12/2024] [Indexed: 01/22/2025]
Abstract
The analytical quality of compounds subjected to high-throughput screening (HTS) impacts accurate interpretation of assay results, with poor quality samples potentially leading to false negatives or positives. The Tox21 "10K" library consists of over 8900 unique compounds, spanning a diverse landscape of environmental and pharmaceutical chemicals, posing opportunities and challenges for analytical quality control (QC) determinations. Tox21 sample plates stored in DMSO at ambient conditions for 0 (T0) and/or 4 months (T4), totaling more than 13K unique sample identifiers (Tox21 IDs), were subjected to various analyses, including liquid and gas chromatography mass spectrometry (LC-MS, GC-MS) and nuclear magnetic resonance (NMR). Results for each sample at T0 or T4 underwent expert review and, where possible, a QC grade conveying purity, identity, and concentration was assigned. Herein, we relate details of the methods applied and report on the original (v0) Tox21 ID level results. Thirteen QC grades were condensed to 5 quality scores to aid global analysis, resulting in reinterpretation and improvement of >700 sample grades. Of the 92% T0 samples successfully graded, 76% exceeded 90% purity. For 76% of samples that were also tested at T4, 89% showed no evidence of sample loss or degradation. Prioritized quality bins were used to summarize thousands of replicate sample-level QC results to a compound-level QC score to support structure-based analyses. ToxPrint chemotype analysis identified structural features enriched in unstable compounds, as well as in high and low quality T0 subsets. Predicted vapor pressure was weakly correlated with low-concentration QC indicators, reflecting likely entanglement with method amenability and quality issues. Finally, an ongoing EPA effort to re-evaluate the original QC spectra is generating insights that will further modify QC grades. 
Tox21 QC spectra and results will be made available in a new public QC browser, facilitating further evaluation to support HTS interpretation and modeling applications.
Affiliation(s)
- Ann M. Richard
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (EPA), Research Triangle Park, North Carolina 27711, United States
- Dingyin Tao
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Christopher A. LeClair
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- William Leister
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Kirill V. Tretyakov
  - Biomolecular Measurement Division, National Institute of Standards and Technology (NIST), Gaithersburg, Maryland 20899, United States
- Edward White
  - Biomolecular Measurement Division, National Institute of Standards and Technology (NIST), Gaithersburg, Maryland 20899, United States
- Ken C. Lewis
  - OpAns, Durham, North Carolina 27713, United States
- Paul Shinn
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Bradley J. Collins
  - Division of Translational Toxicology (DTT), National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
- Dac-Trung Nguyen
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Lin Ye
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Tongan Zhao
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Tuan Xu
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Antony J. Williams
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (EPA), Research Triangle Park, North Carolina 27711, United States
- Suramya Waidyanatha
  - Division of Translational Toxicology (DTT), National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
- Russell S. Thomas
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (EPA), Research Triangle Park, North Carolina 27711, United States
- Raymond Tice
  - Division of Translational Toxicology (DTT), National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
- Anton Simeonov
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Ruili Huang
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
6
Patlewicz G, Williams AJ, Adams M, Shah I, Paul-Friedman K. A Cheminformatics Workflow to Select Representative TSCA Chemicals for New Approach Methodology (NAM) Screening. Chem Res Toxicol 2025; 38:129-144. [PMID: 39655894 DOI: 10.1021/acs.chemrestox.4c00367] [Indexed: 01/21/2025]
Abstract
The Toxic Substances Control Act (TSCA) requires the US EPA to evaluate the hazard and exposure of new and existing chemicals. New chemical notifications are typically data-poor and EPA has historically relied upon approaches including chemical categories to fill data gaps. As part of a multi-year Research Program, opportunities are being explored to leverage New Approach Methods (NAMs) in hazard and exposure assessments. Data from a battery of in vitro NAMs will be generated to form a case study for an adaptable approach to inform new chemical assessments. Herein, a cheminformatics workflow was developed to identify a set of ∼300 representative candidate chemicals for in vitro screening from the TSCA non-confidential active inventory. The freely available web application ClassyFire was used to categorize all discrete organic structures from the TSCA inventory into one of 68 primary structural categories. Large primary categories were subcategorized into smaller categories using hierarchical agglomerative clustering, ultimately yielding 180 structural terminal categories. The inventory was filtered to substances that lacked previous ToxCast bioactivity screening, were associated with physicochemical property predictions indicating non-volatile solids or liquids, and had a higher chance of procurement. Amenability predictions for liquid chromatography-mass spectrometry were also generated to provide an indication of which chemicals lent themselves to aqueous-based screening and analytical verification in solvated samples. Structures associated with transformation in solvent, potentially explosive or highly reactive, were excluded. Potential candidate substances were selected on the basis of being structurally representative of the terminal category and meeting other screenability conditions. 
A final set of 318 candidate chemicals was proposed to undergo analytical quality control and screening in a range of broad and targeted biological technologies for human health-relevant end points. Finally, in silico tools were applied to explore predicted hazard profiles of these candidate substances relative to the full inventory.
Affiliation(s)
- Grace Patlewicz
  - Center for Computational Toxicology & Exposure (CCTE), U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina 27709, United States
- Antony J Williams
  - Center for Computational Toxicology & Exposure (CCTE), U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina 27709, United States
- Matthew Adams
  - Center for Computational Toxicology & Exposure (CCTE), U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina 27709, United States
  - Oak Ridge Associated Universities (ORAU), Oak Ridge, Tennessee 37830, United States
- Imran Shah
  - Center for Computational Toxicology & Exposure (CCTE), U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina 27709, United States
- Katie Paul-Friedman
  - Center for Computational Toxicology & Exposure (CCTE), U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina 27709, United States
7
Liu S, Dukes DA, Koelmel JP, Stelben P, Finch J, Okeme J, Lowe C, Williams A, Godri D, Rennie EE, Parry E, McDonough CA, Pollitt KJG. Expanding PFAS Identification with Transformation Product Libraries: Nontargeted Analysis Reveals Biotransformation Products in Mice. Environmental Science & Technology 2025; 59:119-131. [PMID: 39704186 PMCID: PMC12097807 DOI: 10.1021/acs.est.4c07750] [Indexed: 12/21/2024]
Abstract
Per- and polyfluoroalkyl substances (PFAS) are widely used persistent synthetic chemicals that have been linked to adverse health effects. While the behavior of PFAS has been evaluated in the environment, our understanding of reaction products in mammalian systems is limited. This study identified biological PFAS transformation products and generated mass spectral libraries to facilitate an automated search and identification. The biological transformation products of 27 PFAS, spanning 5 chemical subclasses (alcohols, sulfonamides, carboxylic acids, ethers, and esters), were evaluated following enzymatic reaction with mouse liver S9 fractions. Four major pathways were identified by liquid chromatography-high-resolution mass spectrometry: glucuronidation, sulfation, dealkylation, and oxidation. Class-based fragmentation rules and associated PFAS transformation product libraries were generated and integrated into an automated nontargeted PFAS data analysis software (FluoroMatch). Fragmentation was additionally predicted for the potential transformation products of more than 2,500 PFAS in the EPA CompTox Chemicals Dashboard PFASSTRUCTv4. Generated mass spectral libraries were validated by applying FluoroMatch to a data set of urine from aqueous film-forming foam (AFFF)-dosed mice. Toxicity predictions showed identified PFAS transformation products to be potential developmental and mutagenic toxicants. This research enables more comprehensive PFAS characterization in biological systems, which will improve the assessment of exposures and evaluation of the associated health impacts.
Affiliation(s)
- Sheng Liu
  - Department of Environmental Health Science, Yale School of Public Health, New Haven, Connecticut 06511, United States
- David A. Dukes
  - Department of Civil Engineering, Stony Brook University, Stony Brook, New York 11794, United States
- Jeremy P. Koelmel
  - Department of Environmental Health Science, Yale School of Public Health, New Haven, Connecticut 06511, United States
- Paul Stelben
  - Department of Environmental Health Science, Yale School of Public Health, New Haven, Connecticut 06511, United States
- Jasen Finch
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, Ceredigion SY23 3EB, U.K.
- Joseph Okeme
  - Department of Environmental Health Science, Yale School of Public Health, New Haven, Connecticut 06511, United States
- Charles Lowe
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Antony Williams
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- David Godri
  - Third Floor Solutions, Toronto, ON M5V 3L9, Canada
- Emma E. Rennie
  - Agilent Technologies, Santa Clara, California 95051, United States
- Emily Parry
  - Agilent Technologies, Santa Clara, California 95051, United States
- Carrie A. McDonough
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Krystal J Godri Pollitt
  - Department of Environmental Health Science, Yale School of Public Health, New Haven, Connecticut 06511, United States
8
Hupatz H, Rahu I, Wang WC, Peets P, Palm EH, Kruve A. Critical review on in silico methods for structural annotation of chemicals detected with LC/HRMS non-targeted screening. Anal Bioanal Chem 2025; 417:473-493. [PMID: 39138659 DOI: 10.1007/s00216-024-05471-x] [Received: 04/30/2024] [Revised: 07/22/2024] [Accepted: 07/24/2024] [Indexed: 08/15/2024]
Abstract
Non-targeted screening with liquid chromatography coupled to high-resolution mass spectrometry (LC/HRMS) is increasingly leveraging in silico methods, including machine learning, to obtain candidate structures for structural annotation of LC/HRMS features and their further prioritization. Candidate structures are commonly retrieved based on the tandem mass spectral information either from spectral or structural databases; however, the vast majority of the detected LC/HRMS features remain unannotated, constituting what we refer to as a part of the unknown chemical space. Recently, the exploration of this chemical space has become accessible through generative models. Furthermore, the evaluation of the candidate structures benefits from the complementary empirical analytical information such as retention time, collision cross section values, and ionization type. In this critical review, we provide an overview of the current approaches for retrieving and prioritizing candidate structures. These approaches come with their own set of advantages and limitations, as we showcase in the example of structural annotation of ten known and ten unknown LC/HRMS features. We emphasize that these limitations stem from both experimental and computational considerations. Finally, we highlight three key considerations for the future development of in silico methods.
Affiliation(s)
- Henrik Hupatz
- Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
- Stockholm University Center for Circular and Sustainable Systems (SUCCeSS), Stockholm University, 106 91, Stockholm, Sweden
- Ida Rahu
- Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
- Wei-Chieh Wang
- Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
- Pilleriin Peets
- Institute of Biodiversity, Faculty of Biological Science, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University Jena, 07743, Jena, Germany
- Emma H Palm
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367, Belvaux, Luxembourg
- Anneli Kruve
- Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 114 18, Stockholm, Sweden
- Stockholm University Center for Circular and Sustainable Systems (SUCCeSS), Stockholm University, 106 91, Stockholm, Sweden
- Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 114 18, Stockholm, Sweden
9
Batt AL, Brunelle LD, Quinete NS, Stebel EK, Ng B, Gardinali P, Chao A, Huba AK, Glassmeyer ST, Alvarez DA, Kolpin DW, Furlong ET, Mills MA. Investigating the chemical space coverage of multiple chromatographic and ionization methods using non-targeted analysis on surface and drinking water collected using passive sampling. Sci Total Environ 2024; 955:176922. [PMID: 39426538 DOI: 10.1016/j.scitotenv.2024.176922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Revised: 10/10/2024] [Accepted: 10/12/2024] [Indexed: 10/21/2024]
Abstract
Multiple non-targeted analysis tools were used to look for a broad range of possible chemical contaminants present in surface and drinking water using liquid chromatography separation and high-resolution mass spectrometry detection, including both quadrupole time of flight (Q-ToF) and Orbitrap instruments. Two chromatographic techniques were evaluated on an LC-Q-ToF with electrospray ionization in both positive and negative modes: (1) the traditionally used reverse phase C18 and (2) the hydrophilic interaction liquid chromatography (HILIC) aimed to capture more polar contaminants that may be present in water. Multiple ionization modes were evaluated with an LC-Orbitrap, including electrospray (ESI) and atmospheric pressure chemical ionization (APCI), also in both positive and negative modes. A suspect screening library of over 1300 possible environmental contaminants, including pesticides, pharmaceuticals, personal care products, illicit drugs/drugs of abuse, and various anthropogenic markers was made with experimentally collected data with the LC-Q-ToF with both column types, with 227 chemicals being retained by the HILIC column. The non-targeted methods using multiple chromatographic and ionization modes were applied to environmental water samples collected with polar organic chemical integrative samplers (POCIS), including surface water upstream and downstream from wastewater effluent discharge, and the downstream drinking water intake and treated drinking water for three distinct sampling events. For the LC-Q-ToF, 442 chemical features were detected on the C18 column and 91 with the HILIC column in the POCIS extracts, while 556 features were found on the Orbitrap workflow by ESI and 131 features detected by APCI. Over 100 chemicals were tentatively identified by suspect screening and database searching. 
The comprehensive and systematic evaluation of these methods serves as a step in characterizing the chemical space covered when utilizing different chromatography and ionization methods, or different instrument workflows, on complex environmental mixtures.
Affiliation(s)
- Angela L Batt
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Solutions and Emergency Response, Cincinnati, OH 45268, United States
- Laura D Brunelle
- Oak Ridge Institute for Science and Education (ORISE) Participant at the U.S. Environmental Protection Agency, 26 W. Martin Luther King Dr, Cincinnati, OH 45268, United States
- Natalia S Quinete
- Florida International University, Institute of Environment, Department of Chemistry & Biochemistry, North Miami, FL 33181, United States
- Eva K Stebel
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Solutions and Emergency Response, Cincinnati, OH 45268, United States
- Brian Ng
- Florida International University, Institute of Environment, Department of Chemistry & Biochemistry, North Miami, FL 33181, United States
- Piero Gardinali
- Florida International University, Institute of Environment, Department of Chemistry & Biochemistry, North Miami, FL 33181, United States
- Alex Chao
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC 27709, United States
- Anna K Huba
- Florida International University, Institute of Environment, Department of Chemistry & Biochemistry, North Miami, FL 33181, United States
- Susan T Glassmeyer
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Solutions and Emergency Response, Cincinnati, OH 45268, United States
- David A Alvarez
- U.S. Geological Survey, Columbia Environmental Research Center, Columbia, MO 65201, United States
- Dana W Kolpin
- U.S. Geological Survey, Central Midwest Water Science Center, Iowa City, IA 52240, United States
- Edward T Furlong
- U.S. Geological Survey, Strategic Laboratory Services Branch, Laboratory Analytical Services Division, Denver, CO 80225, United States
- Marc A Mills
- U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Solutions and Emergency Response, Cincinnati, OH 45268, United States
10
Banerjee A, Kar S, Roy K, Patlewicz G, Charest N, Benfenati E, Cronin MTD. Molecular similarity in chemical informatics and predictive toxicity modeling: from quantitative read-across (q-RA) to quantitative read-across structure-activity relationship (q-RASAR) with the application of machine learning. Crit Rev Toxicol 2024; 54:659-684. [PMID: 39225123 PMCID: PMC12010357 DOI: 10.1080/10408444.2024.2386260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 07/25/2024] [Accepted: 07/25/2024] [Indexed: 09/04/2024]
Abstract
This article aims to provide a comprehensive critical, yet readable, review of general interest to the chemistry community on molecular similarity as applied to chemical informatics and predictive modeling with a special focus on read-across (RA) and read-across structure-activity relationships (RASAR). Molecular similarity-based computational tools, such as quantitative structure-activity relationships (QSARs) and RA, are routinely used to fill the data gaps for a wide range of properties including toxicity endpoints for regulatory purposes. This review will explore the background of RA starting from how structural information has been used through to how other similarity contexts such as physicochemical, absorption, distribution, metabolism, and elimination (ADME) properties, and biological aspects are being characterized. More recent developments of RA's integration with QSAR have resulted in the emergence of novel models such as ToxRead, generalized read-across (GenRA), and quantitative RASAR (q-RASAR). Conventional QSAR techniques have been excluded from this review except where necessary for context.
Affiliation(s)
- Arkaprava Banerjee
- Department of Pharmaceutical Technology, Drug Theoretics and Cheminformatics (DTC) Laboratory, Jadavpur University, Kolkata, India
- Supratik Kar
- Department of Chemistry and Physics, Chemometrics & Molecular Modeling Laboratory, Kean University, Union, NJ, USA
- Kunal Roy
- Department of Pharmaceutical Technology, Drug Theoretics and Cheminformatics (DTC) Laboratory, Jadavpur University, Kolkata, India
- Grace Patlewicz
- Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, NC, USA
- Nathaniel Charest
- Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, NC, USA
- Emilio Benfenati
- Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Mark T. D. Cronin
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
11
Charest N, Lowe CN, Ramsland C, Meyer B, Samano V, Williams AJ. Improving predictions of compound amenability for liquid chromatography-mass spectrometry to enhance non-targeted analysis. Anal Bioanal Chem 2024; 416:2565-2579. [PMID: 38530399 PMCID: PMC11228616 DOI: 10.1007/s00216-024-05229-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 02/14/2024] [Accepted: 02/16/2024] [Indexed: 03/28/2024]
Abstract
Mass-spectrometry-based non-targeted analysis (NTA), in which mass spectrometric signals are assigned chemical identities based on a systematic collation of evidence, is a growing area of interest for toxicological risk assessment. Successful NTA results in better identification of potentially hazardous pollutants within the environment, facilitating the development of targeted analytical strategies to best characterize risks to human and ecological health. A supporting component of the NTA process involves assessing whether suspected chemicals are amenable to the mass spectrometric method, which is necessary in order to assign an observed signal to the chemical structure. Prior work from this group involved the development of a random forest model for predicting the amenability of 5517 unique chemical structures to liquid chromatography-mass spectrometry (LC-MS). This work improves the interpretability of the group's prior model of the same endpoint, as well as integrating 1348 more data points across negative and positive ionization modes. We enhance interpretability by feature engineering, a machine learning practice that reduces the input dimensionality while attempting to preserve performance statistics. We emphasize the importance of interpretable machine learning models within the context of building confidence in NTA identification. The novel data were curated by the labeling of compounds as amenable or unamenable by expert curators, resulting in an enhanced set of chemical compounds to expand the applicability domain of the prior model. The balanced accuracy benchmark of the newly developed model is comparable to performance previously reported (mean CV BA is 0.84 vs. 0.82 in positive mode, and 0.85 vs. 0.82 in negative mode), while on a novel external set, derived from this work's data, the Matthews correlation coefficients (MCC) for the novel models are 0.66 and 0.68 for positive and negative mode, respectively. 
Our group's prior published models scored MCC of 0.55 and 0.54 on the same external sets. This demonstrates appreciable improvement over the chemical space captured by the expanded dataset. This work forms part of our ongoing efforts to develop models with higher interpretability and higher performance to support NTA efforts.
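For readers less familiar with the two classification metrics quoted in this abstract, the following is a minimal sketch of how balanced accuracy (BA) and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The function names and the counts are illustrative assumptions and are not taken from the cited paper.

```python
import math


def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for an "amenable vs. unamenable" classifier (not real data).
tp, fp, tn, fn = 80, 10, 90, 20
ba = balanced_accuracy(tp, fp, tn, fn)
mcc = matthews_corrcoef(tp, fp, tn, fn)
print(f"BA = {ba:.2f}, MCC = {mcc:.2f}")
```

Unlike plain accuracy, both metrics remain informative when the two classes are imbalanced, which is a common reason for reporting them together.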
Affiliation(s)
- Nathaniel Charest
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
- Charles N Lowe
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
- Brian Meyer
- Senior Environmental Employment Program, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
- Vicente Samano
- Senior Environmental Employment Program, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
- Antony J Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
12
Szabo D, Falconer TM, Fisher CM, Heise T, Phillips AL, Vas G, Williams AJ, Kruve A. Online and Offline Prioritization of Chemicals of Interest in Suspect Screening and Non-targeted Screening with High-Resolution Mass Spectrometry. Anal Chem 2024; 96:3707-3716. [PMID: 38380899 PMCID: PMC10918621 DOI: 10.1021/acs.analchem.3c05705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/30/2024] [Accepted: 02/07/2024] [Indexed: 02/22/2024]
Abstract
Recent advances in high-resolution mass spectrometry (HRMS) have enabled the detection of thousands of chemicals from a single sample, while computational methods have improved the identification and quantification of these chemicals in the absence of reference standards typically required in targeted analysis. However, to determine the presence of chemicals of interest that may pose an overall impact on ecological and human health, prioritization strategies must be used to effectively and efficiently highlight chemicals for further investigation. Prioritization can be based on a chemical's physicochemical properties, structure, exposure, and toxicity, in addition to its regulatory status. This Perspective aims to provide a framework for the strategies used for chemical prioritization that can be implemented to facilitate high-quality research and communication of results. These strategies are categorized as either "online" or "offline" prioritization techniques. Online prioritization techniques trigger the isolation and fragmentation of ions from the low-energy mass spectra in real time, with user-defined parameters. Offline prioritization techniques, in contrast, highlight chemicals of interest after the data has been acquired; detected features can be filtered and ranked based on the relative abundance or the predicted structure, toxicity, and concentration imputed from the tandem mass spectrum (MS2). Here we provide an overview of these prioritization techniques and how they have been successfully implemented and reported in the literature to find chemicals of elevated risk to human and ecological environments. A complete list of software and tools is available from https://nontargetedanalysis.org/.
Affiliation(s)
- Drew Szabo
- Department of Materials and Environmental Chemistry, Stockholm University, Stockholm 106 91, Sweden
- Travis M. Falconer
- Forensic Chemistry Center, Office of Regulatory Science, Office of Regulatory Affairs, US Food and Drug Administration, Cincinnati, Ohio 45237, United States
- Christine M. Fisher
- Center for Food Safety and Applied Nutrition, US Food and Drug Administration, College Park, Maryland 20740, United States
- Ted Heise
- MED Institute Inc, West Lafayette, Indiana 47906, United States
- Allison L. Phillips
- Center for Public Health and Environmental Assessment, US Environmental Protection Agency, Corvallis, Oregon 97333, United States
- Gyorgy Vas
- VasAnalytical, Flemington, New Jersey 08822, United States
- Intertek Pharmaceutical Services, Whitehouse, New Jersey 08888, United States
- Antony J. Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, US Environmental Protection Agency, Durham, North Carolina 27711, United States
- Anneli Kruve
- Department of Materials and Environmental Chemistry, Stockholm University, Stockholm 106 91, Sweden
- Department of Environmental Science, Stockholm University, Stockholm 106 91, Sweden
13
Lauria MZ, Sepman H, Ledbetter T, Plassmann M, Roos AM, Simon M, Benskin JP, Kruve A. Closing the Organofluorine Mass Balance in Marine Mammals Using Suspect Screening and Machine Learning-Based Quantification. Environ Sci Technol 2024; 58:2458-2467. [PMID: 38270113 PMCID: PMC10851419 DOI: 10.1021/acs.est.3c07220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 11/28/2023] [Accepted: 12/22/2023] [Indexed: 01/26/2024]
Abstract
High-resolution mass spectrometry (HRMS)-based suspect and nontarget screening has identified a growing number of novel per- and polyfluoroalkyl substances (PFASs) in the environment. However, without analytical standards, the fraction of overall PFAS exposure accounted for by these suspects remains ambiguous. Fortunately, recent developments in ionization efficiency (IE) prediction using machine learning offer the possibility to quantify suspects lacking analytical standards. In the present work, a gradient boosted tree-based model for predicting log IE in negative mode was trained and then validated using 33 PFAS standards. The root-mean-square errors were 0.79 (for the entire test set) and 0.29 (for the 7 PFASs in the test set) log IE units. Thereafter, the model was applied to samples of liver from pilot whales (n = 5; East Greenland) and white beaked dolphins (n = 5, West Greenland; n = 3, Sweden) which contained a significant fraction (up to 70%) of unidentified organofluorine and 35 unquantified suspect PFASs (confidence level 2-4). IE-based quantification reduced the fraction of unidentified extractable organofluorine to 0-27%, demonstrating the utility of the method for closing the fluorine mass balance in the absence of analytical standards.
Affiliation(s)
- Mélanie Z. Lauria
- Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 10691 Stockholm, Sweden
- Helen Sepman
- Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 10691 Stockholm, Sweden
- Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 106 91 Stockholm, Sweden
- Thomas Ledbetter
- Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 10691 Stockholm, Sweden
- Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 106 91 Stockholm, Sweden
- Merle Plassmann
- Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 10691 Stockholm, Sweden
- Anna M. Roos
- Department of Environmental Research and Monitoring, Swedish Museum of Natural History, 104 05 Stockholm, Sweden
- Malene Simon
- Greenland Climate Research Centre, Greenland Institute of Natural Resources, 3900 Nuuk, Greenland
- Jonathan P. Benskin
- Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 10691 Stockholm, Sweden
- Anneli Kruve
- Department of Environmental Science, Stockholm University, Svante Arrhenius Väg 8, 10691 Stockholm, Sweden
- Department of Materials and Environmental Chemistry, Stockholm University, Svante Arrhenius Väg 16, 106 91 Stockholm, Sweden
14
Cui S, Gao Y, Huang Y, Shen L, Zhao Q, Pan Y, Zhuang S. Advances and applications of machine learning and deep learning in environmental ecology and health. Environ Pollut 2023; 335:122358. [PMID: 37567408 DOI: 10.1016/j.envpol.2023.122358] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/11/2023] [Revised: 08/02/2023] [Accepted: 08/08/2023] [Indexed: 08/13/2023]
Abstract
Machine learning (ML) and deep learning (DL) offer clear advantages in data analysis (e.g., feature extraction, clustering, classification, regression, image recognition, and prediction) and in risk assessment and management in environmental ecology and health (EEH). Given the rapid growth and increasing complexity of data in EEH, it is important to summarize recent advances and applications of ML and DL in the field. This review summarizes the basic processes and fundamental algorithms of ML and DL modeling and indicates the urgent need for ML and DL in EEH. It highlights recent research hotspots such as environmental ecology and restoration, the environmental fate of new pollutants, chemical exposures and risks, and chemical hazard identification and control. The various applications of ML and DL in EEH demonstrate their versatility and the technological shift they represent, and also present some challenges. Perspectives on ML and DL in EEH are further outlined to promote innovative analysis and the cultivation of an ML-driven research paradigm.
Affiliation(s)
- Shixuan Cui
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China; Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, 310006, China
- Yuchen Gao
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Yizhou Huang
- Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, 310006, China
- Lilai Shen
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Qiming Zhao
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Yaru Pan
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Shulin Zhuang
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China; Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, 310006, China
15
Buckley TJ, Egeghy PP, Isaacs K, Richard AM, Ring C, Sayre RR, Sobus JR, Thomas RS, Ulrich EM, Wambaugh JF, Williams AJ. Cutting-edge computational chemical exposure research at the U.S. Environmental Protection Agency. Environ Int 2023; 178:108097. [PMID: 37478680 PMCID: PMC10588682 DOI: 10.1016/j.envint.2023.108097] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/21/2023] [Revised: 06/05/2023] [Accepted: 07/12/2023] [Indexed: 07/23/2023]
Abstract
Exposure science is evolving from its traditional "after the fact" and "one chemical at a time" approach to forecasting chemical exposures rapidly enough to keep pace with the constantly expanding landscape of chemicals and exposures. In this article, we provide an overview of the approaches, accomplishments, and plans for advancing computational exposure science within the U.S. Environmental Protection Agency's Office of Research and Development (EPA/ORD). First, to characterize the universe of chemicals in commerce and the environment, a carefully curated, web-accessible chemical resource has been created. This DSSTox database unambiguously identifies >1.2 million unique substances reflecting potential environmental and human exposures and includes computationally accessible links to each compound's corresponding data resources. Next, EPA is developing, applying, and evaluating predictive exposure models. These models increasingly rely on data, computational tools like quantitative structure activity relationship (QSAR) models, and machine learning/artificial intelligence to provide timely and efficient prediction of chemical exposure (and associated uncertainty) for thousands of chemicals at a time. Integral to this modeling effort, EPA is developing data resources across the exposure continuum that includes application of high-resolution mass spectrometry (HRMS) non-targeted analysis (NTA) methods providing measurement capability at scale with the number of chemicals in commerce. These research efforts are integrated and well-tailored to support population exposure assessment to prioritize chemicals for exposure as a critical input to risk management. In addition, the exposure forecasts will allow a wide variety of stakeholders to explore sustainable initiatives like green chemistry to achieve economic, social, and environmental prosperity and protection of future generations.
Affiliation(s)
- Timothy J Buckley
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Peter P Egeghy
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Kristin Isaacs
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Ann M Richard
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Caroline Ring
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Risa R Sayre
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Jon R Sobus
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Russell S Thomas
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Elin M Ulrich
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- John F Wambaugh
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Antony J Williams
- U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
16
Lowe CN, Charest N, Ramsland C, Chang DT, Martin TM, Williams AJ. Transparency in Modeling through Careful Application of OECD's QSAR/QSPR Principles via a Curated Water Solubility Data Set. Chem Res Toxicol 2023; 36:465-478. [PMID: 36877669 PMCID: PMC10357388 DOI: 10.1021/acs.chemrestox.2c00379] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 03/07/2023]
Abstract
The need for careful assembly, training, and validation of quantitative structure-activity/property models (QSAR/QSPR) is more significant than ever as data sets become larger and sophisticated machine learning tools become increasingly ubiquitous and accessible to the scientific community. Regulatory agencies such as the United States Environmental Protection Agency must carefully scrutinize each aspect of a resulting QSAR/QSPR model to determine its potential use in environmental exposure and hazard assessment. Herein, we revisit the goals of the Organisation for Economic Cooperation and Development (OECD) in our application and discuss the validation principles for structure-activity models. We apply these principles to a model for predicting water solubility of organic compounds derived using random forest regression, a common machine learning approach in the QSA/PR literature. Using public sources, we carefully assembled and curated a data set consisting of 10,200 unique chemical structures with associated water solubility measurements. This data set was then used as a focal narrative to methodically consider the OECD's QSA/PR principles and how they can be applied to random forests. Despite some expert, mechanistically informed supervision of descriptor selection to enhance model interpretability, we achieved a model of water solubility with comparable performance to previously published models (5-fold cross validated performance 0.81 R2 and 0.98 RMSE). We hope this work will catalyze a necessary conversation around the importance of cautiously modernizing and explicitly leveraging OECD principles while pursuing state-of-the-art machine learning approaches to derive QSA/PR models suitable for regulatory consideration.
Affiliation(s)
- Charles N. Lowe
- Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Nathaniel Charest
- ORAU Student Services Contractor to Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Christian Ramsland
- ORAU Student Services Contractor to Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Daniel T. Chang
- Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Todd M. Martin
- Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Antony J. Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
17
Boyce M, Favela KA, Bonzo JA, Chao A, Lizarraga LE, Moody LR, Owens EO, Patlewicz G, Shah I, Sobus JR, Thomas RS, Williams AJ, Yau A, Wambaugh JF. Identifying xenobiotic metabolites with in silico prediction tools and LCMS suspect screening analysis. Front Toxicol 2023; 5:1051483. [PMID: 36742129 PMCID: PMC9889941 DOI: 10.3389/ftox.2023.1051483] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/23/2022] [Accepted: 01/03/2023] [Indexed: 01/19/2023]
Abstract
Understanding the metabolic fate of a xenobiotic substance can help inform its potential health risks and allow for the identification of signature metabolites associated with exposure. The need to characterize metabolites of poorly studied or novel substances has shifted exposure studies towards non-targeted analysis (NTA), which often aims to profile many compounds within a sample using high-resolution liquid-chromatography mass-spectrometry (LCMS). Here we evaluate the suitability of suspect screening analysis (SSA) liquid-chromatography mass-spectrometry to inform xenobiotic chemical metabolism. Given a lack of knowledge of true metabolites for most chemicals, predictive tools were used to generate potential metabolites as suspect screening lists to guide the identification of selected xenobiotic substances and their associated metabolites. Thirty-three substances were selected to represent a diverse array of pharmaceutical, agrochemical, and industrial chemicals from Environmental Protection Agency's ToxCast chemical library. The compounds were incubated in a metabolically-active in vitro assay using primary hepatocytes and the resulting supernatant and lysate fractions were analyzed with high-resolution LCMS. Metabolites were simulated for each compound structure using software and then combined to serve as the suspect screening list. The exact masses of the predicted metabolites were then used to select LCMS features for fragmentation via tandem mass spectrometry (MS/MS). Of the starting chemicals, 12 were measured in at least one sample in either positive or negative ion mode and a subset of these were used to develop the analysis workflow. We implemented a screening level workflow for background subtraction and the incorporation of time-varying kinetics into the identification of likely metabolites. 
We used haloperidol as a case study to perform an in-depth analysis, which resulted in identifying five known metabolites and five molecular features that represent potential novel metabolites, two of which were assigned discrete structures based on in silico predictions. This workflow was applied to five additional test chemicals, and 15 molecular features were selected as either reported metabolites, predicted metabolites, or potential metabolites without a structural assignment. This study demonstrates that in some, but not all, cases, suspect screening analysis methods provide a means to rapidly identify and characterize metabolites of xenobiotic chemicals.
Affiliation(s)
- Matthew Boyce
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
| | | | - Jessica A. Bonzo
- Thermo Fisher Scientific, South San Francisco, CA, United States
| | - Alex Chao
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
| | - Lucina E. Lizarraga
- Center for Public Health and Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, Cincinnati, OH, United States
| | - Laura R. Moody
- Thermo Fisher Scientific, South San Francisco, CA, United States
| | - Elizabeth O. Owens
- Center for Public Health and Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, Cincinnati, OH, United States
| | - Grace Patlewicz
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
| | - Imran Shah
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
| | - Jon R. Sobus
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
| | - Russell S. Thomas
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
| | - Antony J. Williams
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
| | - Alice Yau
- Southwest Research Institute, San Antonio, TX, United States
| | - John F. Wambaugh
- Center for Computational Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States. *Correspondence: John F. Wambaugh
| |
|
18
|
Black G, Lowe C, Anumol T, Bade J, Favela K, Feng YL, Knolhoff A, Mceachran A, Nuñez J, Fisher C, Peter K, Quinete NS, Sobus J, Sussman E, Watson W, Wickramasekara S, Williams A, Young T. Exploring chemical space in non-targeted analysis: a proposed ChemSpace tool. Anal Bioanal Chem 2023; 415:35-44. [PMID: 36435841 PMCID: PMC10010115 DOI: 10.1007/s00216-022-04434-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 07/05/2022] [Revised: 09/30/2022] [Accepted: 11/09/2022] [Indexed: 11/28/2022]
Abstract
Non-targeted analysis (NTA) using high-resolution mass spectrometry allows scientists to detect and identify a broad range of compounds in diverse matrices for monitoring exposure and toxicological evaluation without a priori chemical knowledge. NTA methods present an opportunity to describe the constituents of a sample across a multidimensional swath of chemical properties, referred to as "chemical space." Understanding and communicating which region of chemical space is extractable and detectable by an NTA workflow, however, remains challenging and non-standardized. For example, many sample processing and data analysis steps influence the types of chemicals that can be detected and identified. Accordingly, it is challenging to assess whether analyte non-detection in an NTA study indicates true absence in a sample (above a detection limit) or is a false negative driven by workflow limitations. Here, we describe the need for accessible approaches that enable chemical space mapping in NTA studies, propose a tool to address this need, and highlight the different ways in which it could be implemented in NTA workflows. We identify a suite of existing predictive and analytical tools that can be used in combination to generate scores that describe the likelihood a compound will be detected and identified by a given NTA workflow based on the predicted chemical space of that workflow. Higher scores correspond to a higher likelihood of compound detection and identification in a given workflow (based on sample extraction, data acquisition, and data analysis parameters). Lower scores indicate a lower probability of detection, even if the compound is truly present in the samples of interest. Understanding the constraints of NTA workflows can be useful for stakeholders when results from NTA studies are used in real-world applications and for NTA researchers working to improve their workflow performance. 
The hypothetical ChemSpaceTool suggested herein could be used in both a prospective and retrospective sense. Prospectively, the tool can be used to further curate screening libraries and set identification thresholds. Retrospectively, false detections can be filtered by the plausibility of the compound identification by the selected NTA method, increasing the confidence of unknown identifications. Lastly, this work highlights the chemometric needs to make such a tool robust and usable across a wide range of NTA disciplines and invites others who are working on various models to participate in the development of the ChemSpaceTool. Ultimately, the development of a chemical space mapping tool strives to enable further standardization of NTA by improving method transparency and communication around false detection rates, thus allowing for more direct method comparisons between studies and improved reproducibility. This, in turn, is expected to promote further widespread applications of NTA beyond research-oriented settings.
Affiliation(s)
- Gabrielle Black
- Department of Civil & Environmental Engineering, University of California Davis, Davis, CA, USA
| | - Charles Lowe
- U.S. EPA, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC, USA
| | - Tarun Anumol
- Agilent Technologies, Inc., Santa Clara, CA, USA
| | - Jessica Bade
- Pacific Northwest National Laboratory, Richland, WA, USA
| | | | - Yong-Lai Feng
- Exposure and Biomonitoring Division, Environmental Health Science and Research Bureau, Health Canada, Ottawa, ON, Canada
| | - Ann Knolhoff
- U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, College Park, MD, USA
| | | | - Jamie Nuñez
- Pacific Northwest National Laboratory, Richland, WA, USA
| | - Christine Fisher
- U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, College Park, MD, USA
| | - Kathy Peter
- Center for Urban Waters, University of Washington Tacoma, Tacoma, WA, 98421, USA
| | - Natalia Soares Quinete
- Department of Chemistry and Biochemistry, Institute of Environment, Florida International University, North Miami, FL, USA
| | - Jon Sobus
- U.S. EPA, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC, USA
| | | | | | - Samanthi Wickramasekara
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, MD, USA
| | - Antony Williams
- U.S. EPA, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, NC, USA
| | - Tom Young
- Department of Civil & Environmental Engineering, University of California Davis, Davis, CA, USA
| |
|
19
|
Wambaugh JF, Rager JE. Exposure forecasting - ExpoCast - for data-poor chemicals in commerce and the environment. JOURNAL OF EXPOSURE SCIENCE & ENVIRONMENTAL EPIDEMIOLOGY 2022; 32:783-793. [PMID: 36347934 PMCID: PMC9742338 DOI: 10.1038/s41370-022-00492-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/28/2022] [Revised: 10/21/2022] [Accepted: 10/21/2022] [Indexed: 05/10/2023]
Abstract
Estimates of exposure are critical to prioritize and assess chemicals based on risk posed to public health and the environment. The U.S. Environmental Protection Agency (EPA) is responsible for regulating thousands of chemicals in commerce and the environment for which exposure data are limited. Since 2009 the EPA's ExpoCast ("Exposure Forecasting") project has sought to develop the data, tools, and evaluation approaches required to generate rapid and scientifically defensible exposure predictions for the full universe of existing and proposed commercial chemicals. This review article aims to summarize issues in exposure science that have been addressed through initiatives affiliated with ExpoCast. ExpoCast research has generally focused on chemical exposure as a statistical systems problem intended to inform thousands of chemicals. The project exists as a companion to EPA's ToxCast ("Toxicity Forecasting") project which has used in vitro high-throughput screening technologies to characterize potential hazard posed by thousands of chemicals for which there are limited toxicity data. Rapid prediction of chemical exposures and in vitro-in vivo extrapolation (IVIVE) of ToxCast data allow for prioritization based upon risk of adverse outcomes due to environmental chemical exposure. ExpoCast has developed (1) integrated modeling approaches to reliably predict exposure and IVIVE dose, (2) highly efficient screening tools for chemical prioritization, (3) efficient and affordable tools for generating new exposure and dose data, and (4) easily accessible exposure databases. The development of new exposure models and databases along with the application of technologies like non-targeted analysis and machine learning have transformed exposure science for data-poor chemicals. 
By developing high-throughput tools for chemical exposure analytics and translating those tools into public health decisions ExpoCast research has served as a crucible for identifying and addressing exposure science knowledge gaps.
Affiliation(s)
- John F Wambaugh
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. EPA, Research Triangle Park, NC, USA
- Department of Environmental Sciences & Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Julia E Rager
- Department of Environmental Sciences & Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| |
|
20
|
Aalizadeh R, Nikolopoulou V, Alygizakis NA, Thomaidis NS. First Novel Workflow for Semiquantification of Emerging Contaminants in Environmental Samples Analyzed by Gas Chromatography-Atmospheric Pressure Chemical Ionization-Quadrupole Time of Flight-Mass Spectrometry. Anal Chem 2022; 94:9766-9774. [PMID: 35760399 PMCID: PMC9280717 DOI: 10.1021/acs.analchem.2c01432] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/05/2022]
Abstract
The ionization efficiency of emerging contaminants was modeled for the first time in gas chromatography-high-resolution mass spectrometry (GC-HRMS) which is coupled to an atmospheric pressure chemical ionization source (APCI). The recent chemical space has been expanded in environmental samples such as soil, indoor dust, and sediments thanks to recent use of high-resolution mass spectrometric techniques; however, many of these chemicals have remained unquantified. Chemical exposure in dust can pose potential risk to human health, and semiquantitative analysis is potentially of need to semiquantify these newly identified substances and assist with their risk assessment and environmental fate. In this study, a rigorously tested semiquantification workflow was proposed based on GC-APCI-HRMS ionization efficiency measurements of 78 emerging contaminants. The mechanism of ionization of compounds in the APCI source was discussed via a simple connectivity index and topological structure. The quantitative structure–property relationship (QSPR)-based model was also built to predict the APCI ionization efficiencies of unknowns and later use it for their quantification analyses. The proposed semiquantification method could be transferred into the household indoor dust sample matrix, and it could include the effect of recovery and matrix in the predictions of actual concentrations of analytes. A suspect compound, which falls inside the application domain of the tool, can be semiquantified by an online web application, free of access at http://trams.chem.uoa.gr/semiquantification/.
Affiliation(s)
- Reza Aalizadeh
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
| | - Varvara Nikolopoulou
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
| | - Nikiforos A Alygizakis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
- Environmental Institute, Okružná 784/42, 97241 Koš, Slovak Republic
| | - Nikolaos S Thomaidis
- Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15771 Athens, Greece
| |
|
21
|
A Multi-Label Classifier for Predicting the Most Appropriate Instrumental Method for the Analysis of Contaminants of Emerging Concern. Metabolites 2022; 12:metabo12030199. [PMID: 35323641 PMCID: PMC8949148 DOI: 10.3390/metabo12030199] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/24/2022] [Revised: 02/19/2022] [Accepted: 02/21/2022] [Indexed: 02/04/2023] Open
Abstract
Liquid chromatography-high resolution mass spectrometry (LC-HRMS) and gas chromatography-high resolution mass spectrometry (GC-HRMS) have revolutionized analytical chemistry among many other disciplines. These advanced instrumentations allow to theoretically capture the whole chemical universe that is contained in samples, giving unimaginable opportunities to the scientific community. Laboratories equipped with these instruments produce a lot of data daily that can be digitally archived. Digital storage of data opens up the opportunity for retrospective suspect screening investigations for the occurrence of chemicals in the stored chromatograms. The first step of this approach involves the prediction of which data is more appropriate to be searched. In this study, we built an optimized multi-label classifier for predicting the most appropriate instrumental method (LC-HRMS or GC-HRMS or both) for the analysis of chemicals in digital specimens. The approach involved the generation of a baseline model based on the knowledge that an expert would use and the generation of an optimized machine learning model. A multi-step feature selection approach, a model selection strategy, and optimization of the classifier’s hyperparameters led to a model with accuracy that outperformed the baseline implementation. The models were used to predict the most appropriate instrumental technique for new substances. The scripts are available at GitHub and the dataset at Zenodo.
|
22
|
McCord JP, Groff LC, Sobus JR. Quantitative non-targeted analysis: Bridging the gap between contaminant discovery and risk characterization. ENVIRONMENT INTERNATIONAL 2022; 158:107011. [PMID: 35386928 PMCID: PMC8979303 DOI: 10.1016/j.envint.2021.107011] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Indexed: 05/11/2023]
Abstract
Chemical risk assessments follow a long-standing paradigm that integrates hazard, dose-response, and exposure information to facilitate quantitative risk characterization. Targeted analytical measurement data directly support risk assessment activities, as well as downstream risk management and compliance monitoring efforts. Yet, targeted methods have struggled to keep pace with the demands for data regarding the vast, and growing, number of known chemicals. Many contemporary monitoring studies therefore utilize non-targeted analysis (NTA) methods to screen for known chemicals with limited risk information. Qualitative NTA data has enabled identification of previously unknown compounds and characterization of data-poor compounds in support of hazard identification and exposure assessment efforts. In spite of this, NTA data have seen limited use in risk-based decision making due to uncertainties surrounding their quantitative interpretation. Significant efforts have been made in recent years to bridge this quantitative gap. Based on these advancements, quantitative NTA data, when coupled with other high-throughput data streams and predictive models, are poised to directly support 21st-century risk-based decisions. This article highlights components of the chemical risk assessment process that are influenced by NTA data, surveys the existing literature for approaches to derive quantitative estimates of chemicals from NTA measurements, and presents a conceptual framework for incorporating NTA data into contemporary risk assessment frameworks.
Affiliation(s)
- James P. McCord
- Center for Environmental Measurement and Modeling, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
- Corresponding author. (J.P. McCord)
| | - Louis C. Groff
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
- Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
| | - Jon R. Sobus
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
| |
|