1
Canchola A, Tran LN, Woo W, Tian L, Lin YH, Chou WC. Advancing non-target analysis of emerging environmental contaminants with machine learning: Current status and future implications. Environment International 2025; 198:109404. [PMID: 40139034 DOI: 10.1016/j.envint.2025.109404] [Received: 09/20/2024] [Revised: 03/03/2025] [Accepted: 03/18/2025] [Indexed: 03/29/2025]
Abstract
Emerging environmental contaminants (EECs) such as pharmaceuticals, pesticides, and industrial chemicals pose significant challenges for detection and identification due to their structural diversity and lack of analytical standards. Traditional targeted screening methods often fail to detect these compounds, making non-target analysis (NTA) using high-resolution mass spectrometry (HRMS) essential for identifying unknown or suspected contaminants. However, interpreting the vast datasets generated by HRMS is complex and requires advanced data processing techniques. Recent advancements in machine learning (ML) models offer great potential for enhancing NTA applications. This review covers key developments, including workflow optimization with computational tools, improved chemical structure identification, advanced quantification methods, and enhanced toxicity prediction. It also discusses challenges and future perspectives in the field, such as refining ML tools for complex mixtures, improving inter-laboratory validation, and further integrating computational models into environmental risk assessment frameworks. By addressing these challenges, ML-assisted NTA can significantly enhance the detection, quantification, and evaluation of EECs, ultimately contributing to more effective environmental monitoring and public health protection.
Affiliation(s)
- Alexa Canchola
  - Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521, United States; Department of Environmental Sciences, College of Natural & Agricultural Sciences, University of California, Riverside, CA 92521, United States
- Lillian N Tran
  - Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521, United States
- Wonsik Woo
  - Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521, United States
- Linhui Tian
  - Department of Environmental Sciences, College of Natural & Agricultural Sciences, University of California, Riverside, CA 92521, United States
- Ying-Hsuan Lin
  - Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521, United States; Department of Environmental Sciences, College of Natural & Agricultural Sciences, University of California, Riverside, CA 92521, United States
- Wei-Chun Chou
  - Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521, United States; Department of Environmental Sciences, College of Natural & Agricultural Sciences, University of California, Riverside, CA 92521, United States
2
Abrahamsson D, Koronaiou LA, Johnson T, Yang J, Ji X, Lambropoulou DA. Modeling the relative response factor of small molecules in positive electrospray ionization. RSC Adv 2024; 14:37470-37482. [PMID: 39582938 PMCID: PMC11583891 DOI: 10.1039/d4ra06695b] [Received: 09/17/2024] [Accepted: 11/15/2024] [Indexed: 11/26/2024]
Abstract
Technological advancements in liquid chromatography (LC) electrospray ionization (ESI) high-resolution mass spectrometry (HRMS) have made it an increasingly popular analytical technique in non-targeted analysis (NTA) of environmental and biological samples. One critical limitation of current methods in NTA is the lack of available analytical standards for many of the compounds detected in biological and environmental samples. Computational approaches can provide estimates of concentrations by modeling the relative response factor (RRF) of a compound, expressed as the peak area of a given peak divided by its concentration. In this paper, we explore the application of molecular dynamics (MD) in the development of a computational workflow for predicting RRF. We measured 48 compounds with LC-quadrupole time-of-flight (QTOF) MS and calculated their RRF. We used the CGenFF force field to generate the topologies and GROMACS to conduct the MD simulations. We calculated the Lennard-Jones and Coulomb interactions between the analytes and all other molecules in the ESI droplet, which were then sampled to construct a multilinear regression model for predicting RRF using Monte Carlo simulations. The best performing model showed a coefficient of determination (R²) of 0.82 and a mean absolute error (MAE) of 0.13 log units. This performance is comparable to other predictive models including machine learning models. While there is a need for further evaluation of diverse chemical structures, our approach showed promise in predictions of RRF.
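The quantities named in this abstract can be illustrated with a minimal sketch: RRF as peak area divided by concentration, and R² and MAE computed on log10 RRF. This is not the authors' workflow (no MD simulation or regression fitting is shown); the function names, peak areas, concentrations, and "predicted" values below are all hypothetical and chosen only for illustration.

```python
import math


def relative_response_factor(peak_area, concentration):
    """RRF as defined in the abstract: peak area divided by concentration."""
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    return peak_area / concentration


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Mean of the absolute differences between observed and predicted."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical (peak area, concentration) pairs, not data from the paper.
measurements = [(1.0e6, 10.0), (5.0e5, 50.0), (2.0e6, 2.0), (8.0e4, 80.0)]

# Work in log10 units, since the abstract reports MAE in log units.
observed_log_rrf = [
    math.log10(relative_response_factor(area, conc))
    for area, conc in measurements
]

# Hypothetical model output standing in for any RRF predictor.
predicted_log_rrf = [4.9, 4.2, 5.8, 3.1]

r2 = r_squared(observed_log_rrf, predicted_log_rrf)
mae = mean_absolute_error(observed_log_rrf, predicted_log_rrf)
print(f"R^2 = {r2:.2f}, MAE = {mae:.2f} log units")
```

Evaluating on the log scale means an MAE of 0.13 log units corresponds to predictions that are typically within a factor of about 10^0.13 (roughly 1.35) of the measured RRF.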
Affiliation(s)
- Dimitri Abrahamsson
  - Department of Pediatrics, New York University Grossman School of Medicine New York 10016 USA
  - Department of Obstetrics, Gynecology and Reproductive Sciences, School of Medicine, University of California San Francisco California 94158 USA
- Lelouda-Athanasia Koronaiou
  - Laboratory of Environmental Pollution Control, Department of Chemistry, Aristotle University of Thessaloniki University Campus 54124 Thessaloniki Greece
  - Center for Interdisciplinary Research and Innovation (CIRI-AUTH), Balkan Center Thessaloniki 57001 Greece
- Trevor Johnson
  - Department of Pediatrics, New York University Grossman School of Medicine New York 10016 USA
- Junjie Yang
  - Department of Obstetrics, Gynecology and Reproductive Sciences, School of Medicine, University of California San Francisco California 94158 USA
- Xiaowen Ji
  - Department of Pediatrics, New York University Grossman School of Medicine New York 10016 USA
- Dimitra A Lambropoulou
  - Laboratory of Environmental Pollution Control, Department of Chemistry, Aristotle University of Thessaloniki University Campus 54124 Thessaloniki Greece
  - Center for Interdisciplinary Research and Innovation (CIRI-AUTH), Balkan Center Thessaloniki 57001 Greece
3
Brueck CL, Xin X, Lupolt SN, Kim BF, Santo RE, Lyu Q, Williams AJ, Nachman KE, Prasse C. (Non)targeted Chemical Analysis and Risk Assessment of Organic Contaminants in Darkibor Kale Grown at Rural and Urban Farms. Environmental Science & Technology 2024; 58:3690-3701. [PMID: 38350027 PMCID: PMC11293618 DOI: 10.1021/acs.est.3c09106] [Indexed: 02/15/2024]
Abstract
This study investigated the presence and human hazards associated with pesticides and other anthropogenic chemicals identified in kale grown in urban and rural environments. Pesticides and related compounds (i.e., surfactants and metabolites) in kale samples were evaluated using a nontargeted data acquisition for targeted analysis method which utilized a pesticide mixture containing >1,000 compounds for suspect screening and quantification. We modeled population-level exposures and assessed noncancer hazards to DEET, piperonyl butoxide, prometon, secbumeton, terbumeton, and spinosyn A using nationally representative estimates of kale consumption across life stages in the US. Our findings indicate even sensitive populations (e.g., pregnant women and children) are not likely to experience hazards from these select compounds were they to consume kale from this study. However, a strictly nontargeted chemical analytical approach identified a total of 1,822 features across all samples, and principal component analysis revealed that the kale chemical composition may have been impacted by agricultural growing practices and environmental factors. Confidence level 2 compounds that were ≥5 times more abundant in the urban samples than in rural samples (p < 0.05) included chemicals categorized as "flavoring and nutrients" and "surfactants" in the EPA's Chemicals and Products Database. Using the US-EPA's Cheminformatics Hazard Module, we identified that many of the nontarget compounds have predicted toxicity scores of "very high" for several end points related to human health. These aspects would have been overlooked using traditional targeted analysis methods, although more information is needed to ascertain whether the compounds identified through nontargeted analysis are of environmental or human health concern. As such, our approach enabled the identification of potentially hazardous compounds that, based on their hazard assessment score, merit follow-up investigations.
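The selection criterion described in this abstract (confidence level 2 compounds that are at least 5 times more abundant in urban than in rural samples, with p < 0.05) can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Feature` record, its field names, and all values are hypothetical, and the p-values are taken as precomputed inputs because the abstract does not name the statistical test used.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical record for one annotated nontarget feature."""
    name: str
    confidence_level: int   # identification confidence level
    urban_mean: float       # mean abundance in urban samples
    rural_mean: float       # mean abundance in rural samples
    p_value: float          # precomputed; the test itself is not shown here


def enriched_in_urban(features, min_fold=5.0, alpha=0.05, level=2):
    """Return (name, fold change) for features passing all three criteria."""
    selected = []
    for f in features:
        # Features with no rural signal are skipped in this sketch; a real
        # workflow would need an explicit rule for them.
        if f.confidence_level != level or f.rural_mean <= 0:
            continue
        fold = f.urban_mean / f.rural_mean
        if fold >= min_fold and f.p_value < alpha:
            selected.append((f.name, fold))
    return selected


# Hypothetical features, not data from the study.
features = [
    Feature("feature_A", 2, 600.0, 100.0, 0.01),   # passes all criteria
    Feature("feature_B", 2, 300.0, 100.0, 0.001),  # fold change too small
    Feature("feature_C", 3, 1000.0, 100.0, 0.01),  # wrong confidence level
    Feature("feature_D", 2, 1000.0, 100.0, 0.20),  # not significant
]

hits = enriched_in_urban(features)
print(hits)
```

Keeping the thresholds as parameters makes it easy to see how the shortlist changes when the fold-change cutoff or significance level is relaxed.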
Affiliation(s)
- Christopher L. Brueck
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
- Xiaoyue Xin
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
- Sara N. Lupolt
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
  - Risk Sciences and Public Policy Institute, Johns Hopkins University, MD, USA
  - Center for a Livable Future, Johns Hopkins University, MD, USA
- Brent F. Kim
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
  - Center for a Livable Future, Johns Hopkins University, MD, USA
- Raychel E. Santo
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
  - Center for a Livable Future, Johns Hopkins University, MD, USA
- Q. Lyu
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
- Antony J. Williams
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, NC, USA
- Keeve E. Nachman
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
  - Risk Sciences and Public Policy Institute, Johns Hopkins University, MD, USA
  - Center for a Livable Future, Johns Hopkins University, MD, USA
- Carsten Prasse
  - Department of Environmental Health and Engineering, Johns Hopkins University, MD, USA
  - Risk Sciences and Public Policy Institute, Johns Hopkins University, MD, USA
4
Johnson TA, Abrahamsson DP. Quantification of chemicals in non-targeted analysis without analytical standards - Understanding the mechanism of electrospray ionization and making predictions. Current Opinion in Environmental Science & Health 2024; 37:100529. [PMID: 38312491 PMCID: PMC10836048 DOI: 10.1016/j.coesh.2023.100529] [Indexed: 02/06/2024]
Abstract
The constant creation and release of new chemicals to the environment is forming an ever-widening gap between available analytical standards and known chemicals. Developing non-targeted analysis (NTA) methods that have the ability to detect a broad spectrum of compounds is critical for research and analysis of emerging contaminants. There is a need to develop methods that make it possible to identify compound structures from their MS and MS/MS information and quantify them without analytical standards. Method refinements that utilize machine learning algorithms and chemical descriptors to estimate the instrument response of particular compounds have made progress in recent years. This narrative review seeks to summarize the current state of the field of non-targeted analysis (NTA) toward quantification of unknowns without the use of analytical standards. Despite the limited accumulation of validation studies on real samples, the ongoing enhancement in data processing and refinement of machine learning tools could lead to more comprehensive chemical coverage of NTA and validated quantitative NTA methods, thus boosting confidence in their usage and enhancing the utility of quantitative NTA.
Affiliation(s)
- Trevor A Johnson
  - Division of Environmental Pediatrics, Department of Pediatrics, Grossman School of Medicine, New York University
- Dimitri P Abrahamsson
  - Division of Environmental Pediatrics, Department of Pediatrics, Grossman School of Medicine, New York University